Rather than looking at this page, you would probably be better advised to consult the XPL/I Page if you're interested in XPL, or the HAL/S page if you're interested in HAL/S. This page is a less-readable, less-organized jumble of the same material appearing on those other two pages. This page will not receive further updates. It has been kept, temporarily, just in case I accidentally omitted useful material from those other two pages, but is ultimately likely to be removed.

Introduction to XPL/I
Processing a Program Written in XPL, Using XCOM-I

A Primer for Standard XPL and Intermetrics XPL/I
Debugging XPL Programs

Patches for Insertion of Inline Code
IBM 360 Inline Code

Introduction to XPL/I

XPL used to be a computer language (see the Wikipedia article) in the 1960s and 1970s that was defined by a book called A Compiler Generator, by William M. McKeeman, James J. Horning, and David B. Wortman. Today, the language is deader than ancient Sumerian. The internals of the language relied heavily on the fact that the compiled XPL programs would run on IBM System/360 mainframes. Indeed, the preface to the book explicitly states that they were creating a language specifically for the IBM 360. While you can write an XPL compiler that will run on a different type of computer system, or that would cross-compile programs that will run on a different kind of computer system, doing so in a way that such programs, as-is, would produce the same results as they would have on an IBM 360 is problematic. Perhaps worse, the book remains the sole documentation for the standard XPL language, and is available neither for free, nor in digital form online. For brevity, I'll refer to that book from now on simply as McKeeman.

Keep in mind that the Virtual AGC Project is devoted to onboard guidance software used aboard space vehicles such as those of the Apollo and Space Shuttle projects, thus most of what I have to say on the topic of XPL is from that point of view. But that doesn't mean it isn't applicable more generally as well. For the purpose of thinking about XPL, what you need to know to follow the discussion herein is this:

Flight software for the Space Shuttle was primarily written in a computer language called HAL/S, created by Intermetrics, Inc. That flight software was called PASS, and that's what I'll refer to it as from now on.
To compile PASS source code, you therefore need a compiler for HAL/S, and more-specifically for Intermetrics's specific variation of HAL/S. Intermetrics's HAL/S compiler, called HAL/S-FC, was primarily written in a language that the Intermetrics team called "XPL".
Thus to compile the original compiler for PASS, you first need a compiler for what Intermetrics referred to as XPL.

Well, that's certainly a mouthful of words! One takeaway is that having an XPL compiler is just one item (though an important one!) in a chain of things that are helpful in the process of resurrecting PASS for modern audiences.

Alas, it's even a bit more complicated than what I just said. For one thing, as you may have gathered from my wording above, Intermetrics did not write its HAL/S compiler in standard XPL as defined by McKeeman. Rather, they extended the language with new features, occasionally changing the existing features in an incompatible way. From now on, I'll refer to Intermetrics's variant of XPL as XPL/I to distinguish it from the standard.

Aside: And in case you're wondering, there is no known surviving copy of Intermetrics's own XPL/I compiler. In fact, there is reason to believe that it was intentionally destroyed. (Prove me wrong, please!) If it had survived, however, it would presumably have been written in standard XPL and itself compiled by McKeeman's XPL compiler (which was called XCOM) or some later variant thereof.

Even if you actually had a perfect compiler for standard XPL that you could use on (say) a Windows, Mac OS, or Linux computer, it would most likely fail to compile most XPL/I programs, and most of those it did manage to compile would not run correctly afterwards. But frankly, even having such a hypothetical perfect compiler for standard XPL is somewhat of a trick, due to the dependence on the underlying IBM 360 that's non-existent in Windows, Mac OS, and Linux.

Consider the following brief example of a program:

declare x fixed, y fixed, z fixed;
x(0) = 1;
x(1) = 2;
x(2) = 3;
output = x(0) || ' ' || x(1) || ' ' || x(2);
output = x || ' ' || y || ' ' || z;

What is this little example supposed to do? First, it declares 3 variables of the fixed datatype; fixed means that x, y, and z are 32-bit signed integers. It then proceeds to use x as if it were a three-element array, though x wasn't declared that way, assigning values to each of its elements. The program simply assumes that x(0) corresponds to x, x(1) corresponds to y, and x(2) corresponds to z. Finally, the program "prints out" both the elements of x (as an array) and the values of x (as a non-array), y, and z. Naturally, those two lines of printout are identical. In case you're wondering, we could have used z(-1) as an equivalent to y, or z(-2) as x. For that matter, we could have used this feature to read or write areas of memory in which no variables at all had been declared. Or to a certain extent, if we had declared variables of some other datatype than FIXED, we could have used this same trick to access them as if they were FIXED.
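To make the mechanism concrete, here is a minimal Python sketch of how such subscripting could be modeled on a machine that isn't an IBM 360. It is only an illustration under assumptions of my own: the names MEMORY, ADDRESS, put_fixed, and get_fixed are hypothetical, and I assume that consecutive FIXED declarations occupy consecutive 4-byte big-endian slots in one flat block of memory.

```python
import struct

# Hypothetical flat memory, standing in for the IBM 360 address space.
MEMORY = bytearray(64)

# Assumed layout: consecutive declarations get consecutive 4-byte slots.
ADDRESS = {"x": 0, "y": 4, "z": 8}


def put_fixed(name, index, value):
    """Store a 32-bit big-endian signed integer at name's address + 4*index."""
    struct.pack_into(">i", MEMORY, ADDRESS[name] + 4 * index, value)


def get_fixed(name, index=0):
    """Fetch the 32-bit big-endian signed integer at name's address + 4*index."""
    return struct.unpack_from(">i", MEMORY, ADDRESS[name] + 4 * index)[0]


# x(0) = 1; x(1) = 2; x(2) = 3;
for i, value in enumerate((1, 2, 3)):
    put_fixed("x", i, value)

# output = x(0) || ' ' || x(1) || ' ' || x(2);
by_subscript = [get_fixed("x", i) for i in range(3)]
# output = x || ' ' || y || ' ' || z;
by_name = [get_fixed(name) for name in ("x", "y", "z")]
print(*by_subscript)
print(*by_name)
```

Both print statements show 1 2 3, because no bounds are checked and a subscript is just an offset from the variable's address. For the same reason, get_fixed("z", -1) in this sketch fetches the value of y.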

I call this particular "feature" of XPL subscripting sloppiness. It's one of the many features of XPL that is simply undocumented and which you are left to deduce by examining actual XPL code or by trying to figure out why your apparently-perfectly-formed XPL code produces entirely unexpected results when executed.

Aside: I say that subscripting sloppiness is "undocumented", but you could find passing hints to it in McKeeman if you already knew in advance about this feature by other means. True, McKeeman does say that array bounds are respected, which is false, but it does also say on p. 137: "Assignments to subscripted variables are not checked against the array bounds; thus every memory location is accessible through subscripting." And if you pore through every line of XPL source code in the book, you'll certainly come across examples of subscripting sloppiness. Is that "documentation" of such a major feature? You be the judge.

Aside: Daniel Weaver has written an XPL-to-C translator. The first thing anyone suggests to me when this topic is discussed is "Why don't you just use it?" The subscripting sloppiness inherent in actual usage of XPL is one good reason. As you might expect, since there's no reasonable way to deduce from the documentation that subscripting sloppiness exists, there's reason to doubt that an XPL compiler relying on the accuracy of McKeeman would support it. Here's a fun printout of the very-sensible error messages you get by running the sample code above through Dan's translator:

XPL to C language translator -- version 1.1
2    |x(0) = 1;                                                                       |
           |
*** Error, Identifier is not an array (detected at line 5741 in xcom). ***
3    |x(1) = 2;                                                                       |
           |
*** Error, Identifier is not an array (detected at line 5741 in xcom). ***
*** Last previous error was detected on line 2. ***
4    |x(2) = 3;                                                                       |
           |
*** Error, Identifier is not an array (detected at line 5741 in xcom). ***
*** Last previous error was detected on line 3. ***
5    |output = x(0) || ' ' || x(1) || ' ' || x(2);                                    |
                    |
*** Error, Identifier is not an array (detected at line 5741 in xcom). ***
*** Last previous error was detected on line 4. ***
5    |output = x(0) || ' ' || x(1) || ' ' || x(2);                                    |
                                   |
*** Error, Identifier is not an array (detected at line 5741 in xcom). ***
*** Last previous error was detected on line 5. ***
5    |output = x(0) || ' ' || x(1) || ' ' || x(2);                                    |
                                                 |
*** Error, Identifier is not an array (detected at line 5741 in xcom). ***
*** Last previous error was detected on line 5. ***
6 cards containing 7 statements were compiled.
6 errors (0 severe) were detected.
The last detected error was on line 5.

Aside: Daniel has also pointed out that aside from his own compiler, Dave Bodenstab wrote an XPL compiler for FreeBSD, which Daniel himself has ported to Linux. I have not tried it as of this writing.

Aside: The other suggestion I receive with a fair degree of regularity is, "Why not just run all of this IBM 360 software in a simulator like Hercules, and be done with it? Problem solved!" Well, for one thing, we don't have Intermetrics's XPL/I compiler, so we can't run it on an IBM 360 simulation. Lacking that, we can't build Intermetrics's HAL/S compiler in such a way that it could be run on an IBM 360 simulation. But if we had the HAL/S compiler and it could compile the PASS flight software to IBM 360 object code, why don't we just run that on an IBM 360 simulator? Well ... tell me how to integrate Hercules into a spacecraft simulator like Orbiter? And tell me how to motivate any of you to learn how to use Hercules in addition to the already very steep learning curve you face with the flight software? If you can answer those questions for me, I'll admit you've got a pretty good idea! With that said, there's definitely merit in emulating an IBM 360, if not necessarily emulating a full IBM 360 system. I'll come back to this point later.

From my whining above about the quality (from a user's point of view) of the documentation of the XPL language, you'll probably recognize that it's hard to make a full list of all the difficulties of XPL or all of the differences between XPL and XPL/I. And just as McKeeman is not a full and accurate description of XPL, neither is the surviving Intermetrics documentation a full description of XPL/I. And confusingly for the present discussion, Intermetrics did not use an explicit term (like "XPL/I") to refer to their language. Instead they continued to refer to it simply as "XPL", making it hard to be entirely sure what's supposed to be different between the two languages and what's not.

Taking all of this stuff into consideration, my conclusion is that if you want to compile Intermetrics's HAL/S compiler, you're not going to be able to rely on any existing XPL compiler. You're going to have to rely on a newly-written XPL compiler having support for the true (as opposed to documented) properties of XPL and XPL/I baked into it from the ground up. I call this new XPL+XPL/I compiler XCOM-I. (XCOM was the name of McKeeman's original XPL compiler, and I suspect of Intermetrics's compiler as well, while the "I" is for "Intermetrics".)

Like Daniel Weaver's XPL translator, XCOM-I is actually an XPL/I-to-C translator, though it is entirely new and is not based on Daniel's work nor on any of the original XPL compilers. In most cases, XCOM-I should work for standard XPL programs too.

Aside: In case you're wondering, XCOM-I does translate the sample XPL/I program given above to C without error; the C program it creates also compiles without errors; and if you run the compiled C program you get what you might expect:
PAGE 1


1 2 3
1 2 3

There's one final complication: In legacy XPL or XPL/I source code, it was typical to intersperse the XPL or XPL/I (thankfully, sparingly!) with code written in IBM System/360 Basic Assembly Language (BAL). Several sections below discuss this topic.

Processing a Program Written in XPL, Using XCOM-I

Despite the title of this section, there's actually little difference between how to build and run XPL/I programs versus XPL programs. Once you know how to build and run an XPL program, you'll automatically know how to build and run an XPL/I program ... mostly. There's an extra command-line switch you can use to specify that you really, truly want XPL rather than XPL/I, namely --xpl, but it usually doesn't matter much if you leave it out.

With that said, the XPL/I source code available to us dwarfs the source code we have for standard XPL programs. Moreover, those XPL/I programs are much larger and more complex than the XPL programs. This means that there are differences in how the source code for the two is organized and maintained. In that sense we find differences in how to deal with the two. That's why I concentrate at first on compiling standard XPL programs, and defer some discussion of XPL/I compilation until later.

Installation of XCOM-I

Or more precisely, "installation of XCOM-I and other stuff needed to follow the instructions given below".

Linux or Mac

First, get the Virtual AGC source tree from our repository. For the sake of discussion only, I'm going to assume that this ends up as a folder called virtualagc, but you need to substitute the actual name in the instructions below.
Add the folders virtualagc/XCOM-I and virtualagc/yaShuttle/ported/PASS1.PROCS to your PATH. I'm hoping you know how to do that on your system of choice.
Install extra Python3 modules: pip3 install ebcdic, or possibly pip install ebcdic.
On Linux, install clang-format from your package repository. I believe it's already installed if you have clang on the Mac, which I think all current Macs do.

Having done this, I find that everything described elsewhere on this page works identically on my up-to-date Linux Mint system as on my mid-2007 Mac OS 10.7.5 system, albeit much more slowly on the latter. Admittedly, I fire up the Mac only to check stuff like this — that is to say, very seldom — so experience on the Mac may vary on a day-to-day basis.
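If you'd like a quick sanity check of the steps above before going further, something like the following Python sketch can report what's missing. It is not part of the Virtual AGC source tree; the function name check_prerequisites is my own invention, and it checks only that XCOM-I.py and clang-format can be found on your PATH and that the ebcdic module is importable.

```python
import importlib.util
import shutil


def check_prerequisites(programs=("XCOM-I.py", "clang-format"),
                        modules=("ebcdic",)):
    """Return a dict mapping each prerequisite name to True (found) or False."""
    status = {}
    for program in programs:
        # shutil.which searches the folders listed in PATH.
        status[program] = shutil.which(program) is not None
    for module in modules:
        # find_spec returns None when a top-level module can't be located.
        status[module] = importlib.util.find_spec(module) is not None
    return status


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name:15} {'ok' if found else 'MISSING'}")
```

A "MISSING" entry for XCOM-I.py usually just means the PATH change hasn't taken effect in the current shell yet.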

Windows

The situation for Windows is less palatable. Everyone just wants to use Visual Studio, but that's not going to work without irritating modifications. There are potential easy solutions, such as WSL, which I'm unfortunately not in a position to try, and many candidates for more-native solutions, which I had no luck with. Of course, running a Linux virtual machine on your Windows system is free and easy, so there's that! Short of running a Linux virtual machine, though, a somewhat more light-weight solution I used on Windows 11 Pro is this:

Install Msys2, which gives you a Linux/Mac-like command-line working environment for compiling stuff.
Start up Msys2 from the Windows start menu and do all of the remaining installation steps, as well as the various other instructions I give you elsewhere on this page, within that Msys2 command-line environment.
Install various packages: pacman -S git diffutils gcc make python3 python3-pip vim.
Install extra Python3 module(s): pip3 install ebcdic.
Get the Virtual AGC source tree with this command: git clone --depth=1 https://github.com/virtualagc/virtualagc.git.
To add folders to your PATH, you'll need to edit the file .bashrc (notice the leading dot) in your Msys2 home directory. If you're familiar with vim, which you just now installed, you can do it with the command vim ~/.bashrc; if not, you can use another text editor that you're more familiar with, but outside of the Msys2 environment you'll have to know where to find .bashrc to edit it. With the default options in the Msys2 installation, I found it at c:\msys64\home\username\.bashrc and edited it with notepad. What you do is insert the line beginning with PATH=, shown here in context:

.
.
.
# User dependent .bashrc file

# If not running interactively, don't do anything
[[ "$-" != *i* ]] && return

PATH=$PATH:~/virtualagc/XCOM-I:~/virtualagc/yaShuttle/ported/PASS1.PROCS

# Shell Options
.
.
.

After saving that file, close your Msys2 command line, and then run Msys2 again to put the new PATH into effect.

This installation seems to function quite well, other than a very slight loss of functionality in Windows (vs Linux/Mac):

Optional backtraces are no longer available in the XCOM-I runtime library.
The XCOM-I command-line option --pp for "pretty-printing" the C code output by XCOM-I no longer works.

Aside: Actually, you can make --pp work, but I didn't suggest it above because it's such massive overkill. The --pp option relies on the program clang-format, which you can get by installing the entire clang/llvm compiler infrastructure. Even I wouldn't bother. On the other hand, if you do, you'd also have the clang compiler to choose from as an alternative to gcc. Note that you'll also have to add it to the PATH, as in step #6 above. Instead of the line shown in step #6, insert this line into .bashrc:
PATH=$PATH:~/virtualagc/XCOM-I:~/virtualagc/yaShuttle/ported/PASS1.PROCS:/c/"Program Files"/LLVM/bin

Testing the Installation

Following the instructions on the remainder of this page essentially tests the installation. However, in brief, what I do to verify operation is:

cd virtualagc/XCOM-I
make -f Makefile-regression && make
cd ../yaShuttle/"Source Code"/PASS.REL32V0
make -s
make -s regression

Remember to use the path to where you installed the Virtual AGC source tree, in place of virtualagc.

Messages will be printed that give you a pretty clear idea as to success or failure. What's tested is:

The action of XCOM-I.
The action of some compilers originally written in XPL but now compiled by XCOM-I: in particular, XPL compilers XCOM3 and XCOM4, and HAL/S compiler HAL/S-FC.
The action of those compilers, in compiling additional XPL or HAL/S programs.
The action of the IBM 360 emulator, sim360.

Compiling and Running XPL Programs

There are several standard XPL programs present in the folder XCOM-I/Tests/. It would be a mistake to think any of them are user-friendly. But they do illustrate the principles involved.

Aside: You'd also be very mistaken to imagine that XCOM-I itself is user-friendly. I fear that there will be a lot of work involved — work which I may not even have the energy to perform — before the error messages XCOM-I spits out upon occasion can be mistaken for anything other than mere spit. Until then, try not to make any errors in your XPL programming.

Aside: Throughout this discussion, I use the Linux/Mac convention that the symbol '/' is used to separate the components of a filename and the path to the folder containing it. Windows uses the separator '\' instead, so in some places you may find that you need to replace '/' by '\'.

For the sake of discussion, suppose we wish to compile and run the sample program called Example-6.18.6.xpl. The first step is to use XCOM-I to translate Example-6.18.6.xpl into C source code:

cd XCOM-I
XCOM-I.py Tests/Example-6.18.6.xpl

This will create a sub-folder called Example-6.18.6/ of the current working folder, and put all of the files of the C translation into that folder.

Aside: XCOM-I.py has various allowable command-line options, though none were needed in the invocation shown above. You can see a list of XCOM-I.py's command-line options with the command "XCOM-I.py --help".

I won't bore you with a lengthy description of the C files output by XCOM-I, but it might be mildly instructive to glance briefly at a couple of them. For this example, the principal outputs are the files RANDOM.c and main.c. Below you can see the original XPL file, followed by its C translation in those two files:

XPL Source Code versus Generated C Code

/* This is example XPL program 6.18.6 from McKeeman p. 157.
   The book only provides PROCEDURE RANDOM, which is transcribed as-is.
   The top-level code that exercises RANDOM is new. */
   
RANDOM:
  procedure(range) fixed;
    /*  Returns a random integer in the range 0 to range - 1  */
    
    declare range fixed, rbase fixed initial(1),
      rmult literally '671297325';
      
    rbase = rbase * rmult;
    
    return shr(shr(rbase, 16) * range, 16);
    
  end RANDOM;

declare i;

do i = 1 to 100;
  output = RANDOM(100000);
end;

eof

/*
  File RANDOM.c generated by XCOM-I, 2024-04-16 08:46:47.
*/

#include "runtimeC.h"
#include "procedures.h"

int32_t
RANDOM(void)
{

  // rbase = rbase * rmult; (2)
  {
    int32_t numberRHS = xmultiply(getFIXED(8), 671297325);
    putFIXED(8, numberRHS);
  }
  // return shr(shr(rbase, 16) * range, 16); (3)
  return SHR(xmultiply(SHR(getFIXED(8), 16), getFIXED(4)), 16);
}

/*
  File main.c generated by XCOM-I, 2024-04-16 08:46:47.
  XPL/I source-code file used: Example-6.18.6.xpl.
  To build the program from the command line, using defaults:
          cd Example-6.18.6/
          make
  View the Makefile to see different options for the `make`
  command above.  To run the program:
          Example-6.18.6 [OPTIONS]
  Use `Example-6.18.6 --help` to see the available OPTIONS.
*/

#include "runtimeC.h"
#include "procedures.h"

/*
  Memory Map:
           Address (Hex)        Data Type        Variable
           -------------        ---------        --------
              0 (000000)        FIXED            I
              4 (000004)        FIXED            RANDOMxRANGE
              8 (000008)        FIXED            RANDOMxRBASE
*/

int
main(int argc, char *argv[])
{

  if (parseCommandLine(argc, argv)) exit(0);

  // do i = 1 to 100; (0)
  {
    int32_t from0, to0, by0;
    from0 = 1;
    to0 = 100;
    by0 = 1;
    for (putFIXED(0, from0);
         getFIXED(0) <= to0;
         putFIXED(0, getFIXED(0) + by0)) {
      // output = RANDOM(100000); (1)
      {
        int32_t numberRHS = ( putFIXED(4, 100000), RANDOM() );
        string_t stringRHS;
        strcpy(stringRHS, fixedToCharacter(numberRHS));
        OUTPUT(0, stringRHS);
      }
    }
  } // End of DO for-loop block

  if (LINE_COUNT)
    printf("\n"); // Flush buffer for OUTPUT(0) and OUTPUT(1).
  return 0; // Just in case ...
}

The comparison, I think, is both tantalizing and mysterious. Tantalizing, because you can almost see how it works. And mysterious, because you can't quite see it. But I digress!
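To take a little of the mystery out of it, here is a rough Python sketch of the arithmetic RANDOM performs. The names MASK32, multiply, and xpl_random are mine, not XCOM-I's. I'm also assuming that FIXED multiplication simply keeps the low 32 bits of the product and that shr is a logical shift of a 32-bit word; the actual behavior of xmultiply and SHR in the XCOM-I runtime library may differ in details.

```python
MASK32 = 0xFFFFFFFF
RMULT = 671297325  # rmult literally '671297325'

rbase = 1  # rbase fixed initial(1)


def shr(value, count):
    """Logical right shift of a 32-bit word."""
    return (value & MASK32) >> count


def multiply(a, b):
    """Multiply, keeping only the low 32 bits (assumed overflow behavior)."""
    return (a * b) & MASK32


def xpl_random(range_):
    """Mirror of PROCEDURE RANDOM, with rbase kept as a module-level word."""
    global rbase
    rbase = multiply(rbase, RMULT)
    return shr(multiply(shr(rbase, 16), range_), 16)


# do i = 1 to 100; output = RANDOM(100000); end;
values = [xpl_random(100000) for _ in range(100)]
print(values[:5])
```

Notice that rbase lives outside the function, just as RANDOMxRBASE lives at a fixed address in the memory map of the generated C code rather than on a stack.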

The next step, of course, is to compile the C programs. If you are on a relatively-normal Linux or Mac OS system, then compilation can be accomplished like this, assuming that you're still in the XCOM-I/ working directory:

make -C Example-6.18.6

This produces an executable program within the Example-6.18.6/ folder that's also called Example-6.18.6.

Aside: On my system, which is a pretty-normal Linux system with gcc as the C compiler, the instructions I give you succeed basically without any compiler warnings. With clang as the C compiler, on the other hand, there can be quite a few compiler warnings, and on a Mac OS system there may be an annoyingly-large number of them ... all of which are nuisances without value, as far as I can tell, and all of which concern things you can do nothing about. If that's what happens for you, I'd suggest turning off all compiler warnings by adding "EXTRA=-w" to all of the make commands, as in:

make EXTRA=-w -C Example-6.18.6

Important note: make will be fooled — badly, compiling multiple times with the wrong target names! — if the path to your current working directory contains any spaces. You can help make out in that case by explicitly specifying the target name:
make -C Example-6.18.6 Example-6.18.6

But the world being what it is, you most likely don't have a setup like mine, so some alterations in the instructions above may be needed in your situation. For one thing, your C compiler may not be gcc. Actually, the default assumption isn't that the compiler is gcc, but rather that the shell's environment variable CC holds the name of the compiler. But if that's not true, or if you're not using gcc, then the command for make will differ a little. For example, suppose you use clang instead of gcc. Just change the build commands to read:

cd XCOM-I
make -C Example-6.18.6 CC=clang

Or if you're on Windows — and other than myself who isn't, really? — then you'd want to add the filename extension ".exe" to your executable. That could be done like so:

cd XCOM-I
make -C Example-6.18.6 TARGET=Example-6.18.6.exe

Aside: The Makefile itself lists other possible alterations, any or all of which can be used in combination with the others.

Finally, you can now run the now-fully-compiled program:

Example-6.18.6/Example-6.18.6

What this particular sample program does is just to print 100 random numbers, so that's what should happen. (I say they're "random", but the seed for the random-number generator is hard-coded as 1, so you'll get the same 100 numbers every time you run the program.)
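If you find yourself repeating the translate, build, and run steps often, they're easy to script. The following Python sketch is just one way to do it. The helper names pipeline_commands and run_pipeline are hypothetical, and the sketch assumes that you're in the XCOM-I/ working folder with XCOM-I.py on your PATH, exactly as in the manual steps above.

```python
import subprocess
from pathlib import Path


def pipeline_commands(xpl_file, cc=None):
    """Return the translate, build, and run commands for one XPL source file."""
    name = Path(xpl_file).stem  # e.g. "Example-6.18.6"
    build = ["make", "-C", name]
    if cc:
        build.append(f"CC={cc}")  # e.g. CC=clang
    return [
        ["XCOM-I.py", str(xpl_file)],  # XPL -> C, into the folder <name>/
        build,                         # C -> executable <name>/<name>
        [f"{name}/{name}"],            # run the compiled program
    ]


def run_pipeline(xpl_file, cc=None):
    """Run each step in turn, stopping at the first failure."""
    for command in pipeline_commands(xpl_file, cc):
        subprocess.run(command, check=True)
```

Calling run_pipeline("Tests/Example-6.18.6.xpl") would then perform all three steps in order, and check=True makes it stop with an exception if any step fails.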

Aside: Just like XCOM-I, the compiled application also has a variety of command-line options that may affect how it runs. None of them are really applicable to this particular sample program, but you could see them with the command "Example-6.18.6/Example-6.18.6 --help".

If you want to see a meatier standard XPL example, you can look instead at McKeeman's ANALYZER program. If you happen to have a copy of A Compiler Generator (see Section 7.1, Chapter 10, and Appendix 6), it provides reasonably-detailed operating instructions and theory of operation for ANALYZER. Basically, ANALYZER is used for analyzing a given BNF grammar and printing a report about it. Thus McKeeman recommends ANALYZER for developing BNF grammars and debugging them. There's an associated data file as well, SKELETON.bnf, which contains the BNF description for a very simple language (called "SKELETON", of course) that's the basis for some of the discussion in the book. The steps for compiling and running ANALYZER with SKELETON.bnf as input are just what you'd expect from the discussion earlier:

cd XCOM-I
XCOM-I.py --xpl Tests/ANALYZER.xpl
make -C ANALYZER
ANALYZER/ANALYZER <Tests/SKELETON.bnf

If you're interested, you can compare the report on the SKELETON grammar produced by the steps above, versus the report printed in Figure 7.1.6 of McKeeman in 1969. You will notice slight differences between the two, but not (I think) anything substantive. Remember that the report printed in the book would have been manually typeset, always with the possibility of human error, and with editorial decisions having been made to affect the aesthetics and publishing cost, possibly to the detriment of literal 100% accuracy.

Aside: Reducing discussion of ANALYZER to just whether or not it works the same today as it did back in 1970 is doing it an injustice, because ANALYZER has interesting capabilities in its own right. If you look at the reports I mentioned above, you'll notice that one thing ANALYZER includes in these reports is a large section consisting of XPL DECLARE statements. These DECLARE statements, if plugged into the template XPL code provided elsewhere in McKeeman, are what's needed to create a compiler (written in XPL, of course) for the grammar being analyzed. In other words, the title of the book (A Compiler Generator) isn't a misnomer. This XPL code in the report isn't incredibly useful as-is, because it's formatted in a manner that's not immediately compilable. However, ANALYZER also allows you to "punch" separate punch-cards that do contain immediately-compilable XPL. More on that in a moment.

One of the legacy XPL programs provided for demonstration purposes is actually a compiler for the SKELETON language, though calling it a "compiler" is a bit of a stretch, since all it does is analyze statements in the SKELETON language for correctness, and doesn't produce any object code. The idea behind the SKELETON program was that since it's such a small program (~300 lines of XPL), it would be easy to experiment with. The SKELETON language itself is very bare-bones, in that it only has assignment statements in it, and those assignment statements have the form IDENTIFIER = EXPRESSION;. Identifiers follow the same rules as they do in XPL, except that lower-case letters aren't supported. There are no declarations, no arrays, no strings, and "expressions" are basically whatever normal arithmetical expressions you can form from just identifiers, literal decimal numbers, and the tokens + - * / ( and ). Anyway, you can compile and run it like so:

cd XCOM-I
XCOM-I.py --xpl Tests/SKELETON.xpl
make -C SKELETON
SKELETON/SKELETON

SKELETON being relatively uninteresting, the Tests/ folder also contains a similar but more-complex BNF grammar for XPL itself. Analyzing the XPL grammar with ANALYZER is very slightly more complex than analyzing the SKELETON grammar was. That's because the XPL grammar file used as input for ANALYZER contains a directive which the SKELETON grammar did not. The culprit is $PUNCH, which directs ANALYZER to output the XPL code it generates to a "punch device". Technically, that means a file has to be "attached" to the punch device, or else the data has nowhere to go. But by default ANALYZER doesn't attach any file to the punch device. On an IBM System/360, this would have been handled by means of separately-provided Job Control Language (JCL), and in particular by DD statements in the JCL. Naturally, we have no JCL in our XCOM-I based system, but the equivalent to JCL would be ANALYZER's command-line switches. The command-line switches for simulating DD statements are --ddi and --ddo. Putting that all together, we can analyze the XPL grammar, with the punch device attached to a file called PUNCH.txt, using the following command:

ANALYZER/ANALYZER --ddo=2,PUNCH.txt <Tests/XPL.bnf

Or since ANALYZER accepts the input grammar it analyzes on "device 0" (attached by default to stdin, which is why we usually redirect the input via <), we could even run it instead as:

ANALYZER/ANALYZER --ddi=0,Tests/XPL.bnf --ddo=2,PUNCH.txt
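If you drive ANALYZER from a script, it can be convenient to think of these switches as a table of device numbers and filenames, much as DD statements were. Here is a small Python sketch of that idea. The helper dd_switches is hypothetical, and all it does is format the --ddi and --ddo switches in the N,FILENAME form used above.

```python
def dd_switches(inputs=None, outputs=None):
    """Build --ddi/--ddo switches from {device_number: filename} mappings."""
    switches = []
    for device, filename in sorted((inputs or {}).items()):
        switches.append(f"--ddi={device},{filename}")
    for device, filename in sorted((outputs or {}).items()):
        switches.append(f"--ddo={device},{filename}")
    return switches


# Input device 0 is the grammar; output device 2 is the punch.
command = ["ANALYZER/ANALYZER"] + dd_switches(
    inputs={0: "Tests/XPL.bnf"},
    outputs={2: "PUNCH.txt"},
)
print(" ".join(command))
```

The resulting list can be handed directly to subprocess.run, with no shell redirection needed because device 0 is attached explicitly.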

We'll come to a still meatier example in a moment, but before doing that, the next section will be a necessary digression.

Emulating the IBM 360 CPU

XPL or XPL/I programs compiled by XCOM-I are native to your own local computer system, and if XCOM-I has done its job properly, there should be no lingering dependencies on the IBM 360 in the compiled program.

Nevertheless ... XPL was designed as a language for writing compilers, so a lot of legacy XPL or XPL/I code is actually the source code of a compiler for one or another computer language. Suppose for example that we had an XPL program called X that was a compiler for a programming language called "X". It's true that if we compile the source code for X with XCOM-I, we'll get executable code for X that runs on our native Linux/Mac/Windows computer (rather than, say, an IBM 360). But it's also true that if we use that native version of X to compile source code in the X language, say for a program called Y, then the output of the compiler will still be an object file for the IBM 360. So we still wouldn't be able to run Y on our native computer, and would still need an IBM 360 to run Y.

Which means that even with XCOM-I in hand, it's still very useful to have an IBM 360 emulator in which to run not the programs produced by XCOM-I, but the programs produced by the compilers produced by XCOM-I.

If you're already an IBM 360 expert or a wannabe IBM 360 user, then perhaps a full-scale IBM System/360 emulator such as Hercules might work well for you. As for myself, having tried to figure out without success how to use Hercules, I have to concede that I'm neither a 360 expert nor wannabe, and have no desire to confront that learning curve. I'd much prefer a lightweight IBM 360 emulator, with essentially no learning curve at all. (And a lightweight emulator is far more in line with Virtual AGC's goal of integrating Space Shuttle computing support into spaceflight simulation programs than a full-blown mainframe emulator would be.)

As it happens, there is an available candidate for such a lightweight IBM 360 emulator. This emulator, called sim360, was written by the same Daniel Weaver whom I've also mentioned earlier as the author of an XPL-to-C translator for standard XPL. You can find the source code for sim360 in the Virtual AGC software repository, but the official place to get the most up-to-date version is Dan's own site. That hyperlink is disguised as a Pascal compiler, which is also there, but don't be confused: There's an IBM 360 emulator in there!

Aside: There is a limit, of course, to how "lightweight" an IBM 360 emulator can be while still remaining useful. To run programs originally written in XPL or XPL/I, for example, just emulating the CPU's instruction set is not enough. Such XPL or XPL/I programs depend on there being an operating system in place to allow operations like inputting data (from files or keyboards) or outputting data (to displays, printers, or files) to occur. But an XPL program doesn't actually interact directly with the operating system. Instead, an XPL program expects that there's a separate program called the submonitor running alongside it, and the XPL program makes its low-level requests for input, output, allocation of memory, and so on, to this submonitor program. In fact, there will be a whole section later on that describes XPL's built-in functions, some of which rely on the submonitor, as well as another section covering the MONITOR procedure, all of whose functionality comes from the submonitor. But the point is that to be useful to us, a lightweight emulator like sim360 must provide enough submonitor capability to respond correctly when the XPL program makes these requests of the submonitor even though the IBM 360 operating system is entirely absent. Fortunately, although at the present writing it's a work in progress, sim360 does so. Thanks, Dan!

By the way, if in place of sim360 you were to use a full-scale IBM 360 emulator such as Hercules, an emulation of the XPL or XPL/I submonitor would not be built into it. Rather, you would have to assemble the IBM 360 Basic Assembly Language (BAL) source code for the submonitor, and presumably contrive the JCL needed to run the submonitor program alongside your compiled XPL program on the emulator. Fortunately, although we don't need it if we are using sim360, we do have source code for the submonitor:

For McKeeman's XCOM, it's XMON.bal.

For SUNY's XCOM 4.5, which we've not discussed yet but which we'll get to in the next section, it's the slightly-different XPLSM.bal.

For Intermetrics's unavailable XCOM, it's the collection of BAL source-code files known as MONITOR.ASM. You may wonder why that matters, given that the source code for the Intermetrics version of XCOM has not survived, and thus we couldn't run it anyway? (After all, the whole point of XCOM-I in the first place is to fill in for the missing Intermetrics XCOM, and XCOM-I needs no submonitor!) Well presumably, if we compile HAL/S source code (like PFS or BFS) using HAL/S-FC, we'd expect PFS or BFS to need the same submonitor as the Intermetrics XCOM would have. Probably. Or at least, maybe!

What we don't have in this scenario is a BAL assembler, nor an IBM 360 linker program, so in spite of this wealth of source code for submonitors, in the end we still have no way to build the submonitor programs that I'm aware of.

Assuming you're working from Virtual AGC's archived copy, the first thing you have to do is to build sim360 itself as follows:

cd XCOM-I
make sim360

If you've added XCOM-I/ into your PATH as I recommended earlier, then sim360 will now be in your PATH as well.

If you want to try out sim360, you'll need an IBM 360 load file for some actual program. In the next couple of sections, we'll produce a lot of just such load files. For now, I'll just point out that the IBM 360 load file for a simple Hello World program is available in our Tests folder, and you can run it like so:

sim360 -o0ET stdout -i0AT stdin Tests/HELLO.obj

The command-line switches for sim360 probably look mysterious, but they're not so bad. You can see a full list of command-line options with the command "sim360 --help", but what the mysterious switches in the example above mean is this:

-o0ET stdout: Causes OUTPUT(0) to send data to stdout and automatically translate it from EBCDIC to ASCII.
-i0AT stdin: Causes INPUT(0) to receive data from stdin and automatically translate it from ASCII to EBCDIC.
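To get a feel for what that ASCII/EBCDIC translation involves, here is a small Python sketch using Python's built-in cp037 codec, which is one common EBCDIC code page. Whether sim360's internal translation table agrees with cp037 for every character is an assumption that hasn't been checked here; for upper-case letters, digits, and blanks, the EBCDIC variants generally agree.

```python
def ascii_to_ebcdic(text):
    """Translate a text string to EBCDIC bytes (code page 037 assumed)."""
    return text.encode("cp037")

def ebcdic_to_ascii(data):
    """Translate EBCDIC bytes (code page 037 assumed) back to text."""
    return data.decode("cp037")

encoded = ascii_to_ebcdic("HELLO")
print(encoded.hex())             # c8c5d3d3d6
print(ebcdic_to_ascii(encoded))  # HELLO
```

Note that in EBCDIC the letters are not contiguous and sort before the digits, which is one reason why a program's textual I/O has to be translated rather than just passed through.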

Insane Bootstrapping!

And I do mean insane.

Regarding the "still meatier example" that I promised (threatened?) earlier, here it is with a vengeance! We're going to try bootstrapping some legacy standard-XPL compilers. By a "legacy" compiler, I mean an IBM 360 XPL compiler from the good-old 1960's or 1970's, against which we can hopefully accurately measure the authenticity of XCOM-I's behavior, at least in so far as standard XPL is concerned. We happen to have XPL source code and IBM 360 builds of more than one such legacy compiler:

**Available Legacy IBM 360 XPL Compilers**
Designation	Notes	Where I Got it From	Legacy XPL Source Code for XCOM Proper	Legacy XPL Source Code for Library	Legacy IBM 360 Executable	"Modern" Patch(es) for use with XCOM-I
XCOM3	This is the version of XCOM listed in the book A Compiler Generator. Printouts from it identify it as "XCOM III". In reality, there's no way to know if it's identical to the book without comparing it line by line, which I have not done. The notes in the download file state that the files were recovered from a legacy computer tape from the Stanford Computation Center, dated 1969-01-23. The contemporary notes on the tape itself hint that it is the source code from the book, but stop short of saying explicitly that it's the precise version that appears in the book. The contemporary notes further state that "THE BINARY IMAGE OF THE XCOM COMPILER ... [is] THE RESULT OF USING XCOM TO COMPILE ITSELF FROM THE SOURCE CARDS" However, if you run the binary executable, its printout says that it was built on May 2, 2001, at 13:40:40.19. I'd suggest it's rare to find files created in 2001 on tapes made in 1969. I'll give pretty-convincing reasons later to believe that while this version of the source code was compiled to get this executable, a different version of the compiler itself was used to do so. Regardless, how this 2001 executable made its way onto the 1969 tape, or at least into the download file, I cannot say.	University of Toronto XPL page	XCOM.xpl	XPL.LIBRARY.xpl	XCOM3-20010502.obj	n/a
XCOM45	The accompanying notes indicate that this is from a computer tape from the State University of New York, Stonybrook. The notes from the tape itself date it 1977-05-22. No information is given about how the binary executable of the compiler was created. Its printout identifies itself as "XCOM 4.5", compiled on July 19, 1976, at 21:05:40.16. Fortunately, unlike XCOM3, that's consistent with the date of the tape itself. But for reasons we'll come to later, I believe that while the provided executable is indeed compatible with the version of the XPL source code provided, it was nevertheless not used to compile itself.	Dan Weaver's XPL project	XCOM4.5.xpl	XPLIB4.5.xpl	XCOM45-19760719.obj	patch0.c

To avoid confusion, from now on I'll refer to these specific compilers as XCOM3 or XCOM45, rather than via vague references like "McKeeman XCOM" as I've been doing up to now. Also from now on, if I'm speaking generically about any of these old compilers, I'll simply refer to XCOMx. (For the legacy executables, I'll generically call those XCOMx-YYYYMMDD.) As for the "patches" in the rightmost column, XCOM-I has a framework for providing C-language patch files to replace the functionality that the original XPL source code implemented by inline IBM 360 machine code. Fortunately, XCOM3 has no such inlined code, and XCOM45 merely has a single block of 19 consecutive machine-code instructions. The patch file, of course, was not on the legacy computer tape; I wrote it myself.

"Bootstrapping" is a process employed by folks who feel compelled to write computer-language compilers that can "self-compile"; in the case of XPL, that means that an XCOM executable can compile the XPL source code for XCOM, and get back an executable identical to itself when it does so. How could it have possibly worked the very first time back in the 1960's, when there was XPL source code but no XCOM executable yet to compile it with? Well, obviously the initial versions of XCOM must have been very cut-rate; and they must have been compiled with something other than XCOM, something presumably with very-constrained functionality itself and not written in XPL itself. But over time, as XCOM became more and more capable, XCOM was eventually able to compile itself. That process is called "bootstrapping", as in "hauling yourself up by your own bootstraps". We're now going to try doing that with XCOMx.

Aside: To be clear, XCOM-I is not self-compiling. It's written in Python, not XPL. Rather, it's XCOM3 and XCOM45 that I expect to be self-compiling once they're bootstrapped. In the bootstrapping scenario, XCOM-I is that other compiler used to compile the initial version of XCOMx before an XCOMx executable exists. Except that unlike the normal bootstrapping scenario, XCOM-I, XCOM3, and XCOM45 are full-blown compilers rather than being the initial weak-and-feeble versions thereof.

Also to be clear, the use case for XCOM-I I'm personally interested in is building HAL/S-FC and then forgetting that XPL and XCOM-I even exist.

And we don't need a single one of these XCOMx programs to do that. But still, it's going to go much more smoothly if XCOM-I is believed to work properly in the first place. And if we can use XCOM-I for bootstrapping XCOMx, that will give us a certain degree of confidence that XCOM-I does in fact work as intended. And if in that process XCOM-I can be made useful for somebody who (unlike myself) actually wants to play around with XPL, then why not?

Here's the bootstrapping game-plan as it applies to any or all of the legacy XCOMx compilers:

We'll use XCOM-I to compile the XPL source code for XCOMx. That will give us an executable which I'll call XCOMx-native that runs natively on our Linux, Windows, or Mac computers.
We'll then use XCOMx-native to compile the XPL source code for XCOMx again. That will give us an executable which I'll call XCOMx-360 that's native to the IBM 360. I.e., it only runs on an IBM 360 or a simulation thereof.
Finally, we'll run XCOMx-360 under sim360 to compile the source code for XCOMx one last time. That will give us an executable which I'll call XCOMx-360A. If everything worked perfectly, XCOMx-360A should be byte-for-byte identical to XCOMx-360.
Finally finally, we'll run the legacy XCOMx-YYYYMMDD executable under sim360 to compile source code for XCOMx one really-truly last time. That will give us an executable which I'll call XCOMx-360C. If everything worked perfectly, XCOMx-360C should be byte-for-byte identical to XCOMx-360.

That's a lot of XPL compilers floating around to keep straight. Here's a synopsis of the XCOM variants relevant to the discussion:

**Summary of XPL Compilers in Bootstrapping Process**
Compiler Variant	Compiler's Own Source-Code Files	Compiler Compiled With	Compiler Runs on IBM 360?	Compiler Runs on Linux/Mac/Windows	Source-Code Language and Encoding Accepted By Compiler	Object Files Directly Output By Compiler	Encoding of Textual I/O When Running the Output Object Files
XCOM-I.py	XCOM-I.py et al.	n/a	No	Yes	XPL or XPL/I, ASCII coding	C-language files	ASCII
XCOM3-native	XCOM.xpl, XPL.LIBRARY.xpl	XCOM-I.py	No	Yes	XPL, ASCII coding	IBM 360 load file	EBCDIC
XCOM45-native	XCOM4.5.xpl, XPLIB4.5.xpl	XCOM-I.py	No	Yes	XPL, ASCII coding	IBM 360 load file	EBCDIC
XCOM3-360.obj	XCOM.xpl, XPL.LIBRARY.xpl	XCOM3-native	Yes	Only under sim360 or similar	XPL, EBCDIC coding	IBM 360 load file	EBCDIC
XCOM45-360.obj	XCOM4.5.xpl, XPLIB4.5.xpl	XCOM45-native	Yes	Only under sim360 or similar	XPL, EBCDIC coding	IBM 360 load file	EBCDIC
XCOM3-360A.obj	XCOM.xpl, XPL.LIBRARY.xpl	XCOM3-360.obj	Yes	Only under sim360 or similar	XPL, EBCDIC coding	IBM 360 load file	EBCDIC
XCOM45-360A.obj	XCOM4.5.xpl, XPLIB4.5.xpl	XCOM45-360.obj	Yes	Only under sim360 or similar	XPL, EBCDIC coding	IBM 360 load file	EBCDIC
XCOM3-20010502.obj	XCOM.xpl, XPL.LIBRARY.xpl	n/a	Yes	Only under sim360 or similar	XPL, EBCDIC coding	IBM 360 load file	EBCDIC
XCOM45-19760719.obj	XCOM4.5.xpl, XPLIB4.5.xpl	n/a	Yes	Only under sim360 or similar	XPL, EBCDIC coding	IBM 360 load file	EBCDIC
XCOM3-360C.obj	XCOM.xpl, XPL.LIBRARY.xpl	XCOM3-20010502.obj	Yes	Only under sim360 or similar	XPL, EBCDIC coding	IBM 360 load file	EBCDIC
XCOM45-360C.obj	XCOM4.5.xpl, XPLIB4.5.xpl	XCOM45-19760719.obj	Yes	Only under sim360 or similar	XPL, EBCDIC coding	IBM 360 load file	EBCDIC

There are other possibilities that the table doesn't cover, for compiling stuff with other stuff. Those are left as an exercise for the reader, though I don't think they have any relevance vis à vis bootstrapping.

In the next four subsections, I'm going to make a detailed examination of the steps needed to actually carry out the bootstrapping manipulations and to show that the bootstrapping was successful. Or not! (But it was successful for me.)

Aside: If you're on Linux or Mac OS and you don't have the patience to read through all of the garbage just mentioned, the Makefile provided with XCOM-I can also perform all 4 steps of the bootstrapping roadmap for you automatically. You could then just skip past the next four subsections. Unfortunately, I'm sure that the Makefile won't work in Windows as-is, but that it could be made to work. I'm not going to personally research the Windows situation, but you can look at the Makefile yourself to see what has to be done. If you figure it out, drop me a line and tell me about it. At any rate, the commands to bootstrap XCOM3 and XCOM45 using the makefile are as follows. It'll simply print messages at the end that it succeeded or failed:
cd XCOM-I
make

Step 1: Compiling XCOMx source code with XCOM-I to produce XCOMx-native

For XCOM3:

cd XCOM-I
XCOM-I.py --xpl --output=XCOM3-native Tests/XCOM.xpl
make -C XCOM3-native

For XCOM45:

cd XCOM-I
XCOM-I.py --xpl --lib-file=sim360-source/port/XPLIB4.5.xpl --output=XCOM45-native sim360-source/port/XCOM4.5.xpl
make -C XCOM45-native

This is very much as described earlier, so I won't duplicate that discussion here.

Of course, the proof-of-the-pudding for a new compiler like XCOMx-native is whether or not the programs you compile with it produce the same results as the programs compiled by the contemporary XCOMx did, back in the day. As it happens, McKeeman provides two sample XPL programs, Example-6.18.1 and Example-6.18.2, with which we can test just that. The former demonstrates IBM 360 object-code production, while the latter demonstrates error handling. The complete XCOM3 reports for these programs are given in the book, so it's possible to compare the original reports from XCOM3, including the interspersed IBM System/360 assembly language produced by the compiler, versus the corresponding reports created now by XCOM3-native:

Today's XCOM Reports versus *A Compiler Generator*
XPL Source-Code File	Compiled by XCOM3 for the book A Compiler Generator (1970)	Compiled by XCOM3-native for The Virtual AGC Project (2024)
Tests/Example-6.18.1.xpl	Report	Report
Tests/Example-6.18.2.xpl	Report	Report

As for a similar check of XCOM45-native, I'm not aware of any legacy reports that have survived from the olden days.

Step 2: Compiling XCOMx source code with XCOMx-native to produce XCOMx-360

For XCOM3:

cd XCOM-I
XCOM3-native/XCOM3-native --ddi=0,Tests/XCOM.xpl --ddi=2,XPL.LIBRARY.xpl --raf=B,3600,1,XCOM3-360.obj --raf=B,3600,2,XCOM3-360.dat --raf=B,3600,3,XCOM3-360.str >XCOM3-360.rpt

For XCOM45:

cd XCOM-I
XCOM45-native/XCOM45-native --ddi=0,sim360-source/port/XCOM4.5.xpl --ddi=2,sim360-source/port/XPLIB4.5.xpl --raf=B,7200,1,XCOM45-360.obj --raf=B,7200,2,XCOM45-360.dat --raf=B,7200,3,XCOM45-360.str >XCOM45-360.rpt

As you can see, this involves a bit of command-line complexity that step 1 did not. The command-line options are all there to compensate for the lack of Job Control Language (JCL), and in particular to provide replacements for missing DD statements. For one thing, there's an extra file which needs to be attached as an input, because XCOMx always needs to include the library file containing COMPACTIFY. Also, XCOMx wants to use several "random-access files" for storing object code and what not, and we have to attach those files. (Refer to the FILE entry in the section on XPL's built-in library functions.) Those random-access files are as follows:

FILE(1) — The compiled IBM 360 object code.
FILE(2) — A scratch file (i.e., temporary working space) for data.
FILE(3) — A scratch file for strings.

As for the numbers 3600 and 7200 appearing in the command lines, XCOMx uses a hard-coded constant to determine the record size for these random-access files, which happens to have the default value of DISKBYTES=3600 for XCOM3, but DISKBYTES=7200 for XCOM45.
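The idea of a random-access file with a fixed record size can be sketched in a few lines of Python: record number n simply lives at byte offset n times the record size. The helper names, the zero-based record numbering, and the zero-byte padding below are assumptions for illustration, not a description of how XCOM-I's runtime actually implements FILE(n).

```python
import io

def write_record(f, record_number, data, record_size):
    """Write data as one fixed-size record, padded with zero bytes."""
    if len(data) > record_size:
        raise ValueError("data does not fit in one record")
    f.seek(record_number * record_size)
    f.write(data.ljust(record_size, b"\x00"))

def read_record(f, record_number, record_size):
    """Read back one fixed-size record."""
    f.seek(record_number * record_size)
    return f.read(record_size)

DISKBYTES = 3600             # record size for XCOM3; XCOM45 uses 7200
scratch = io.BytesIO()       # stands in for a file such as XCOM3-360.dat
write_record(scratch, 2, b"SOME DATA", DISKBYTES)
record = read_record(scratch, 2, DISKBYTES)
print(len(record), record[:9])
```

This also shows why the record size on the command line has to agree with the compiler's DISKBYTES: with a different size, every record after the first would be sought at the wrong offset.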

Aside: Incidentally, XCOM3 has plenty of quirks specific to it, quite aside from any quirks XPL more-generally may have as a computer language. XCOM3-native and XCOM3-360 inherit these quirks. I mention this just in case you become excited about using XCOM3-native and start writing new XPL programs for it! Here are a few quirks I've noticed:

It only accepts upper-case for identifiers and keywords ... in spite of the fact that every speck of XPL source code in A Compiler Generator is printed in lower case! But I've provided a trick to work around that in XCOM3-native: In XCOM3-native's --ddi command-line switches that attach the XPL source-code files, you can optionally suffix them with ",U". This tells XCOM3-native to automatically translate the associated files to upper case. It's not a perfect workaround, since it translates comments and quoted strings to upper case too, but that's life! We didn't actually need to do that for these source-code files, though, because they're already completely upper-case.
It requires an EOF token at the end of the source code. Other than correcting some EBCDIC-to-ASCII translation bugs that were present before I got some of the files, addition of EOF at the ends was my sole code change.

It won't allow you to have two division operations (including MOD as a division) in the same statement, instead emitting an error message saying that it requires a "busy register".
... and who knows what else?

The *.rpt files produced by these compilations are worth looking at, if only to verify that somewhere in each of them is a line reading "NO ERRORS WERE DETECTED".

I had hoped at this point to find that XCOMx-360.obj is byte-for-byte identical to the corresponding legacy executable XCOMx-YYYYMMDD.obj. Unfortunately, my hopes have been dashed: They are interestingly-similar but not identical. But as I mentioned earlier, I expect to give convincing evidence that neither XCOMx-YYYYMMDD.obj was actually self-compiled, and that each was instead presumably compiled by a different version of XCOM. If so, it would fully explain the mismatch versus XCOMx-360.obj. But until I show you that evidence, it would certainly be reasonable to maintain the possibility that the mismatch is due to a bug in XCOM-I.

Step 3: Compiling XCOMx source code with XCOMx-360 under sim360 to produce XCOMx-360A

For XCOM3:

sim360 --size=3600 -o0ET stdout -i0AT Tests/XCOM.xpl -i2AT XPL.LIBRARY.xpl -f1wb+ XCOM3-360A.obj -f2wb+ XCOM3-360A.dat -f3wb+ XCOM3-360A.str XCOM3-360.obj

For XCOM45:

sim360 --size=7200 -o0ET stdout -i0AT sim360-source/port/XCOM4.5.xpl -i2AT sim360-source/port/XPLIB4.5.xpl -f1wb+ XCOM45-360A.obj -f2wb+ XCOM45-360A.dat -f3wb+ XCOM45-360A.str XCOM45-360.obj

These command-line switches differ in detail from the ones we had to use with XCOMx-native, but I think you can see that they're just the same old replacements for JCL DD commands, employing a different syntax.

For this denouement of the bootstrapping process, what are the results? XCOM3 and XCOM45 are in the same boat in this regard, so I'll just explain what we see with XCOM45. You can check out XCOM3 yourself if you like.

Firstly, the build of XCOM45-360A.obj succeeds with no errors. If XCOM45 has successfully built itself, I claimed above that XCOM45-360.obj and XCOM45-360A.obj will be byte-for-byte identical. That was actually an oversimplification. Anybody who runs XCOMx will immediately notice that the first thing reported by it is the date and time at which the compiler itself was compiled. To be able to do that, an XCOMx executable must have such a timestamp embedded within it somewhere. Which further means that using identical XCOMx executables to compile identical XPL source code for XCOMx will nevertheless produce output files that embed different timestamps, and therefore will not be byte-for-byte identical. But there are tricks you can use to work around that. Here's one of them: I made hexadecimal dumps of the two object files, and then used a file-comparison utility to compare the hex dumps side-by-side. Here's a screenshot of the only difference found between the two files, with XCOM45-360.obj on the left and XCOM45-360A.obj on the right:

What are the 3 bytes that are different? Well, XCOM45-360.obj was built (according to the report it prints out) at time 19:53:37.27, while XCOM45-360A.obj was built at 17:5:6.24. Now, XCOM gets the compile time using the built-in function TIME_OF_GENERATION, which returns the number of centiseconds since midnight. For XCOM45-360.obj, that computes to 100*(19*3600+60*53+37.27)=7161727=0x6D477F. Similarly, for XCOM45-360A.obj, 100*(17*3600+5*60+6.24)=6150624=0x5DD9E0, which (surprise!) matches the screenshot above. (Recall that these IBM 360 programs all store their integer data in big-endian form, so that the most-significant bytes come first in memory.) In short, XCOM45-360.obj can indeed accurately build itself, and bootstrapping XCOM45 has been 100% successful!
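The timestamp arithmetic is easy to reproduce in Python. The sketch below works in whole centiseconds to avoid floating-point rounding. The function name is hypothetical, and showing the value as a 32-bit big-endian word (so that only its low three bytes vary) is an assumption that merely happens to be consistent with seeing exactly 3 differing bytes.

```python
def centiseconds_since_midnight(hours, minutes, seconds, centiseconds):
    """Convert a time of day to the number of centiseconds since midnight."""
    return ((hours * 60 + minutes) * 60 + seconds) * 100 + centiseconds

t_360 = centiseconds_since_midnight(19, 53, 37, 27)   # XCOM45-360.obj
t_360a = centiseconds_since_midnight(17, 5, 6, 24)    # XCOM45-360A.obj

for label, t in (("XCOM45-360.obj ", t_360), ("XCOM45-360A.obj", t_360a)):
    # Big-endian: the most-significant byte comes first.
    print(label, t, t.to_bytes(4, "big").hex(" ").upper())
```

This prints 7161727 as 00 6D 47 7F and 6150624 as 00 5D D9 E0, in agreement with the hexadecimal values computed by hand above.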

In other words, it works! XCOMx can be bootstrapped and accurately builds itself.

Step 4: Bonus Cross-Comparison of Files Built by XCOMx-360.obj Versus Files Built by Legacy XCOMx Executables

In step 2 above, I noted that XCOMx-360.obj, disappointingly, is not byte-for-byte identical to the corresponding legacy IBM 360 executable load file XCOMx-YYYYMMDD.obj. I had further speculated, but could not prove, that this was because those legacy executables had themselves been built using different versions of XCOMx. For example, perhaps XCOM45-19760719.obj was built by XCOM 4.4, or perhaps the legacy XCOM3-20010502.obj was built by XCOM 3.2. Now comes the evidence!

Suppose that my speculation is true. In that case, if we rebuild XCOMx source code using XCOMx-YYYYMMDD.obj we should find that:

The executable load file we get by doing that also should not match XCOMx-YYYYMMDD.obj byte-for-byte, proving that XCOMx-YYYYMMDD.obj had indeed not been self-built.
But it should match XCOMx-360.obj byte-for-byte, giving us a tremendously warm, fuzzy feeling of independent confirmation that the bootstrapping process was 100% successful.

So let's try that. As usual, here are the instructions for doing the compilations:

For XCOM3:

sim360 --size=3600 -o0ET stdout -i0AT Tests/XCOM.xpl -i2AT XPL.LIBRARY.xpl -f1wb+ XCOM3-360C.obj -f2wb+ XCOM3-360C.dat -f3wb+ XCOM3-360C.str XCOM3-20010502.obj

For XCOM45:

sim360 --size=7200 -o0ET stdout -i0AT sim360-source/port/XCOM4.5.xpl -i2AT sim360-source/port/XPLIB4.5.xpl -f1wb+ XCOM45-360C.obj -f2wb+ XCOM45-360C.dat -f3wb+ XCOM45-360C.str XCOM45-19760719.obj

We could use the same technique as in the preceding section to check that XCOMx-360C.obj is identical to XCOMx-360.obj, taking into account the embedded timestamp. I'll leave that as an exercise for the reader. But there's also an automated way to do the check, and obviously that's somewhat easier.

I mentioned earlier that if you're on Linux (or possibly Mac), then the Makefile will do the entire bootstrapping process for you, including this byte-for-byte check. It actually uses a simplified check (in the Makefile, see how the cmp command is used), in which it just counts the number of mismatched bytes, and concludes that if there were 3 or fewer mismatched bytes, then there's an overall match. It doesn't bother to check that the mismatched bytes are valid timestamps.
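For anyone who can't use the Makefile (on Windows, say), the same simplified check is easy to sketch in Python. This is not a transcription of the Makefile's cmp-based logic, just the same idea: count the mismatched bytes and accept up to 3 of them. The function names are hypothetical, and counting any difference in length as additional mismatches is an assumption.

```python
def count_mismatched_bytes(a, b):
    """Count positions at which two byte strings differ.

    Any difference in length is counted as additional mismatches.
    """
    differing = sum(x != y for x, y in zip(a, b))
    return differing + abs(len(a) - len(b))

def load_files_match(a, b, tolerance=3):
    """Treat two load files as matching if at most tolerance bytes differ."""
    return count_mismatched_bytes(a, b) <= tolerance

# Two stand-in "files" that differ only in a 3-byte embedded timestamp.
first = bytes.fromhex("0000006D477F0000")
second = bytes.fromhex("0000005DD9E00000")
print(count_mismatched_bytes(first, second), load_files_match(first, second))
```

With real files, you would read each one with open(filename, "rb").read() and pass the two results to load_files_match.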

QED.

HAL/S-FC Through the Lens of XCOM-I

Compiling HAL/S-FC Itself

The examples in the preceding section are all very well and good, but the Holy Grail is compilation via XCOM-I of HAL/S-FC itself, and subsequently being able to successfully run HAL/S-FC. Just as there was a big leap in complexity from being able to compile ANALYZER (~1500 lines of XPL) to being able to compile XCOMx (~4200 lines of XPL), there's a much larger leap to being able to compile HAL/S-FC (>30,000 lines of XPL, for some of HAL/S-FC's passes). The steps for doing so are also somewhat more complex, so I've once again taken the pains to simplify them for you with a Makefile; and this time, I won't insist upon droning on in a sleep-inducing fashion about those details.

To compile HAL/S-FC itself, switch over to the working directory where the HAL/S-FC source code actually is stored, and compile it there, as follows:

cd yaShuttle/"Source Code"/PASS.REL32V0
make

HAL/S-FC actually consists of 7 passes, each with its own, standalone executable, and the instructions above (if they succeed), will thus create 7 new programs in the PASS.REL32V0/ folder: PASS1, FLO, OPT, AUX, PASS2, PASS3, and PASS4. You may or may not want to move these into your PATH. The reason you might not want to move them into your PATH is that a number of compilers written in XPL use this kind of naming pattern, and also have program names like PASS1, PASS2, and PASS3.

Using PASS1 to Process Programs Written in HAL/S

Now that you've built HAL/S-FC itself, using HAL/S-FC to compile HAL/S source code is a bit trickier, because I haven't yet built a Makefile to simplify it for you. HAL/S-FC's development, and of course the IBM System/360's development before that, greatly preceded the modern notion that the user experience must be simplified to the point where no knowledge at all is required to use the software. The 7 compiler passes, or at least some substantial number of them, are run in succession. Each pass receives data from the preceding pass, and sends data into the next pass either explicitly via files, or else via COMMON memory. It's thus necessary not merely to run the passes in the correct order, but to use all of the command-line settings properly to make sure that filenames match up from pass to pass.

At first, though, let's not worry about all of that stuff. Let's just think about running PASS1 by itself, and ignore all of the other passes. It's obviously the most important pass, since where would the other passes be without it? Nowheresville, baby, that's where!

Besides, we also have my previous manually-created port (HAL_S_FC.py) of PASS1 from XPL/I to Python, against which to compare PASS1 as created by XCOM-I. Speaking of which, here's how one might process a simple HAL/S program using HAL_S_FC.py:

HAL_S_FC.py SRN --hal=SOURCECODE >REPORT.txt

Pretty simple looking! Unfortunately, the simplicity of HAL_S_FC.py was aided somewhat by the fact that I never intended to port any of the other 6 compiler passes into Python, and hence didn't have to worry about the interconnections between the compiler passes. In contrast, here's a sample of how to run PASS1 as created by XCOM-I to process that same HAL/S program:

./PASS1 --parm="SRN" --pdsi=5,ERRORLIB --pdsi=6,ACCESS --ddi=0,SOURCECODE.hal --raf=B,7200,1,HALMAT.bin --raf=B,1560,2,LITFILE.bin --raf=B,3360,6,VMEM.bin >REPORT.txt

There's obviously a bit of added complexity here, or at least the appearance of added complexity, so let me go through these things one-by-one:

./PASS1 is, of course, the program PASS1 of HAL/S-FC itself. The "./" prefixing is to help some versions of Linux that don't automatically look in your current working folder for program names. At the very least, it's harmless in Linux or Mac OS. Leave it off in Windows!
--parm="SRN": The --parm='...' switch in general is used to pass options to the compiler, replacing the PARM='...' that originally appeared in Job Control Language (JCL) on the IBM 360. The numerous possibilities for inclusion in --parm='...' are discussed later. But specifically, SRN tells the compiler that there are serial numbers in columns 73-80 of the "cards" containing your source code, or more-precisely, that there is no source code in columns 73-80. I typically put all my source code in columns 1-72, just because I think it's a good habit in HAL/S. But you can use the entirety of columns 1-80 if you leave off this command-line option. Of course, if you did not write the HAL source code yourself, you don't really have any choice, and have to stick with whatever's consistent with the actual source code. As for columns 81 and above, you cannot use them for HAL/S source code by any means whatever. Too bad!
The --pdsi=..., --pdso=..., --ddi=..., --ddo=..., and --raf=... switches, in general, all replace the DD commands that originally appeared in JCL on the IBM 360: i.e., they associate specific filenames and (to a certain extent) file types with the device numbers used in XPL source code INPUT(n), OUTPUT(n), and FILE(n) functions.
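The effect of SRN on a single source "card" can be pictured with the following Python sketch. It is only an illustration of the column rules described above, not code from HAL/S-FC or XCOM-I; the function name is hypothetical, and details such as the special treatment HAL/S gives to column 1 are ignored.

```python
def source_text_of_card(card, srn):
    """Return the portion of an 80-column card image treated as source code.

    With srn=True, columns 73-80 hold serial numbers and are ignored, so
    only columns 1-72 count.  Without it, all of columns 1-80 count.
    Columns 81 and above are never used.
    """
    card = card.rstrip("\n")
    limit = 72 if srn else 80
    return card[:limit]

card = "  DECLARE X INTEGER;".ljust(72) + "00000010" + "IGNORED"
print(repr(source_text_of_card(card, srn=True).rstrip()))
print(repr(source_text_of_card(card, srn=False).rstrip()))
```

In the first case the serial number 00000010 is dropped, while in the second it would be treated as part of the source code, which is why the setting has to match how the source file was actually prepared.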

Among themselves, the seven passes of HAL/S-FC use a somewhat-consistent scheme for associating filenames to file numbers, though not all of the passes necessarily use all of the files. I say "somewhat", because files output by one pass are sometimes input to another pass using a different file number. A useful though seemingly incomplete reference to the various files is the table on p. 7-4 of the "HAL/S Compiler System Specification".

Here's a handy table that summarizes the file usage of each pass. When you see something italicized, such as HALMAT in --raf=B,7200,1,HALMAT.bin, I mean to imply that you may or may not want to choose a different name than I have, or to use different names for different compiler passes; however, the names must be consistent between the passes outputting them and the passes using them as input. The cells that have been highlighted in color emphasize those places in which the same file must be used on successive passes; in the case of the virtual-memory file, I don't know if the same file is used on successive passes or not, so I haven't highlighted it, but I've been using the same file without any apparent difficulty.

HAL/S-FC Usage	PASS1	FLO	OPT	AUX	PASS2	PASS3	PASS4	Notes
Input HAL source code	`--ddi=0,SOURCECODE.hal` or `stdin`	n/a	n/a	n/a	n/a	n/a	n/a
Primary output listing	`--ddo=0,REPORT.txt` or `stdout`	`--ddo=0,REPORT.txt` or `stdout`	`--ddo=0,REPORT.txt` or `stdout`	`--ddo=0,REPORT.txt` or `stdout`	`--ddo=0,REPORT.txt` or `stdout`	`--ddo=0,REPORT.txt` or `stdout`	`--ddo=0,REPORT.txt` or `stdout`
Output error messages	`stderr`	`stderr`	`stderr`	`stderr`	`stderr`	`stderr`	`stderr`
Extra listing file, plus SPACELIB error messages	`--ddo=2,LISTING2.txt`	`--ddo=2,LISTING2.txt`	`--ddo=2,LISTING2.txt`	`--ddo=2,LISTING2.txt`	`--ddo=2,LISTING2.txt`	`--ddo=2,LISTING2.txt`	`--ddo=2,LISTING2.txt`	This file is produced by PASS1 if "LISTING2" is among the options in PARM='...', and I don't show it in the sample PASS1 invocation above, because LISTING2 isn't among the chosen PARM options. However, SPACELIB directs certain error messages to it as well, so in some ways it's better to have it even if LISTING2 isn't specified.
AP-101S object module output	n/a	n/a	n/a	n/a	(PASS)`--ddo=3,CARDFILE.bin,E`(BFS)`--pdso=3,CARDFILE,E`			For the PASS compiler, I think that this is in the IBM 360 "card" format. For the BFS compiler, I'm unclear what this is. I think, perhaps, that there's a PDS member for each "object" (COMPOOL, PROCEDURE, PROGRAM) compiled, and if you don't clear the PDS, it will be every object for every compilation you've compiled since the last time it was cleared. Which makes sense. However, I don't understand the data in these objects.
Object module output ("deck") Input from template library	`--pdsi=4,TEMPLIB,E`	n/a	n/a	n/a	`--ddo=4,DECK.bin,E`			(PASS2) For the PASS compiler, an optional copy of `CARDFILE`. For the BFS compiler, the only copy. (PASS1) The template library is similar to the collection of header files in C. Used only if `TEMPLATE` is present in the parameter string.
Simulation Data File (SDF) output Input error-message library	`--pdsi=5,ERRORLIB`	n/a	n/a	n/a	`--pdsi=5,ERRORLIB`			The compiler requires access to the library of HAL/S error messages, which is the folder PASS.REL32V0/ERRORLIB/. Note that "ERRORLIB" is literally its name, and it's all you need in the command-line option if you're running the compiler from the PASS.REL32V0/ directory. But if you're running the compiler from a different directory, you need an absolute or relative path to ERRORLIB.
Input access-rights library Output to template library.	`--pdsi=6,ACCESS --pdso=6,TEMPLIB,E`	n/a	n/a	n/a	n/a	n/a	n/a	The compiler also requires access to the library of access rights, namely the folder PASS.REL32V0/ACCESS/. The folder is empty, yes, because you have full access and there are no restrictions. But it's still required! Note that "ACCESS" is literally its name. Like ERRORLIB (see above), just using ACCESS in the command-line option is fine only if you're running the compiler directly from the PASS.REL32V0/ folder. Used only if `TEMPLATE` is present in the parameter string. Note that the output template library should not necessarily be the same as the input library on device 4.
Pseudo-assembly listing output	n/a	n/a	n/a	n/a	`--ddo=7,EXTRA.txt`
TBD	`--pdso=8,TEMPINC`
Source code comparison	TBD
Output or input HALMAT	`--raf=B,7200,1,HALMAT.bin`	`--raf=B,7200,1,HALMAT.bin`	`--raf=B,7200,1,HALMAT.bin`	`--raf=B,7200,1,AUXMAT.bin`	`--raf=B,7200,1,AUXMAT.bin`	n/a	n/a	This is a "random-access file" storing "HALMAT". For some passes (such as PASS1) this is output; for others (such as FLO) it is input. When output, it doesn't need to previously exist. Recall that HALMAT is the name of the "intermediate language" into which PASS1 converts your HAL/S source code. This file is the principal output of PASS1, and is what is sent along to the optimizing passes, some of which modify the HALMAT, and eventually to PASS2 to be converted to IBM 360 or AP-101S object code. Alas, due to the apparent non-survival of adequate documentation related to HALMAT, this file is presently incomprehensible even when translated from its binary format into a somewhat-human-readable form.
Literal file	`--raf=B,1560,2,LITFILE.bin`	`--raf=B,1560,2,LITFILE.bin`	`--raf=B,1560,2,LITFILE.bin`	n/a	`--raf=B,1560,2,LITFILE.bin`	`--raf=B,1560,2,LITFILE.bin`	n/a	This is a "random-access file" that the compiler uses to store the values of literals. Note: The value of a literal character-string (or a long bit-string) is actually a 32-bit string descriptor, as opposed to the sequence of individual bytes comprising the string data. These string descriptors in the literal file refer to the areas in memory in which those string-data bytes are stored. So the literal file is useful only as long as the string data memory remains intact, and has not been moved by compaction.
Phase 1 Init/Const work area, Phase 2 code generation work area	`--raf=B,1560,3,ICFILE.bin`	n/a	n/a	n/a	`--raf=B,1600,3,OBJFILE.bin`	n/a	n/a	By `ICFILE`, I mean the INITIAL/CONSTANT queue. I think that the compiler can only hold a certain number of initializers in memory at once, and this file is used to swap any overflow of the initializers from memory. By `OBJFILE`, I mean a work-file that accepts data related to object-code generation when it overflows the provided memory buffer for it. It is not object code as such.
Phase 2 HALMAT work file	n/a	n/a	`--raf=B,7200,4,OPTMAT.bin`	`--raf=B,7200,4,OPTMAT.bin`	`--raf=B,7200,4,OPTMAT.bin`	n/a	n/a	Secondary HALMAT file. As you can see, we somehow end up with three separate HALMAT files: `HALMAT.bin`, produced by PASS1 and used by lots of other passes. `OPTMAT.bin`, produced by OPT and apparently not used. `AUXMAT.bin`, produced by AUX and used by PASS2. Is `OPTMAT.bin`, perhaps, the same as `AUXMAT.bin`?
Phase 3 paging area	n/a	n/a	n/a	n/a	n/a	`--raf=B,1680,5,SDF.bin`	n/a
Statement data communication area	`--raf=B,3360,6,VMEM.bin`	`--raf=B,3360,6,VMEM.bin`	n/a	n/a	`--raf=B,3360,6,VMEM.bin`	`--raf=B,3360,6,VMEM.bin`	n/a	This is a "random-access file" that the compiler uses to store blocks of virtual memory which have been temporarily unloaded from memory.
Input `COMMON` file	n/a	`--commoni=COMMON1.out`	`--commoni=COMMONf.out`	`--commoni=COMMONo.out`	`--commoni=COMMONa.out`	`--commoni=COMMON2.out`	`--commoni=COMMON3.out`	On the IBM 360, certain areas of memory, known as `COMMON`, were expected to retain their contents when one pass of the compiler was unloaded from memory, and the next pass loaded into it. Of course, this will not happen on any computer you'll be likely to use, and XCOM-I supports `COMMON` differently than that. Instead, each pass writes out `COMMON` as a file upon termination, and (optionally) inputs `COMMON` from a file upon startup. By default, output is to a file literally called "COMMON.out", while input is disabled. For PASS1, these defaults are reasonable. However, each subsequent pass wants to take `COMMON` from the preceding pass, so the default input is no longer good.
Output `COMMON` file	`--commono=COMMON1.out`	`--commono=COMMONf.out`	`--commono=COMMONo.out`	`--commono=COMMONa.out`	`--commono=COMMON2.out`	`--commono=COMMON3.out`	n/a	See the comment above. It is, of course, possible to omit all of the `--commono` options entirely, and to use `--commoni=COMMON.out` everywhere. This is perfectly workable, and its only drawback is the loss of transparency of what happens between passes.
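
As a rough illustration of the chaining in the last two rows of the table, the Python sketch below generates the `--commoni` and `--commono` options for each pass, using the file names shown above. The helper name `common_options` is hypothetical, and nothing here invokes the compiler; it only builds option strings.

```python
# Pass order and the COMMON output file names used in the table above.
PASSES = ["PASS1", "FLO", "OPT", "AUX", "PASS2", "PASS3", "PASS4"]
COMMON_OUT = {
    "PASS1": "COMMON1.out",
    "FLO": "COMMONf.out",
    "OPT": "COMMONo.out",
    "AUX": "COMMONa.out",
    "PASS2": "COMMON2.out",
    "PASS3": "COMMON3.out",
    # PASS4 is the last pass; the table lists no --commono for it.
}


def common_options(pass_name):
    """Return the --commoni/--commono options for one pass (hypothetical helper)."""
    index = PASSES.index(pass_name)
    options = []
    if index > 0:
        # Each pass after PASS1 reads the COMMON file written by its predecessor.
        options.append("--commoni=" + COMMON_OUT[PASSES[index - 1]])
    if pass_name in COMMON_OUT:
        options.append("--commono=" + COMMON_OUT[pass_name])
    return options


for name in PASSES:
    print(name, " ".join(common_options(name)))
```

Keeping a distinct file per pass, as the table does, makes it possible to inspect what each pass handed to the next.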

Correctness of PASS1

As discussed in the preceding section, our goal is to be able to compile HAL/S programs by means of HAL/S-FC (as compiled itself by XCOM-I), but we can be assisted somewhat in reaching that goal by taking advantage of work done prior to the effort of creating XCOM-I. Specifically, before writing XCOM-I, I had ported PASS1 of HAL/S-FC from XPL to Python. This Python version of PASS1 is called HAL_S_FC.py. In an ideal world, we'd expect PASS1 and HAL_S_FC.py to produce results that were identical in all respects, except for things like (say) timestamps. Given that this isn't an ideal world, as far as we know, and in particular neither I nor my creations are perfect, we should instead expect some discrepancies. In fact, discrepancies are very useful, as they highlight bugs in one or another of the two versions, and correcting those bugs helps me to push both versions closer to validity. Nevertheless, some discrepancies are rooted in the very fact that these two versions are implemented using very different underlying principles and tools, and cannot be removed by any reasonable means.

What kinds of discrepancies can we expect to find between PASS1 and HAL_S_FC.py? I haven't done any systematic testing at this point, so I can only comment on discrepancies I've observed. We can categorize them somewhat by the output files produced by the programs:

File Type	Output File in HAL_S_FC.py	Output File in PASS1	Notes
Compiler Report	Printed to `stdout`	Printed to `stdout`	Differing timestamps and time usage reported, as expected and desired. HAL_S_FC.py: Does not report the "FREE STRING AREA". This is a design choice, since HAL_S_FC.py manages string data as Python strings. HAL_S_FC.py: Does not report the number of FILE 6 "locates", "reads", and "writes", because HAL_S_FC.py neither needs nor implements the HAL/S-FC virtual-memory module. This is a design choice.
HALMAT	FILE1.bin	Set by `--raf=B,7200,1,HALMAT.bin`.	No discrepancies!
Literals	FILE2.bin	Set by `--raf=B,1560,2,LITFILE.bin`.	`FIXED` and `BIT(<33)` literals in the file are identical for PASS1 vs HAL_S_FC.py. However, string descriptors for `CHARACTER` and `BIT(>32)` literals are different, because HAL_S_FC.py manages string data as Python strings, and thus string data is not stored in the identical memory locations used by PASS1.
Virtual Memory	FILE6.bin	Set by `--raf=B,3360,6,VMEM.bin`.	They differ, because HAL_S_FC.py does not implement HAL/S-FC's virtual-memory module at all. The virtual-memory file produced by HAL_S_FC.py is in fact empty.
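
A byte-for-byte comparison is enough to check a claim such as "No discrepancies!" for the HALMAT files. The following Python sketch reports the offset of the first differing byte; the function name `first_difference` is hypothetical, and the file names in the comment are simply those from the table above.

```python
def first_difference(path_a, path_b, chunk_size=65536):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the first differing byte within this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No differing byte found, so one file is a prefix of the other.
                return offset + min(len(a), len(b))
            if not a:
                # Both files ended at the same point with no difference.
                return None
            offset += len(a)


# Example (assumes both files exist in the current directory):
#     first_difference("FILE1.bin", "HALMAT.bin")
```

For the literal file, where differing string descriptors are expected, the returned offset at least shows where the two versions first part ways.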

Processing with FLOWGEN

After PASS1, the next pass of HAL/S-FC that's run is FLO (which is short for FLOWGEN). TBD

A Primer for Standard XPL and Intermetrics XPL/I

It is a truth universally acknowledged that there is no satisfactory introductory information available concerning programming in XPL. Or at least, it would be universally acknowledged if anybody had ever heard of XPL and wanted to use it for anything.

The best you can do, generally speaking, is to purchase a used copy of McKeeman (i.e., A Compiler Generator). If you do, you'll find a book that's densely packed with information, but that information is the source code for an XPL compiler (written in XPL), lots of BNF descriptions of the language, lots of theory on how to write a compiler-generator program, and very little of direct interest to a programmer who wants to come up to speed quickly on how to write or understand a program written in XPL. Not to mention the fact (though I am mentioning it!) that some of the most-critical counter-intuitive information is buried in easy-to-miss, easy-to-misunderstand comments made in passing, rather than as big, bold-face warnings. And as a bonus, the book provides an index of almost no use at all to a newbie XPL programmer. Besides which, most online information about XPL, in my experience, is a simple abridgement or other rehashing of A Compiler Generator, and adds little extra of value in a tutorial sense, since it's almost never written by anybody actually working with XPL. With that said, you may find some useful online information in a couple of places:

Daniel Weaver provides some documentation (in particular, his xpl.pdf) with his XPL-to-C translator program.
The University of Toronto XPL website has a page with a small but useful transcription of some material from A Compiler Generator, albeit with unfortunate HTML formatting that makes some of it almost unreadable.

And as for Intermetrics XPL/I ... well, from a tutorial perspective, it's orders of magnitude worse. Perhaps later, non-surviving documentation did a better job. Enough said!

Taking all of that into account, it might be reasonable to provide a full tutorial here on how to write XPL or XPL/I programs. Perhaps I'll do that sometime. It turns out that that's easier said than done, since as you may have noticed, simplification for beginners is not really my personal strong suit. Which is ironic, considering my strong criticism of A Compiler Generator above! For now, I'll just cover some of the basics and quirks of the language(s). Send in suggestions for improvement, if you like; I'm sure I can use them somehow to make the discussion even worse.

The Basics

Character Set and Modern Character Substitutions

The most basic characteristic of a language is the character set in which the language is expressed. Neither McKeeman nor Intermetrics specifies the character set. I've given it a lot of thought, and my conclusion is that the originally-supported character set was:

Here, by <space> I mean a space character, and by <eof> I mean an end-of-file character.

You'll notice that several characters common today were not originally supported, including:

` ~ ^ \

Contrariwise, there are three characters (¢, ¬, and <eof>) that don't exist at all in the 7-bit ASCII character set that's the common core for the character sets typically used today when writing software source code. Therefore, when working with XCOM-I, we use the fortunately otherwise-unused ASCII characters in place of the unsupported characters:

The ASCII characters ~ and ^ are used interchangeably with ¬, but XCOM-I prefers ~.
Similarly, we use the ASCII character ` in preference to ¢.
Finally, the ASCII \ character is now used in place of <eof> (EBCDIC code 0xFE), which was a character that could be punched on the punch cards, but had no printable representation.
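
The first two substitutions can be sketched in a few lines of Python. The helper name `to_ascii_source` is hypothetical and is not part of XCOM-I. The <eof> substitution is left out, because it concerns an EBCDIC byte (0xFE) rather than a printable character in a text file.

```python
# Substitutions described above. Neither ¬ nor ¢ exists in 7-bit ASCII.
SUBSTITUTIONS = {
    "\u00ac": "~",  # ¬ (logical NOT) becomes ~, the form XCOM-I prefers
    "\u00a2": "`",  # ¢ becomes `
}


def to_ascii_source(text):
    """Replace ¬ and ¢ by their ASCII stand-ins (hypothetical helper)."""
    return text.translate(str.maketrans(SUBSTITUTIONS))


print(to_ascii_source("IF A \u00ac= B THEN"))
```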

Aside: The <eof> character — not to be confused with the EOF token that appears at the end of the source code in an XPL or XPL/I program — is an unusual case. It is not a special character in XPL programs. However, it is used as a bookkeeping device by HAL/S compilers, or at least by HAL/S-FC, in padding that the HAL/S compiler automatically appends to a stream of input HAL/S source code. Thus we have to have a way of representing it in the XCOM-I framework, even though it's not a character anyone would ever have occasion to use in either an XPL or HAL/S source-code file, unless that file was itself part of a HAL/S compiler! To the best of my knowledge, <eof> appears only within a single string variable, INPUT_PAD, that occurs in the HAL/S-FC source-code file STREAM.xpl. As far as I can tell, it is not even a standard EBCDIC character, but rather is something chosen for some reason unknown to me by HAL/S developers. See the comments for INPUT_PAD in STREAM.xpl.

These substitutions allow us to completely translate the original XPL or XPL/I source code back-and-forth between the original EBCDIC and ASCII without any loss of information, and without insisting that you adopt any specific "locale" like ISO-8859-15 or UTF-8 for your computer that's not optimal for your other (non-XPL'ing) activities. With that said, I hope that you could use ¢ and ¬ in your XPL/I source code, if you insist on doing so, but I do not guarantee it.

Aside: All previously-existing XPL or XPL/I source code I've found, or HAL/S source code for that matter, has already substituted ~ or ^ for ¬ anyway. And indeed, some early HAL/S documentation suggests this very substitution. I suspect that's because some IBM printers at the time printed ~ in place of ¬. But whatever the explanation, the substitutions I'm suggesting are not exactly daring in their originality. As far as the ¢ symbol is concerned, it's not actually used in active XPL or XPL/I code, as such, but can be used in program comments to toggle various compiler options on and off, or in principle could appear within quoted strings.

Case Sensitivity

XPL programs are not case-sensitive, except in so far as the contents of quoted strings are concerned. E.g., lower-case or mixed-case symbols are treated as being identical to their upper-case correspondents, but quoted strings are case-sensitive.
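
One way to picture this rule is as a normalization step that upper-cases everything except the contents of quoted strings. The Python sketch below does that under simplifying assumptions: the name `fold_case` is hypothetical, only single-quoted strings are recognized, and comments are not treated specially (so an apostrophe inside a comment would confuse it). It is not how XCOM-I is implemented.

```python
def fold_case(source):
    """Upper-case everything outside single-quoted strings (hypothetical helper)."""
    out = []
    in_string = False
    for ch in source:
        if ch == "'":
            # A doubled quote ('') inside a string toggles the flag twice,
            # so after both characters we are still inside the string.
            in_string = not in_string
            out.append(ch)
        elif in_string:
            out.append(ch)  # string contents are case-sensitive: keep as-is
        else:
            out.append(ch.upper())
    return "".join(out)


print(fold_case("declare c character initial('Hello!');"))
```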

Source-Code Formatting

Input to an XPL or XPL/I program (via the INPUT built-in function) is expected to conform to computer punch-card-like conventions. I.e., input lines are always exactly 80 characters long, and XCOM-I enforces this by truncating or right-padding input lines as necessary. If the input lines are longer than 80 columns physically — say, because they have punch-card sequence numbers in columns 81-88 — the extra columns are stripped off.
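
The truncate-or-pad rule amounts to the following, shown here as a Python sketch. The name `to_card_image` is hypothetical; XCOM-I's actual implementation is not shown on this page.

```python
CARD_WIDTH = 80  # columns on a punch card


def to_card_image(line):
    """Truncate or right-pad one input line to exactly 80 columns."""
    line = line.rstrip("\r\n")  # drop the line terminator, if any
    return line[:CARD_WIDTH].ljust(CARD_WIDTH)


print(len(to_card_image("DECLARE X FIXED;\n")))
```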

As for the source-code proper, other than being confined to columns 1-80, it is entirely free-form. I.e., line breaks are ignored; several statements may exist on the same input card, or conversely, a single statement may be split across multiple cards. Even though XPL CHARACTER strings are limited to 256 characters, there is seemingly no limit to the length of an XPL statement.

There are hints in the error messages of McKeeman's XCOM (and I believe, in the text of A Compiler Generator) that some contemporary computer systems may have treated column 1 specially, perhaps accepting some kind of non-blank control characters there. If so, it was an issue outside XCOM proper and is irrelevant to XCOM-I. However, you do find that legacy XPL source code does often begin in column 2 rather than column 1, and I suspect that this hypothetical column-sensitivity is the reason for it.

Regarding this column-1 ambiguity, however, there is the practical question of what to do when a long quoted string is split across multiple cards. Does column 1 belong to the string or not? In XCOM-I, I take my cue from McKeeman's source code for XCOM in this matter: In spite of the fact that XCOM's source code generally avoids column 1 in all other cases, column 1 does belong to any multi-line quoted strings.

Identifiers

Identifiers cannot begin with a digit, but otherwise consist of any sequence of upper- or lower-case alphanumeric characters, or any of the characters @ _ # $. For example, @_26$8ab# is a legal name for a variable.

Identifiers cannot exceed 256 characters in length.
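
Both rules can be expressed as a short check. This is a Python sketch under the assumption that "alphanumeric" means the ASCII letters and digits only; the name `is_identifier` is hypothetical.

```python
import re

# First character: a letter or one of @ _ # $; later characters may also be digits.
IDENTIFIER = re.compile(r"[A-Za-z@_#$][A-Za-z0-9@_#$]*")
MAX_IDENTIFIER_LENGTH = 256


def is_identifier(name):
    """Return True if name satisfies the identifier rules described above."""
    return (len(name) <= MAX_IDENTIFIER_LENGTH
            and IDENTIFIER.fullmatch(name) is not None)


print(is_identifier("@_26$8ab#"))
print(is_identifier("9LIVES"))
```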

Datatypes, Declarations, and Literals

The Basic XPL Datatypes

There are only three basic datatypes:

FIXED is a 32-bit signed integer. (Stored as 2's-complement, in big-endian byte order, vs the little-endian byte order used in most personal computers today.)
CHARACTER is a variable-width character string, with a string-length limited to 256 or less. Strings are stored as a 32-bit unsigned integer known as a descriptor, paired with a separate area from 1 to 256 bytes containing the individual characters of the string, encoded in EBCDIC. The descriptor has 8 bits specifying the string length (minus 1) and 24-bits providing the starting memory-address of the character data.
BIT(n), where n is from 1 to 2048, is an n-bit object. The amount of storage varies by the precision:

BIT(1) through BIT(8) are stored in memory as single bytes.
BIT(9) through BIT(16) are stored as 2-byte "half-words".
BIT(17) through BIT(32) are stored as 32-bit words.
BIT(33) through BIT(2048) are stored similarly to CHARACTER variables: There's a 32-bit "descriptor", of which 8 bits are the number of bytes needed to store all of the bits, minus 1, and 24 bits are a pointer to elsewhere in memory, where the bytes themselves are stored. Thus, a long BIT(n) like this uses up 4 bytes for the descriptor, plus ⌊(n+7)/8⌋ bytes (5 for n=33 through 256 for n=2048) for the data.
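
The storage rules in this list can be summarized as a small function. This is only a Python restatement of the list above; the name `bit_storage_bytes` is hypothetical.

```python
def bit_storage_bytes(n):
    """Total bytes of storage used by a BIT(n) variable, per the list above."""
    if not 1 <= n <= 2048:
        raise ValueError("BIT(n) requires 1 <= n <= 2048")
    if n <= 8:
        return 1                 # single byte
    if n <= 16:
        return 2                 # half-word
    if n <= 32:
        return 4                 # full word
    return 4 + (n + 7) // 8      # 4-byte descriptor plus the data bytes


for n in (1, 8, 9, 32, 33, 2048):
    print(n, bit_storage_bytes(n))
```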

Aside: The method for storing CHARACTER data described above leaves no room for 0-length "empty" strings. But the XPL and XPL/I languages do allow for empty strings: p. 207 of McKeeman tells us that an empty-string is represented by a string-descriptor with the value 0x00000000, with no extra memory allocation for the non-existent "data" of the string. This isn't ambiguous, by the way. While 0x00000000 technically appears to be a descriptor for a 1-byte string whose data is located at address 0x000000, in fact address 0x000000 would always have been outside of the block of memory dedicated for storage of EBCDIC string data, rendering a descriptor of 0x00000000 unusable under the normal interpretation.
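
Here is a Python sketch of packing and unpacking such a descriptor, including the empty-string case. Be aware of one assumption that the text above does not state: the sketch puts the length-minus-1 field in the most-significant 8 bits and the address in the least-significant 24 bits. The function names are hypothetical.

```python
def make_descriptor(address, length):
    """Build a 32-bit string descriptor from a 24-bit address and a length."""
    if length == 0:
        return 0x00000000  # the empty string has no data area at all
    if not 1 <= length <= 256:
        raise ValueError("string length must be 0 through 256")
    if not 0 <= address < (1 << 24):
        raise ValueError("address must fit in 24 bits")
    # ASSUMPTION: (length - 1) occupies the high-order byte.
    return ((length - 1) << 24) | address


def split_descriptor(descriptor):
    """Return (address, length) for a descriptor; 0 means the empty string."""
    if descriptor == 0:
        return (0, 0)
    return (descriptor & 0xFFFFFF, (descriptor >> 24) + 1)


print(hex(make_descriptor(0x001234, 6)))
```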

Regarding the BIT(n) datatypes, you might suppose that since one has gone to all the trouble to define separate datatypes like BIT(1), BIT(2), ..., BIT(8), that there is some distinction between them. That they behave differently in some way. That there's some reason for making such a distinction. You would be cruelly mistaken, as I was. There is literally no difference between these 8 datatypes. Nor is there any distinction between the 8 datatypes BIT(9) through BIT(16). Nor is there any distinction between the 16 datatypes BIT(17) through BIT(32). The distinctions between the BIT(n) datatypes for n>32 are left as a thought experiment for the reader. Thus we're left with seemingly ridiculous code, like the following from PASS1 of HAL/S-FC:

DECLARE COMPILING BIT(1);
...
IF (COMPILING&"80")^=0 THEN
    ...

Or in other words, first you declare COMPILING to have a single bit, and then later you check it to see what the value of its 8th bit is! This particular bit of hilarity caused me months' worth of trouble.

The storage formats in memory duplicate those that would have been expected on an IBM System/360 computer, within the limits of my ability to infer what those formats were. While the storage formats are not significant in abstract terms, they'll be seen to be quite significant in dealing with certain aspects of HAL/S-FC's source code, such as its so-called "virtual memory" system, and indeed I think it would be impossible to run HAL/S-FC unless these underlying IBM 360 storage formats were used.

You'll notice that there is no floating-point datatype, a fact which will be discussed in some detail later.

The three basic datatypes can also be incorporated into single-dimension arrays. Multi-dimensional arrays are not allowed. There is no maximum array length beyond the available memory. Array indices start at 0. Indices (or "subscripts") are enclosed in parentheses, as in A(3) or B(N).

Aside: It's easy to become confused and to imagine (incorrectly!) that you can treat a CHARACTER variable (as opposed to an array of CHARACTER variables) as an array itself, in order to access its individual characters. You cannot! In fact, the XPL language does not provide any syntactical means to access individual characters of a string. For that, you must rely on built-in functions provided by the runtime library. The most-direct method is to use the BYTE function, which can either retrieve the EBCDIC numerical encoding of an individual character in a CHARACTER variable, or else store a new EBCDIC numerical value at a given position in a CHARACTER variable. Thus if we had a CHARACTER variable C which held the value 'HELLO!', then BYTE(C, 3) would return 211 (the EBCDIC encoding for the letter 'L'), while the assignment statement "BYTE(C, 3) = 198;" would change the contents of C to 'HELFO!', since 198 is the EBCDIC code for the letter 'F'. That sounds cumbersome, since very few of us have memorized the EBCDIC table, but it's really not. You generally don't have to look up the EBCDIC encoding for anything, because you would actually have programmed operations such as this as "BYTE(C, 3) = BYTE('F');". Another, less-generally-useful method would be to use the built-in SUBSTR function to retrieve a specific character position as a new CHARACTER object of length 1.
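
The BYTE example can be mimicked in Python to check the numbers. This sketch assumes Python's `cp037` codec as the EBCDIC code page, which is my choice for illustration and not necessarily the exact table XCOM-I uses. The names `byte_get` and `byte_set` are hypothetical, and because Python strings are immutable, `byte_set` returns a new string instead of modifying its argument.

```python
EBCDIC = "cp037"  # ASSUMPTION: an EBCDIC code page available in Python


def byte_get(s, index=0):
    """Like XPL's BYTE(s, index): the EBCDIC code of one character."""
    return s.encode(EBCDIC)[index]


def byte_set(s, index, value):
    """Like the assignment BYTE(s, index) = value, but returns the new string."""
    data = bytearray(s.encode(EBCDIC))
    data[index] = value
    return data.decode(EBCDIC)


c = "HELLO!"
print(byte_get(c, 3))                 # EBCDIC code of the second 'L'
print(byte_set(c, 3, byte_get("F")))  # same as storing 198 at position 3
```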

The BIT(N) datatype is actually quite problematic. It would be reasonable to assume that since BIT(N) seems to represent support for collections of bits, then XPL should provide some syntactical sugar for reading the values of these bits or modifying them. For BIT(1) through BIT(32), you'd be somewhat correct, in that XPL largely treats these as being interchangeable with integers (i.e., FIXED), and automatically converts them back and forth between FIXED. The runtime library's built-in logical-shift-left and logical-shift-right functions (SHL and SHR) work just as easily with them as with FIXED, as do the logical operators &, |, and ¬ (~), and the relational operators <, >, =, ¬= (~=), >=, and <=. Swell!

But once you advance to BIT(33) through BIT(2048), you are cruelly disappointed. XPL provides no built-in methods of working with this data, beyond the ability to initialize BIT(N) variables with data when they're declared. (More on that later.) As far as actually using this data for anything, there are no built-in means to do anything at all with that data. While you can manipulate the data by cobbling together various runtime-library memory-access functions, McKeeman gives none of the technical information about the storage format that would allow you to do so, in so far as I was able to discover. What do I mean about cobbling together memory-access functions? Well, suppose that you want to access bit 43 of a BIT(86) variable called B. First, you must know where the data for B is located in memory. There's a function that can tell you that. Then you must know which byte in that block contains bit 43. There's no function that tells you that, but if you do know it somehow then there's a function that gives you the value of a byte at that address. Then you must know which bit in the byte corresponds to bit 43. Again, there's no function for that, but if you know it, then you can use library functions like SHL or SHR, probably in conjunction with a logical operator like &, to isolate the value of that bit. If you wanted to change the value of that bit, different but similar awfulness is involved.

But doing any of that requires that you have intimate knowledge of how such BIT data is packed into memory. It's hard to write unambiguously about these matters, but I'll try to do so using the following concepts: An n-bit value, when written out in human-readable form has a leftmost bit and a rightmost bit; meanwhile, a block of m bytes in memory has a byte that's at the lowest address, a byte at the highest address, and within each byte has a most-significant bit and a least-significant bit. With those ideas in mind, here's how XPL packs BIT data into memory:

BIT(1) through BIT(32): The rightmost bit corresponds to the least-significant bit in the byte at the highest address.
BIT(33) through BIT(2048): The leftmost bit corresponds to the most-significant bit in the byte at the lowest address.

Or to put it concisely if ambiguously, short bit strings are right justified, while long bit strings are left justified.
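
To make the two justifications concrete, here is a Python sketch that lays out a bit string in memory as described. The name `pack_bits` is hypothetical. The sketch also assumes that the unused padding bits are zero, which the description above does not say, and for long bit strings it returns only the data bytes, not the descriptor.

```python
def pack_bits(bits):
    """Lay out a string of '0'/'1' characters (leftmost bit first) as bytes,
    lowest address first, following the packing rules described above."""
    n = len(bits)
    if not 1 <= n <= 2048 or set(bits) - {"0", "1"}:
        raise ValueError("expected 1 to 2048 characters, each '0' or '1'")
    if n <= 32:
        size = 1 if n <= 8 else 2 if n <= 16 else 4
        # Short bit strings are right-justified: the rightmost bit lands in
        # the least-significant bit of the byte at the highest address.
        return int(bits, 2).to_bytes(size, "big")
    size = (n + 7) // 8
    # Long bit strings are left-justified: the leftmost bit lands in the
    # most-significant bit of the byte at the lowest address.
    # ASSUMPTION: trailing pad bits are zero.
    padded = bits.ljust(size * 8, "0")
    return int(padded, 2).to_bytes(size, "big")


print(pack_bits("10100").hex())
print(pack_bits("1" + "0" * 32).hex())
```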

Aside: Since I say that the bit-packing is undocumented, what's my justification for claiming that what I said just above is true? The short answer is trial-and-error! The longer answer is that one of the intermediate milestones in trying to get XCOM-I to the point of being able to compile the original source code for HAL/S-FC was first to be able to compile McKeeman's original XCOM and run it with a verifiably correct result. But I couldn't get it to work! After messing with it for days on end, I eventually got the answer in a dream, and then experimented with a couple of different bit-packing schemes before finally getting XCOM to run properly. The packing scheme I describe above is the one that worked.

Aside: If all that wasn't bad enough, there's also a trap waiting for you if you're already used to doing bit manipulations with logical operators and shifts in other computer languages. This trap is in the behavior of conditional tests in XPL's IF and DO WHILE statements. Conditional tests in these statements depend only on the least significant bit; i.e., it is as if any conditional test involves an extra "& 1" operation that you can't see. Thus if you wanted to detect (say) that bit 3 of the BIT(5) variable A was set, a statement like "IF A & 8 THEN ...;" wouldn't help you at all, since the implicit "& 1" in the conditional would cause the test always to fail! You would instead need to use a shift-right operation, such as "IF SHR(A, 3) THEN ...;".
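
The trap is easy to reproduce in Python by modeling the conditional as an explicit test of the least-significant bit. The names `xpl_condition` and `shr` are hypothetical stand-ins for XPL's behavior, not part of any library.

```python
def xpl_condition(value):
    """Model of an XPL IF/DO WHILE test: only the least-significant bit counts."""
    return (value & 1) == 1


def shr(value, count):
    """Logical shift right of a 32-bit value, like XPL's SHR."""
    return (value & 0xFFFFFFFF) >> count


a = 0b01000                      # bit 3 of A is set
print(xpl_condition(a & 8))      # False: the result 8 has a zero low bit
print(xpl_condition(shr(a, 3)))  # True: bit 3 has been moved to bit 0
```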

But enough of these measly BIT-based frustrations!

Variables in general are supposedly strictly typed, and every variable used must have an associated declaration statement, though we've already seen examples of the casual way XPL/I treats XPL's strict typing with subscripting sloppiness.

Here are a few examples of declarations of variables, both scalar and array:

DECLARE F FIXED, C CHARACTER, B BIT(5);
DECLARE FS(10) FIXED, CS(10) CHARACTER, BS(10) BIT(5);

These are pretty self-explanatory in most ways, so I won't dissect them for you in detail. What's perhaps most confusing is that the three arrays declared here (FS, CS, and BS) each have eleven elements in them, because the number 10 in their declarations is not the number of elements, but rather the highest legal index. As mentioned above, indices start at 0, so the total number of elements in each is 11.

Aside: Standard XPL, à la McKeeman, doesn't allow expressions when expressing array sizes, whereas XPL/I does allow them. For example, the following is fine in XPL/I but is a no-no in XPL:
DECLARE BUFFER(3600-1) BIT(8);
This example will be continued in the next section, where it will make a little more sense, but this is actually the kind of thing you might want to do.

There are additional attributes which can be applied to such declarations, of which the most important is probably INITIAL. This attribute allows you to supply an initial value for the variable, such as:

DECLARE F FIXED INITIAL(22), F2 FIXED INITIAL("22"), F3 FIXED INITIAL("(8) 22");
DECLARE C CHARACTER INITIAL('Hello!');
DECLARE B BIT(5) INITIAL("(1) 10100");
DECLARE FS(10) FIXED INITIAL(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11);

It's important to understand that the initializer has its effect only at compile-time, and is not applied at run-time. That means that if you declare variables within PROCEDUREs, they're not reinitialized each time the PROCEDURE is executed.

You may have been confused by the initializers shown above for the variables F2, F3, and B, since they naively appear to be strings instead of numbers; but the naive interpretation is wrong. Which brings up the nature of literals in XPL:

CHARACTER literals — i.e., text strings — are enclosed in single-quote (') characters. If a single-quote itself must appear within the string, you use two single-quotes in succession. For example: 'I am the ''king'' of the world!'.
Integer literals for FIXED or BIT(N) have one of several forms:

A sequence of decimal digits is interpreted as a non-negative number in base-10 in the usual way. Note that a leading minus sign (-) or plus sign (+) is not part of a numeric literal! In XPL, minus signs are only operators, and thus something like -10 is not a literal for the number -10, but is instead the minus operator followed by the literal for the number 10. In most cases this is a distinction without significance, because XCOM-I (or the original XCOM) automatically tries to perform all computations that are possible at compile time. Nevertheless, this distinction does cause some arithmetically-satisfactory expressions to be syntactically illegal in XPL. For example, the expression 5 + -5 isn't legal in XPL.
A sequence of hexadecimal digits enclosed in double-quote (") characters represents a hexadecimal number. Spaces are ignored within literals like this, and hence can be added at will for improved human readability.
If a double-quoted string is preceded (within the quotes) by a parenthesized decimal number, then that number indicates the number of bits represented by each digit. XCOM-I supports only the following cases:

"(1) ..." (with digits 0-1 and spaces) is a binary number.
"(2) ..." (with digits 0-3 and spaces) is a base-4 number.
"(3) ..." (with digits 0-7 and spaces) is an octal number.

These numeric literals are the only syntactical reason that the double-quote character (") appears in XPL source code.
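
A Python sketch of interpreting these double-quoted literals follows. It accepts only the forms listed above (plain hexadecimal, and the "(1)", "(2)", and "(3)" prefixes). The name `parse_quoted_number` is hypothetical, and this is not XCOM-I's actual parser.

```python
import re

# A quoted literal: optional "(k)" prefix, then digits and spaces.
_LITERAL = re.compile(r'"\s*(?:\((\d+)\))?([^"]*)"')


def parse_quoted_number(literal):
    """Return the integer value of a double-quoted XPL numeric literal."""
    match = _LITERAL.fullmatch(literal.strip())
    if match is None:
        raise ValueError("not a double-quoted literal: " + literal)
    if match.group(1) is None:
        bits_per_digit = 4  # no prefix: hexadecimal
    else:
        bits_per_digit = int(match.group(1))
        if bits_per_digit not in (1, 2, 3):
            raise ValueError("unsupported bits-per-digit prefix")
    digits = match.group(2).replace(" ", "")  # spaces are ignored
    if not digits:
        raise ValueError("no digits in literal")
    value = 0
    for d in digits:
        digit = int(d, 16)  # raises ValueError for a non-hex character
        if digit >= (1 << bits_per_digit):
            raise ValueError("digit out of range: " + d)
        value = (value << bits_per_digit) | digit
    return value


print(parse_quoted_number('"22"'))
print(parse_quoted_number('"(1) 10100"'))
```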

Another important attribute is LITERALLY. Even though it appears in DECLARE statements, it isn't strictly related to the declaration of variables, so it's discussed in a later section instead.

XPL/I ARRAY and BASED Data ... and an Exception

XPL/I provides a separate kind of statement which can declare arrays, which syntactically differs only in that the keyword ARRAY is used in place of the keyword DECLARE, and in that it can only be used for arrays of FIXED or BIT, and not for CHARACTER.

Aside: For the original XPL/I compiler, I believe there was a distinction in the way ARRAY variables were stored in memory vs DECLARE variables. At the present time, I don't see this distinction as being operationally significant, so XCOM-I treats the two keywords identically. This is subject to change, if I discover my thinking was in error. As, unfortunately, I often do.

More significantly, XPL/I adds an additional kind of datatype that it calls a BASED variable. These are basically pointers to arrays of FIXED, BIT, or CHARACTER. Note that I said "pointers to arrays" rather than "arrays of pointers". By changing the address stored in the BASED variable's pointer, you can instantly interpret an entirely different chunk of memory as the array. Moreover, besides the basic types just mentioned, the BASED variable can point to an array of "records", where each "record" is a collection of the basic datatypes. I.e., a record can hold any combination of FIXED, BIT, or CHARACTER fields, or arrays thereof. Using the BASED mechanism, XPL/I can thus mimic both pointers and primitive types of structures, neither of which is available in XPL proper. I say that the structures are "primitive", because BASED variables cannot themselves be fields of BASED variables, hence only structures that are a single-level deep are available.

Here are a couple of examples of declarations of BASED variables:

BASED FB FIXED;
BASED RB RECORD:
    F FIXED,
    C CHARACTER,
    A(10) BIT(5),
END;

Although BASED variables are always (or almost always) arrays, you'll note that the declarations of FB and RB don't indicate any dimensional information. That's because no memory for them, other than for the pointer, is allocated at compile-time. Space is instead explicitly allocated at runtime by user code. Thus XCOM-I has knowledge of the size of each array entry, but not of the number of elements in the array.

"Dope Vectors"

When I said that a BASED is a "pointer to an array", I was glossing over the fact that to be useful a BASED must track a lot more information than just its data's location in memory. In fact, a BASED is stored as a 28-byte structure plus the separately-positioned data for the array. The 28-byte structure is referred to as a "dope vector". In other words, if you had a BASED called (say) MYBASED and you executed the built-in function ADDR(MYBASED), it would return the address of the dope vector for MYBASED. As usual, the HAL/S-FC documentation and source code do not actually provide any useful facts about this setup, but various factoids can be inferred from HAL/S-FC source code, to a greater or lesser degree of confidence, and here are my feeble inferences about the fields of dope vectors. The fields with green highlighting are those of conceivable interest to an application programmer, though all of them are managed entirely transparently in most cases.

Offset 0: FIXED pointer giving the address of the actual data.

Offset 4: BIT(16) giving the size in bytes of each record.

Offset 6: BIT(16) giving the number of CHARACTER or BIT(≥32) fields in each BASED RECORD, or 1 if it's a BASED CHARACTER or BASED BIT(≥32), or 0 otherwise. The point is that it's the number of "string descriptors" associated with each element of the BASED array. This information is used by COMPACTIFY. (See below.) While I don't fully understand the calculations being performed, I'd venture the opinion that for COMPACTIFY to use this information efficiently, it's necessary for the XPL/I compiler to rearrange the fields of BASED RECORD from their declared order in such a way that all of the string-descriptor fields come first in the record.

Offset 8: FIXED giving the total number of array entries for which space has been allocated.

Offset 12: FIXED giving the total number of array entries actually used so far.

Offset 16: FIXED. The dope vectors for all BASED variables for which memory has been allocated form a linked list. The global variable FIRSTRECORD gives the address of the first dope vector on the list, and this field at offset 16 in the dope vector points to the next dope vector, or 0 if there is no next one (or if space hasn't been allocated). The ordering is TBD, since I haven't traced through the code in all of its gory detail, but I believe they are ordered according to decreasing address fields (i.e., the field at offset 0), which at least initially is also the order in which the RECORD_CONST or ALLOCATE_SPACE macros (see below) were executed to allocate memory for them.

Offset 20: FIXED. It appears to me that this field supplies some properties of the BASED in the form of bit fields. It is laid out as follows:

Bit 24 indicates the BASED is "constant", which appears to mean that you cannot incrementally grow it. (The macro NEXT_ELEMENT(based) is used to increase the size of the specified based by 1 record, an operation which fails if the based is "constant".)

Bit 25 indicates the BASED is "unmoveable". If a based is "unmoveable", it means that an operation like NEXT_ELEMENT(based) (see above) will succeed only if there is enough free space immediately following the allocated memory that can be "stolen". If it's not unmoveable, on the other hand, then the based may migrate in its entirety to a newly-allocated block elsewhere, and the space it originally occupied may thus be freed.

Offset 24: BIT(16) of purpose TBD. It is referred to as "global factor".

Offset 26: BIT(16) is referred to as "group factor". As far as I can see, all uses of this are commented out in HAL/S-FC, so perhaps it ended up being unused.

The XCOM-I implementation mimics this dope-vector structure, though only the fields I've highlighted in green are significant in XCOM-I ... which is fortunate, since they're the only ones I imagine I understand somewhat.
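To make the layout concrete, here is a minimal Python sketch that packs and unpacks a 28-byte dope vector at the offsets inferred above. It assumes big-endian storage (as on the IBM 360) in a bytearray standing in for XPL memory. The field names and the helper functions are my own labels, not identifiers from HAL/S-FC or XCOM-I, and the flags word is left as a raw integer because the bit-numbering convention isn't spelled out here.

```python
import struct
from collections import namedtuple

# Big-endian, no padding: 4+2+2+4+4+4+4+2+2 = 28 bytes.
DOPE_FORMAT = ">IHHIIIIHH"

DopeVector = namedtuple("DopeVector", [
    "data_address",      # offset 0:  FIXED, address of the actual data
    "record_width",      # offset 4:  BIT(16), bytes per record
    "descriptor_count",  # offset 6:  BIT(16), string descriptors per record
    "allocated",         # offset 8:  FIXED, records allocated
    "used",              # offset 12: FIXED, records used so far
    "next_dope",         # offset 16: FIXED, next dope vector or 0
    "flags",             # offset 20: FIXED, "constant"/"unmoveable" bits
    "global_factor",     # offset 24: BIT(16), purpose TBD
    "group_factor",      # offset 26: BIT(16), apparently unused
])


def unpack_dope_vector(memory, address):
    """Read the dope vector stored at the given address of memory."""
    return DopeVector._make(struct.unpack_from(DOPE_FORMAT, memory, address))


def pack_dope_vector(memory, address, dope):
    """Write a DopeVector into memory at the given address."""
    struct.pack_into(DOPE_FORMAT, memory, address, *dope)
```

With this model, what ADDR(MYBASED) returns would correspond to the address argument, and the data itself would live wherever data_address points.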

With that discussion in mind, in understanding some of the things that need to happen with BASED variables in actual XPL/I source code (and in particular, in HAL/S-FC), let's consider various space-management macros and/or procedures used:

RECORD_ALLOC(based), used in expressions, returns the number of records allocated in BASED variable based.

RECORD_USED(based), normally used on the left-hand-side in assignments, sets the number of records used so far in based. Its most-common usage is RECORD_USED(based)=RECORD_ALLOC(based), but it can also be used with something other than that on the right-hand side to truncate the array or to skip past the lowest indexes. And it can be used in expressions or conditionals, though that happens relatively seldom.

RECORD_TOP(based), as you might expect, simply returns RECORD_USED(based)-1.

ALLOCATE_SPACE(based, top) allocates enough space for based to ensure that it contains at least top+1 records in total. It will fail if we already have RECORD_ALLOC(based)>0.

NEXT_ELEMENT(based) increments RECORD_USED(based) by 1, stealing the space from adjacent free memory or else reallocating and moving the entire array if necessary to do so. This can only be used if space for based had been allocated by ALLOCATE_SPACE, and will instead cause an abend if based had been allocated by RECORD_CONSTANT (see below).

RECORD_FREE(based) frees the data for based, returning the allocated space to the free pool.

RECORD_SEAL(based), RECORD_UNSEAL(based): Enables or disables the "constant" attribute of the based.

RECORD_CONSTANT(based, top, moveable): Like ALLOCATE_SPACE(based, top), but additionally enables the "constant" property, and optionally enables the "unmoveable" property.

RECORD_WIDTH(based) returns the record width of the based.

RECORD_LINK() prepares the data for transferring COMMON memory to the next program loaded.

User code that initially allocates free memory — let's say 25 records to start with — for a BASED variable is typically a two-step process that looks something like this:

BASED MYVARIABLE FIXED;
...
RECORD_CONSTANT(MYVARIABLE, 25, MOVEABLE); /* OR UNMOVEABLE */
RECORD_USED(MYVARIABLE) = RECORD_ALLOC(MYVARIABLE);

Or, if you knew that you were going to need more elements later, you might allocate a bit extra, for example:

BASED MYBASED FIXED;
...
ALLOCATE_SPACE(MYBASED, 30);
RECORD_USED(MYBASED) = 25;

Having allocated the space for it, you can now use MYBASED just like any other array of FIXED, such as in assignments like "MYBASED(27) = 6;" or "X = MYBASED(N) + 12;".

To actually increase the number of elements later, you'd do something like this:

NEXT_ELEMENT(MYBASED);

This will increment RECORD_USED(MYBASED) by 1 — if possible while still keeping it below RECORD_ALLOC(MYBASED) — or else will reallocate and possibly move MYBASED into a larger space, if possible.
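The bookkeeping performed by these macros can be modeled in a few lines of Python. This is a toy sketch of the behavior described above, not a rendering of SPACELIB: the class Based and the lower-case function names are invented, a Python list stands in for the allocated memory, and the "unmoveable" property is ignored entirely.

```python
class Based:
    """Toy stand-in for a BASED variable of FIXED: a list plus bookkeeping."""

    def __init__(self):
        self.records = []      # allocated storage; its length plays RECORD_ALLOC
        self.used = 0          # plays RECORD_USED
        self.constant = False  # the "constant" property


def record_alloc(based):
    return len(based.records)


def record_top(based):
    return based.used - 1


def allocate_space(based, top):
    """Allocate top+1 records; fails if space was already allocated."""
    if record_alloc(based) > 0:
        raise RuntimeError("space already allocated")
    based.records = [0] * (top + 1)


def record_constant(based, top):
    """Like allocate_space, but the result cannot be grown incrementally."""
    allocate_space(based, top)
    based.constant = True


def next_element(based):
    """Use one more record, growing the allocation only when it is full."""
    if based.constant:
        raise RuntimeError("cannot grow a constant BASED")
    if based.used >= record_alloc(based):
        based.records.append(0)   # models stealing space or reallocating
    based.used += 1


mybased = Based()
allocate_space(mybased, 30)   # room for records 0 through 30
mybased.used = 25             # RECORD_USED(MYBASED) = 25;
next_element(mybased)         # one more record in use, no reallocation needed
```

In this model the allocation only grows once every allocated record is already in use, which is the point of allocating a bit extra up front.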

In a more-complex case, we might have a BASED RECORD variable:

BASED MYNEWBASED RECORD:
    F FIXED,
    C CHARACTER,
    A(9) FIXED,
END;
...
RECORD_CONSTANT(MYNEWBASED, 30, MOVEABLE);
RECORD_USED(MYNEWBASED) = 25;

Accessing MYNEWBASED requires the dotted style often used these days for accessing fields of structures or classes. Some examples include:

MYNEWBASED(6).F = 12;
MYNEWBASED(10).C = 'XPL is where it is at!';
MYNEWBASED(20).A(6) = 15;
X = MYNEWBASED(6).F;

and so on.

The Exception

There is one — count 'em, one! — exception I've found to the documented behavior I've described above. This undocumented use is seen in the IND_STACK variable found in PASS2 of HAL/S-FC. IND_STACK is declared using the following bastardized mash-up of a DECLARE statement and a BASED statement:
   DECLARE IND_STACK(STACK_SIZE) RECORD:
         I_CONST        FIXED,
         I_INX_CON      FIXED, 
         I_STRUCT_CON   FIXED,
         ...
         I_LIVREMT      BIT(8),
         I_NAMEVAR      BIT(8),
         I_STRUCT_WALK  BIT(8),
         I_AIADONE      BIT(8),
      END;
Unlike a BASED declaration, in which specifying an array size at compile time is illegal, or as IR-182-1 states, "ignored if present", the array size is indeed found in this declaration. Nor is there any runtime adjustment to the size, as would be expected with a BASED. Nonetheless, IND_STACK is subsequently accessed by the dotted, structure-type notation used only by BASED variables.

Rather than implement an entirely new but undocumented class of structure objects to handle this single instance, XCOM-I implements this case as if it were a BASED declaration followed by an appropriate RECORD_CONSTANT operation. IND_STACK is one of the very few objects stored in region 7 of the XPL memory model, and by far the largest of them.

LITERALLY and Macros

Another attribute that can appear in DECLARE statements is the LITERALLY attribute. Here's an example:

DECLARE ARRAYTOP LITERALLY '255';
DECLARE MYARRAY(ARRAYTOP) FIXED;

Notice that ARRAYTOP has no datatype assigned to it. That's because its declaration is not actually the declaration of a variable called "ARRAYTOP", but rather of a macro of that name. Wherever the identifier ARRAYTOP is encountered subsequently, it's simply replaced literally by the string 255, now unquoted:

DECLARE MYARRAY(255) FIXED;

This clarifies an example of a commonly-desirable declaration I gave in the preceding section, which in this section would be expressed as:

DECLARE RECSIZE LITERALLY '3600';
DECLARE BUFFER(RECSIZE-1) BIT(8);

As noted in the preceding section, standard XPL's grammar wouldn't allow an expression (like RECSIZE-1) in that context, so this particular convenience in making declarations is only available in XPL/I.

Macro expansions — not macro declarations (I hope!) — can be nested, so you can do things like this if you want:

DECLARE DEVICE LITERALLY '6', OUT LITERALLY 'OUTPUT(DEVICE)';
OUT = 'My message';

This expands to

OUTPUT(6) = 'My message';

Macros can expand to portions of statements, as the ones above have, or to multiple statements, such as

DECLARE MYBLOCK LITERALLY 'DO; X=1; Y=X+3; END';
...
IF X=7 THEN
    MYBLOCK;

which expands to:

IF X=7 THEN
    DO; X=1; Y=X+3; END;

Macros can also have arguments. Consider the following:

DECLARE MYMAC(2) LITERALLY '%1% = %2%';

This declaration means that MYMAC has 2 arguments, and that when the macro is expanded, the first argument will replace %1% and the second argument will replace %2%. Thus "MYMAC(X, 3 * Y)" expands to "X = 3 * Y".
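Here is a hedged Python sketch of this kind of textual expansion, covering both nested macros and macros with arguments. It is not XCOM-I's algorithm: the names expand and split_args and the macro-table format are invented for the illustration, quoted strings and XPL's case-insensitivity are not treated specially, and the depth limit is my own addition to keep a runaway expansion from looping forever.

```python
import re

IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def split_args(text, start):
    """text[start] is '('; return (arguments, index just past the matching ')')."""
    depth = 0
    args, current = [], []
    for i in range(start, len(text)):
        ch = text[i]
        if ch == "(":
            depth += 1
            if depth > 1:
                current.append(ch)
        elif ch == ")":
            depth -= 1
            if depth == 0:
                args.append("".join(current).strip())
                return args, i + 1
            current.append(ch)
        elif ch == "," and depth == 1:
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    raise ValueError("unbalanced parentheses in macro call")


def expand(text, macros, depth=0, max_depth=32):
    """Expand macros in text; macros maps NAME -> (argument count, body)."""
    if depth > max_depth:
        raise RecursionError("macro expansion nested too deeply")
    out, pos = [], 0
    while True:
        m = IDENT.search(text, pos)
        if m is None:
            out.append(text[pos:])
            return "".join(out)
        out.append(text[pos:m.start()])
        name, pos = m.group(0), m.end()
        if name not in macros:
            out.append(name)
            continue
        nargs, body = macros[name]
        if nargs > 0:
            j = pos
            while j < len(text) and text[j] == " ":
                j += 1
            if j >= len(text) or text[j] != "(":
                raise ValueError(f"{name} requires {nargs} argument(s)")
            args, pos = split_args(text, j)
            if len(args) != nargs:
                raise ValueError(f"{name} requires {nargs} argument(s)")
            for k, arg in enumerate(args, start=1):
                body = body.replace(f"%{k}%", arg)
        # The replacement text is itself rescanned, which is what nests macros.
        out.append(expand(body, macros, depth + 1, max_depth))
```

Given a table such as {"DEVICE": (0, "6"), "OUT": (0, "OUTPUT(DEVICE)"), "MYMAC": (2, "%1% = %2%")}, calling expand on "MYMAC(X, 3 * Y)" yields "X = 3 * Y".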

Warning: As with macros in any other computer language, this can quickly get out of hand. XCOM-I, for example, won't detect recursive, endlessly-expanding macros. There's also no guarantee when multiple macros are in play that XCOM-I will necessarily expand macros in the same order that XCOM would have. Neither McKeeman nor Intermetrics documentation makes any mention of what that ordering should be.

Warning: The scope of macro definitions is also different in XPL vs XPL/I. In XPL, macro definitions don't respect any nested scopes they appear in; i.e., any macro definition will simply remain in effect until the end of the source code. In XPL/I, macro definitions remain in effect only until the end of the procedure in which they're defined, including embedded procedures. In neither case does a macro definition have any effect on source code prior to it.

Perplexing Multiple Assignments

XPL allows multiple variables to be assigned the same value in a single assignment statement, with a syntax like:

X1, X2, ..., XN = Y;

According to McKeeman (p. 137), these assignments are performed in right-to-left order. In other words, it should be equivalent to:

XN = Y;
.
.
.
X1 = Y;

In most cases, the ordering of these assignments is of no significance, and in fact I've found no legacy standard XPL programs in which the ordering matters. However, there are instances in XPL/I code, specifically in PASS1 of HAL/S-FC, in which the order matters a lot. That's when you have assignments of the form:

Y(I), I = J;

because obviously

I = J;
Y(I) = J;

is a lot different than

Y(I) = J;
I = J;

Unfortunately, in the cases I've encountered, the XPL/I code seems to rely on the latter interpretation. I.e., it seems to believe that the assignments are performed in left-to-right order, in direct contradiction to McKeeman. What's going on here?

Of course, we have no way of knowing what Intermetrics XCOM did with this. But as far as A Compiler Generator's XCOM is concerned, it appears (thanks to Dan Weaver for this explanation) that while the assignments are indeed performed in a right-to-left manner as documented, the peculiarities of the IBM 360 object-code generation in McKeeman XCOM shove the index I into a CPU register and reuse it without change throughout both assignments. So seemingly by accident, the net result is that the assignments appear to have been done in a left-to-right order.

I can only regard this as a bug in McKeeman XCOM. Plus, it's not reliable: what if there are more than 2 assignments on the left-hand side, and if they don't all use the same index I? What would the order appear to be then?

As a result of all that:

Important Note: XCOM-I ignores the statement in A Compiler Generator that assignments are performed in a right-to-left manner. Rather, they are performed in a left-to-right manner, and any indices of arrays are computed at the moment the assignment is performed.
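The difference between the two orderings is easy to demonstrate with a small Python model of the statement Y(I), I = J. The function multiple_assign and its target notation are invented for the illustration; the only point is that each array index is looked up at the moment its own assignment is performed.

```python
def multiple_assign(env, targets, value, left_to_right=True):
    """Assign value to each target in turn.

    env maps variable names to values (lists for arrays).  A target is
    either ("var", name) or ("elem", array_name, index_variable_name).
    The index variable is read when that particular assignment happens.
    """
    ordered = targets if left_to_right else list(reversed(targets))
    for target in ordered:
        if target[0] == "var":
            env[target[1]] = value
        else:
            _, array_name, index_name = target
            env[array_name][env[index_name]] = value
    return env


TARGETS = [("elem", "Y", "I"), ("var", "I")]   # models  Y(I), I = J;

# Left-to-right: Y(1) receives the value, then I changes.
ltr = multiple_assign({"Y": [0] * 5, "I": 1}, TARGETS, 3, left_to_right=True)

# Right-to-left: I changes first, so Y(3) receives the value.
rtl = multiple_assign({"Y": [0] * 5, "I": 1}, TARGETS, 3, left_to_right=False)
```

Both orderings leave I equal to J, but they store into different elements of Y, which is why the choice matters for code like that in PASS1.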

Logical Expressions

XPL's logical operators are &, |, and ~ (¬, ^), for "and", "or", and "not", respectively. The documentation in A Compiler Generator is maddeningly unclear as to what these operators do. True, table 6.8.1 calls them "logical and", "logical or", and "logical complement", but the word "logical" isn't defined ... just as my sloppy usage of the word "logical" at the beginning of this paragraph makes no distinction. Which leaves open a few loopholes that have to be closed up. The issues which we must understand are:

Are these operations bitwise or narrowly "logical"? I.e., do they operate in parallel on each bit position in a numerical value, or do they simply produce bipolar results of 1 (true) or 0 (false)?
Do these operations short circuit? E.g., if we have an expression of the form (say) expression1&expression2, and expression1 evaluates to 0, is expression2 even evaluated?

In case you're not in the mood for a technical discussion of the matter, I'll give you the short answer up front, and having read that, you can proceed to the long discussion if you want:

The logical operations are indeed bitwise.
Under some limited circumstances, the original XCOM may have short-circuited evaluation of logical operations, even though there's no short-circuiting observed in the examples I've constructed. XCOM-I does not at present attempt any short-circuiting.

As for how I came up with this "information", there are several places we can look for guidance in guessing the answers. For one thing, according to McKeeman's account, the XPL language was derived from the PL/I language, so we can look at PL/I documentation and hope that it applies to XPL. Of course, we can look at the source code for McKeeman's XPL compiler (XCOM) as listed in the book, and see if there are any hints there. Or we can even examine the IBM 360 object code that XCOM generates for these operators. (All the while wondering how things came to this, that we have to resort to lame measures like consulting object code to figure out the basic features of the language?)

As far as PL/I is concerned, IBM's PL/I Language Reference (2017) tells us on p. 66 that for the &, |, and ¬ operators, "bit operations are performed on a bit-by-bit basis". As far as object code produced by XCOM is concerned, McKeeman (p. 150) shows an example in which object code for the expression "SHL(K,1) & SHR(I,J)" is produced, and we do find that it simply uses the IBM 360 NR ("And Logical") instruction.

Although I had to consult more than one IBM assembly-language manual to find the answer to the seemingly-simple question of what NR does, IBM's z/Architecture Principles of Operation (p. A-8) does tell us that the NR (and its cousins N, NC, and NI) are indeed bitwise operations.

Aside: Figure B-2, "Instructions Arranged by Mnemonic" of the latter document is very helpful in trying to decipher such listings of IBM 360 object code.

Short-circuiting is a natural consideration for strictly bipolar operands and operators, but is a bit trickier to consider once we've concluded that the logical operators operate bitwise rather than in a bipolar fashion. Certainly the object-code example from A Compiler Generator that was mentioned earlier shows no signs at all of short circuiting: Both of the operands of the & operator in that example are evaluated, with no attempt at checking the value of the first operand before proceeding to the second one. On the other hand, that example of object-code generation by XCOM happens to be for an assignment statement rather than for the conditional expression of an IF, DO WHILE, or DO UNTIL. Perhaps the evaluation of a conditional expression might be very different in those contexts. One reason to believe that it might be different is that the final result of a conditional expression is masked to just the least-significant bit, and thus (eventually) is indeed a bipolar value; i.e., even if all of the bits were involved in the computation, all but one of them is discarded in the end, so perhaps the extra bits are discarded at the beginning rather than at the end of the computation, even though it's more work to do so. Moreover, the PL/I Language Reference document mentioned earlier does cover short-circuit evaluation (see p. 245), and it says that short-circuiting applies only in the context of the conditional of an IF statement (versus assignment statements). Plus, even then the short-circuiting occurs only in certain special circumstances, such as the leading operand being a BIT(1) literal or constant variable, which leads one to believe that the value of the leading operand has to be determined to be 0 or 1 at compile-time rather than at run-time for the short-circuiting to occur.

Unfortunately, the example of object-code generation in McKeeman doesn't show us how an IF statement would compile. But as we saw earlier, we have been able to use XCOM-I to create a working copy of McKeeman's XCOM, so we can make our own example of IF, compile it with XCOM and see! Imagine we have the following ridiculous little XPL program:

 DECLARE I FIXED;
 DO I = 1 TO 10;
     IF (I * I) & (100 - I * I) THEN OUTPUT = 'hello';
 END;
 DO I = 1 TO 10;
     IF 0 & (100 - I * I) THEN OUTPUT = 'hello';
 END;
EOF

Aside: By the way, compiling an XPL program with McKeeman's XCOM is a bit more constrained than compiling a similar program with XCOM-I. For one thing, to avoid an irritating if harmless warning message, the EOF token must be present at the end of the source code, whereas XCOM-I doesn't care. For another, even though XPL is case-insensitive other than inside of quoted strings, and even though all of the XPL source code in A Compiler Generator is printed in lower case, XCOM will in fact choke on any XPL source code that isn't fully upper case. Go figure!

Compiling this silly program with XCOM, and pulling just the relevant portion of the XCOM's report gives us the following IBM 360 object code for the conditional expressions of the IF statements:

  20 |    IF (I * I) & (100 - I * I) THEN OUTPUT = 'hello';                           |  1314 C7 = 10.
                                                                      1314: CODE = L    1,1340(0,11)
                                                                      1318: CODE = M    0,1340(0,11)
                                                                      1322: CODE = L    2,1340(0,11)
                                                                      1326: CODE = L    3,1340(0,11)
                                                                      1330: CODE = MR   2,2
                                                                      1332: CODE = LA   2,100(0,0)
                                                                      1336: CODE = SR   2,3
                                                                      1338: CODE = NR   1,2
                                                                      1340: CODE = N    1,164(0,11)
                                                                      ...
 24 |    IF 0 & (100 - I * I) THEN OUTPUT = 'hello';                                 |  1400
                                                                      1400: CODE = L    1,1340(0,11)
                                                                      1404: CODE = M    0,1340(0,11)
                                                                      1408: CODE = LA   2,100(0,0)
                                                                      1412: CODE = SR   2,1
                                                                      1414: CODE = N    2,1300(0,1048571)
                                                                      1418: CODE = N    2,164(0,11)
                                                                      ...

Not shown above is that the symbol table tells us variable I is stored at address 1340(11), which is why all of the 1340(0,11)'s appear above. I don't understand IBM 360 assembly language, but what I think the code probably does is:

The first IF:

1314 through 1318: Compute the left-hand operand of the & operator; i.e., I*I.
1322 through 1336: Compute the right-hand operand of the & operator; i.e., 100-I*I.
1338: Perform a bitwise-AND of the two operands.
1340: Mask off all but the least-significant bit.

The second IF:

1400-1412: Compute the right-hand operand of the & operator; i.e., 100-I*I.
1414: Perform a bitwise-AND with 0. Admittedly, I'm not quite sure why it would be 0 it's AND'ing with, but it's certainly AND'ing with something.
1418: Mask off all but the least-significant bit.

But whether or not my interpretation is 100% correct, at least in this example there's no evidence of short-circuiting. The 2nd IF in particular is pretty shocking. Perhaps there's supposed to be some subsequent optimization I'm not aware of that would have cleaned it up.
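To see what the bitwise-then-mask interpretation implies for the little test program, here is a Python sketch that mimics it. The function xpl_condition is an invented name, and the model rests on my reading of the object code above (a bitwise AND followed by masking to the least-significant bit), so treat the results as a consequence of that reading rather than as verified XCOM output.

```python
def xpl_condition(value):
    """Truth test as read from the object code: only the lowest bit counts."""
    return (value & 1) == 1


# First IF: bitwise AND of both operands, then mask to the lowest bit.
bitwise_hits = [i for i in range(1, 11)
                if xpl_condition((i * i) & (100 - i * i))]

# For contrast, a strictly "bipolar" reading: both operands merely nonzero.
bipolar_hits = [i for i in range(1, 11)
                if (i * i != 0) and (100 - i * i != 0)]

# Second IF: AND'ing with 0 can never leave the lowest bit set.
second_if_hits = [i for i in range(1, 11)
                  if xpl_condition(0 & (100 - i * i))]
```

Under the bitwise reading the first loop prints 'hello' only for odd I (1, 3, 5, 7, 9), since both operands are then odd; under a bipolar reading it would print for every I from 1 through 9. The second loop never prints under either reading.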

COMMON Memory

XPL/I also introduces the notion of COMMON memory, not present in XPL. The notion behind COMMON memory is that a very large application program like HAL/S-FC won't be loaded entirely in memory at once, but will instead be run as a sequence of "passes". COMMON memory is not supported in standard XPL, and compiling a program with XCOM-I's --xpl switch will disable the COMMON memory features described in this section.

Thus, HAL/S-FC isn't a single application program, but rather a set of them: PASS1, FLO, OPT, AUX, PASS2, PASS3, and PASS4. Each of these applications is loaded, run, and unloaded from memory, in succession.

But! Each of these application programs may receive some kind of input data or state data from the preceding application program, and transmit output data or state data to the next application program in succession. In XPL/I's visualization, some of that data is passed in files. But other data is instead just assumed to remain in computer memory, unchanged from whatever the preceding application has left behind. The term XPL/I applies to this leftover memory is COMMON memory. It's formalized when you explicitly declare variables as being in COMMON. Variables declared to be in COMMON are not initialized by an XPL/I program, except for the very first program in a chain of programs, but are simply assumed to already contain the data needed. On the other hand, variables not declared COMMON are up for grabs, and no assumption can be made about their initial contents other than whatever initialization their declarations explicitly provide.

Syntactically, COMMON data is declared in XPL/I by three methods:

Using the keyword COMMON in place of the keyword DECLARE.
Using the keyword-pair COMMON ARRAY in place of the keyword ARRAY.
Using the keyword-pair COMMON BASED in place of the keyword BASED.

Another distinction is that CHARACTER variables cannot be declared in COMMON, though CHARACTER variables can appear as fields in COMMON BASED RECORD variables. XCOM-I actually relaxes this restriction.

It's not documented anywhere, as far as I know, but I would assume that there was originally an expectation that each cooperating application running in succession needed to declare COMMON in exactly the same way, using exactly the same ordering of variables and the same datatypes. XCOM-I relaxes this restriction to a certain extent, but the data of BASED variables always resides at the same memory addresses in adjacent programs of a chain. When "dope vectors" of BASED variables were discussed earlier, it was mentioned that BASED variables could be set as "constant" or "unmoveable", and this necessity for remaining at the same location when successor programs are executed is part of the reason for this feature.

Aside: If BASED variables were always manipulated by XPL/I programs as intended by the original compiler design (i.e., allowing SPACELIB to manipulate them via their dope vectors) there would be no reason for these restrictions on moveability. However, the "virtual memory" system employed by HAL/S-FC, as implemented by the files HALINCL/VMEMx.xpl, bypasses the dope-vector system. Specifically, "pages" of memory managed by the virtual-memory system are tracked only by an array of addresses (rather than dope vectors). But SPACELIB manages dope vectors, and has no cognizance of arrays of addresses whose interpretation exists only in the mind of a programmer. It is therefore necessary to make sure that the virtual-memory system's pages of memory never move. One might ask the programmer why they didn't use arrays of dope vectors rather than arrays of addresses? But that might be interpreted as being rude. On the other hand, I've known enough engineers to be pretty sure they'd think that since it worked for them, that was good enough.

Aside: Speaking of the relationship between BASED variables and COMMON memory, it's important to note the following points:

For BASED variables appearing in COMMON, space is allocated for them via RECORD_CONSTANT or ALLOCATE_SPACE only in the first of the chained programs that needs to use them. Subsequent programs in the chain use them as-is, without allocation.

Memory for BASED variables not appearing in COMMON must be explicitly freed via RECORD_FREE prior to exiting whatever program of the chain allocated their memory. If this is not done, then the memory-management system (Intermetrics SPACELIB) will abort the program with an error and fail to prepare the COMMON data for use by the next program in the chain.

Of course, XCOM-I makes no effort at all to pass COMMON data from one application to another using actual memory. Rather, each XPL/I application program compiled by XCOM-I can optionally (depending on its command-line options) load a file containing data into its COMMON areas of memory; and similarly, it automatically writes out its COMMON areas of memory into a file upon termination. By using the --commoni and --commono command-line switches of the application, a close degree of control can be exercised over which previously-saved COMMON blocks, if any, are passed to which application programs. The command-line options are needed because by default, an executable produced by XCOM-I does not read in a COMMON file at startup, and outputs a file literally called "COMMON.out" upon termination. The command-line options override either or both of those defaults.

COMMON files are in a human-readable format. See the comments for the writeEntryCommon function in the runtime-library file runtimeC.c if you have any interest in the file format. It's actually pretty useful for debugging certain kinds of problems.

Memory Model for a Compiled XPL Program

The theoretical memory space available at runtime for a compiled XPL program is 2²⁴=16,777,216 bytes in size, although the Wikipedia article on IBM System/360 tells us that the actual physical maximum was only 8 MB. In the C object code generated by XCOM-I, this is represented by the byte array called memory.

For the original XPL and XPL/I compilers, the lowest portion and highest portion of this (theoretical) 16 MB space were dedicated to the executable code for the program and the data used by it. The middle of the area was used for the program's data. But in the XCOM-I framework, all of the program code is stored elsewhere, thus the entire 16 MB space can be dedicated just for the variables actually DECLARE'd in the XPL source code ... plus those few elements of data which the operating system needs to communicate to the program, such as run-time program options, and a few other items I'll talk about in a moment.

The memory nevertheless still needs to be partitioned into blocks dedicated to specific types of data, in order to facilitate management of dynamic data like CHARACTER or BIT(n>32) strings and BASED variables. I try to follow the same partitioning scheme as used by the original compilers, at least roughly. The broad outline of this partitioning scheme is seen in the diagram to the right.

Regions 4 and 6 naturally did not exist in XPL, since BASED variables are an XPL/I extension to standard XPL. So when XCOM-I is run with the --xpl command-line switch, those regions both have fixed sizes of 0.

Mostly, the boundaries of these memory regions are established at compile-time, by which I mean they're established by XCOM-I and don't change thereafter. The exceptions are the boundaries between regions 5, 6, and 7. As you may recall, at program startup, BASED variables have already been DECLARE'd and XCOM-I has thus made space for their dope vectors (region 4), but considered as arrays their sizes are 0. At program start, region 6 is empty. But memory region 6 grows downward when BASED variables are allocated memory at runtime, simultaneously shrinking the free-string memory region 5. String data meanwhile occupies the space from FREEBASE to FREEPOINT in region 5, which means that the space between FREEPOINT and FREELIMIT is available both for growing the string data upward (thus increasing FREEPOINT) or growing the BASED data downward (thus decreasing FREELIMIT).
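The tug-of-war between string data and BASED data over the same free space can be sketched in Python as follows. The class FreeArea and its method names are invented, the addresses used in any example are arbitrary, and real string storage is also subject to COMPACTIFY, which this toy model ignores.

```python
class FreeArea:
    """Toy model of the space shared by string data and BASED data."""

    def __init__(self, freebase, freelimit):
        self.freebase = freebase     # bottom of the string data
        self.freepoint = freebase    # first byte above the string data
        self.freelimit = freelimit   # first byte of the BASED data above it

    def free_bytes(self):
        return self.freelimit - self.freepoint

    def add_string(self, size):
        """Grow the string data upward; return the new string's address."""
        if size > self.free_bytes():
            raise MemoryError("no room for string data")
        address = self.freepoint
        self.freepoint += size
        return address

    def allocate_based(self, size):
        """Grow the BASED data downward; return the new block's address."""
        if size > self.free_bytes():
            raise MemoryError("no room for BASED data")
        self.freelimit -= size
        return self.freelimit
```

Each kind of allocation eats into the same gap from opposite ends, so either one can fail once FREEPOINT and FREELIMIT meet.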

Aside: It would seem reasonable, at least to me, that if BASED allocations proceed downward in memory, then the indices of the BASED should proceed downward as well. I mean that if we had a declaration such as BASED MYBASED, then MYBASED(0) would be at a higher address than MYBASED(1), which would be at a higher address than MYBASED(2), and so on. If that were the case, then growing MYBASED after its initial allocation would be more efficient (if it happened to border on unallocated space), since it just involves snatching an unallocated chunk of memory rather than moving the entire array. However, if I am reading the code in SPACELIB correctly this does not appear to be the case: Indices of BASED variables seem to increase upward in memory, just as they do in any other XPL/I array.

Region 7 is a bit trickier to explain. It is an area dedicated to data which XCOM-I wants to use for XPL objects it creates at compile time that need to be accessible to the XPL program as normal XPL variables, but that need to be exempt from the normal memory-management procedures applied to regions 5 and 6. These are items which the original XPL compilers would have stored in regions 1 through 6, but that the different implementation of XCOM-I would have made difficult. XCOM-I allocates this area and the objects in it at compile-time. Though shown as beginning at 0xFFE000, that's misleading. The 0xFFE000 limit is the default, and is appropriate for compiling HAL/S-FC, but in fact is controlled by XCOM-I's --reserved=N (default 8192) command-line option. There will be compile-time messages if the size of the reserved memory needs to be increased. It's important to note that all XPL programs sharing the same COMMON memory need to have the same size for region 7. Specifically, all passes of HAL/S-FC need to have the same size of reserved memory, and that's why this particular default value for it was chosen.

Aside: In fact, the use of reserved memory has proven to be extremely rare. The only instances of use, to date, have been in PASS2 and PASS4 of HAL/S-FC, each of which stores a single object declared in a previously-unencountered, idiosyncratic way.

**Memory Layout**
Addresses	Description	Region Number
`0xFFFFFF 0xFFE000`	Reserved memory	7
	Data pointed to by "dope vectors" of `BASED`	6
`FREELIMIT`	512-byte zone	5z
`FREEPOINT FREEBASE`	Data pointed to by "string descriptors" of `CHARACTER` or `BIT(>32)`	5
	`BASED` dope vectors for non-`COMMON`	4
`DESCRIPTOR`	`CHARACTER` or `BIT(>32)` string descriptors	3
	`FIXED`, `BIT(≤32)` , and `BASED` dope vectors for non-`COMMON`	2
	`FIXED`, `BIT(≤32)` , and `BASED` dope vectors for `COMMON`	1
`0x000000`	Data supplied by `MONITOR(13)` and `MONITOR(23)`	0

Understanding COMPACTIFY

At runtime, changes to sizes of BASED arrays may cause those arrays to move around within memory region 6 (see the preceding section). Depending on the type of changes, this can cause "holes" of unused memory to develop in memory region 6. Similarly, operations on CHARACTER variables such as INPUT or string concatenation (||) can cause holes of unused memory to develop in memory region 5. As long as FREEPOINT is comfortably less than FREELIMIT, these holes don't cause any problem and can just be ignored. However, it may eventually become necessary to repack these memory regions to consolidate the free space and eliminate the holes.

As far as memory region 6 is concerned, that's handled transparently by the tools already discussed in the section on BASED variables earlier, and won't be discussed here. The COMPACTIFY procedure which handles this for memory region 5, unfortunately, does require some clarification, even though you typically don't need to explicitly call COMPACTIFY yourself, and can assume it will just be called automatically whenever needed.

Most importantly, while COMPACTIFY doesn't handle memory management of BASED variables, the existence of based variables does affect how COMPACTIFY operates, with the result that different versions of COMPACTIFY must be used for XPL code than for XPL/I code. But there's a quirk in XPL's implementation: While COMPACTIFY is considered a "built-in" function of the XPL runtime library, in point of fact it's not present in the XPL (or XPL/I) runtime library, and must instead be explicitly provided in the form of XPL source code.

Fortunately, that doesn't entail any effort on your part, since XCOM-I can usually figure out what to do on its own. But still, it may be helpful to understand what's going on behind the scenes, for those cases in which you need to intervene. The XCOM-I approach to the inclusion of COMPACTIFY is that prior to loading any of the XPL or XPL/I source-code files you explicitly specify, it automatically preloads a "library file", which is an XPL or XPL/I source-code file containing at least the source code for COMPACTIFY. The library file it chooses is governed by the following rules:

By default, it chooses SPACELIB.xpl, which is a duplicate of the Intermetrics file of the same name provided with the XPL/I source code for the HAL/S-FC program. It should be good for compiling all XPL/I source code.
But if the XCOM-I command-line switch --xpl is used, the default library changes to XPL.LIBRARY.xpl. This is a duplicate of the library of the same name provided with the source code of McKeeman's XCOM program. It should be good for compiling all standard XPL source code.
But if the XCOM-I command-line switch --lib-file=FILENAME is used, then FILENAME is used in place of the default library file. This gives you the option of using some other version of COMPACTIFY, perhaps experimenting with it yourself. If used along with an --xpl switch, then --lib-file must follow --xpl on the command line.
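
The three rules above can be summarized in a few lines of Python. This is a sketch only and is not taken from XCOM-I's source: the function name choose_library is hypothetical, and processing the switches strictly left to right is my assumption, offered as one plausible reason why --lib-file has to follow --xpl.

```python
def choose_library(argv):
    """Pick the library file from a list of command-line switches.

    Assumption: switches are processed left to right, so a later
    switch overrides the effect of an earlier one.
    """
    library = "SPACELIB.xpl"                  # default, for XPL/I
    for arg in argv:
        if arg == "--xpl":
            library = "XPL.LIBRARY.xpl"       # default for standard XPL
        elif arg.startswith("--lib-file="):
            library = arg[len("--lib-file="):]
    return library


print(choose_library([]))
print(choose_library(["--xpl"]))
print(choose_library(["--xpl", "--lib-file=MYLIB.xpl"]))
```

Under that left-to-right assumption, putting --lib-file before --xpl would let --xpl silently restore the default library, which is consistent with the ordering requirement stated above.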

That's all you really need to know, and perhaps more. Unfortunately, I need to know more about it, in order to make it work in XCOM-I, and here's as good a place as any to write down some of that need-to-know info!

The principal difficulty is that while XPL.LIBRARY.xpl works immediately for XPL programs, as-is, as long as the memory partitioning described in the preceding section is compatible enough with McKeeman, the same cannot be said of SPACELIB.xpl for XPL/I programs.

In brief, here's how the XPL COMPACTIFY of XPL.LIBRARY.xpl works. All of the "string descriptors" (providing the addresses and sizes of the character strings in region 5 which need to be repacked) are found, contiguously, in memory region 3, which they completely fill. Because of this convenient arrangement, the string descriptors can all be accessed as if they were a single FIXED array. Standard XPL's built-in DESCRIPTOR and NDESCRIPT respectively give the location and size of this array of string descriptors. Repacking region 5 is a relatively-simple matter of sorting the DESCRIPTOR array by address, using the sorted array to find the "holes", then moving the string data downward in region 5 to eliminate the holes, and finally updating the original descriptors in the DESCRIPTOR array to preserve the strings but to change the addresses.

The only mild complication in the description above is that the DESCRIPTOR array can't be sorted in place. Rather, a separate scratch array is needed to help out the process:

DECLARE DX_SIZE LITERALLY '500', DX(DX_SIZE) BIT(16);

This DX array is set up to contain the indices into the DESCRIPTOR array, and it's DX that's sorted rather than DESCRIPTOR. This is undoubtedly a good strategy, although it limits the total number of DECLARE'd CHARACTER variables to 500. On the other hand, the limit of 500 is completely arbitrary, and if you have enough memory (we do!) there's no reason it couldn't be increased as high as 64K. And by the simple expedient of declaring DX as FIXED rather than BIT(16), the limit could be made much higher.
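
To make the repacking procedure concrete, here is a minimal Python sketch of the idea. It is not a translation of the COMPACTIFY in XPL.LIBRARY.xpl. I've simplified each string descriptor to an [address, length] pair (real descriptors pack this information into a single word), I treat a length of 0 as "no string", and I assume no two strings share storage. The names compactify, memory, and descriptors are mine.

```python
def compactify(memory, descriptors, freebase):
    """Repack string data toward freebase; return the new FREEPOINT.

    memory      -- bytearray standing in for region 5
    descriptors -- list of [address, length] pairs standing in for region 3
    """
    # DX holds indices into the descriptor array; DX is what gets sorted,
    # so the descriptors themselves stay where they are.
    dx = [i for i, (addr, length) in enumerate(descriptors) if length > 0]
    dx.sort(key=lambda i: descriptors[i][0])

    target = freebase
    for i in dx:
        addr, length = descriptors[i]
        # Strings are visited in ascending address order and only ever move
        # downward, so data that hasn't been moved yet is never overwritten.
        memory[target:target + length] = memory[addr:addr + length]
        descriptors[i][0] = target          # same string, new address
        target += length
    return target


memory = bytearray(b"..HELLO...WORLD....XPL.")   # '.' marks a hole
descriptors = [[10, 5], [2, 5], [19, 3]]         # WORLD, HELLO, XPL
freepoint = compactify(memory, descriptors, freebase=0)
print(freepoint, bytes(memory[:freepoint]), descriptors)
```

After the call, every descriptor still refers to the same text, but the holes have been squeezed out and all free space lies above the returned FREEPOINT.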

Unfortunately, the relatively-simple processing fails for XPL/I code (or at least, for any XPL/I code having BASED variables containing CHARACTER data or BIT(>32) data) because no longer are all of the string descriptors nicely lined up for us in memory region 3. Rather, many of them are smeared all over memory region 6. Nor is a limit of 500 strings remotely adequate for compiling programs like HAL/S-FC. Here is what I understand of the differences between how the XPL/I COMPACTIFY has to operate, as compared to the description I just gave of the XPL COMPACTIFY:

While the built-ins called DESCRIPTOR and NDESCRIPT still exist, and still provide exactly the same info about memory region 3, they are no longer relevant to COMPACTIFY.
DX is now a BASED FIXED, because it contains pointers to the string descriptors (i.e. it contains memory addresses of the string descriptors) rather than just indices into region 3.
DX_SIZE gives the number of elements of DX, but as it's nowhere DECLARE'd in the library, I suppose it must be an undocumented XPL/I built-in.

Aside: I'd venture the guess that DX_SIZE is initially set and space for DX is initially allocated by the XPL/I compiler to handle the string descriptors which are DECLARE'd and therefore known at compile time. At runtime, SPACELIB would then take over the task of maintaining DX and DX_SIZE to handle additional string descriptors that come into existence (or leave it) when BASED variables containing them grow (or shrink).

TBD

Structure of an XPL Program vs XPL/I

An XPL program consists of any sequence of XPL statements, followed by the token EOF. In particular:

Declarations of variables can be intermixed with active statements such as assignments.
Active code can exist at the global level, outside of any PROCEDUREs.

Note: The original XPL compiler, which was called XCOM, performed a single pass. It required that the declaration of any particular identifier as an object (such as a variable) had to precede the use of that identifier, although there were provisions for making a forward declaration for a PROCEDURE, so that the PROCEDURE could be used before it was defined. XCOM-I relaxes this requirement.

Each of the sample programs I've encountered in standard XPL so far has been contained in a single relatively-small file. For example, ANALYZER has a little over 1500 lines of source code, while XCOM has a little over 4200 lines.

In contrast, the XPL/I source code for Intermetrics's HAL/S compiler HAL/S-FC has over 120,000 lines of source code spread across over 600 files, though any individual pass of the compiler has no more than around 35,000 lines. This huge size, along with the huge difference from programs in standard XPL, necessitates different methods for managing that source-code base, and some of those methods are reflected by compiler directives embedded within the source code. Insofar as HAL/S-FC and its related applications are concerned, the top-level source-code file (##DRIVER.xpl) for each application always contains all of the necessary directives for compiling the other source-code files needed, in the correct order, so in using XCOM-I to compile these applications you don't need to worry about any file other than ##DRIVER.xpl itself.

Aside: Well, the comment about ##DRIVER.xpl isn't exactly right. Any XPL or XPL/I program will expect that there's a separate "library file" containing source code for the COMPACTIFY procedure, but the XPL/I source code for the program won't explicitly include the library file. That's the compiler's responsibility.

Aside: Due to the lack of relevant Intermetrics documentation, what I'm about to describe is not only speculative on my part, but also represents certain pragmatic compromises that I don't believe literally existed in Intermetrics' XPL compiler or development procedures. But if it will work for us using XCOM-I on HAL/S-FC and if there are no other lurking XPL/I programs that we need to worry about, why complain?

Compiler directives in XPL/I are comments or comment-like constructions which aren't documented in McKeeman and have no other obvious purpose. The specifics are covered by the subsections below.

Compiler Directive Type: `/?c ... XPL/I source code ... ?/`

This type of compiler directive is a conditional inclusion of source code, similar to the C language's #if c. Here, c is supposed to be an upper-case alphabetical letter that represents the particular condition that needs to be "true". All conditions, A through Z, are by default "false". You make condition c "true" by using XCOM-I's --cond=c command-line switch.

Interpretations of the possible conditions are entirely at the user's disposal; i.e., they vary by the particular source-code being compiled by XCOM-I. In the particular case of the XPL/I source code for the HAL/S-FC program, there are 4 possible conditions c that I'm aware of:

**Preprocessor Conditionals**
XCOM-I Command-Line Switch	Interpretation
`--cond=P`	HAL/S-FC will be specialized for compiling the Space Shuttle's Primary Flight Software (PFS).
`--cond=B`	HAL/S-FC will be specialized for compiling the Space Shuttle's Backup Flight Software (BFS). Note: Either `--cond=P` or `--cond=B` must be used, but not both at the same time.
`--cond=A`	Produce debugging output related to memory management of `BASED` variables.
`--cond=C`	Produce debugging output related to actions by the `COMPACTIFY` procedure.

Aside: This implies that you don't just compile HAL/S-FC once to get a HAL/S compiler that works for all HAL/S programs. Rather, you compile HAL/S-FC twice, once to get a version of the HAL/S compiler that works for the primary flight software, and once to get a version of the compiler that works for the backup flight software.
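
The effect of the /?c ... ?/ directive can be sketched as a simple text filter in Python. This illustrates the behavior described above and is not XCOM-I's actual implementation: the function name apply_conditionals is hypothetical, and I'm assuming that directives don't nest and that a "false" block is simply discarded.

```python
import re

# /?c ... ?/ where c is a single upper-case letter; the body may span lines.
_CONDITIONAL = re.compile(r"/\?([A-Z])(.*?)\?/", re.DOTALL)


def apply_conditionals(source, enabled):
    """Keep the body of each /?c ... ?/ if c is in `enabled`, else drop it."""
    def replace(match):
        letter, body = match.group(1), match.group(2)
        return body if letter in enabled else ""
    return _CONDITIONAL.sub(replace, source)


source = "A = 1; /?P B = 2; ?/ /?B B = 3; ?/ C = 4;"
print(apply_conditionals(source, {"P"}))   # as if run with --cond=P
print(apply_conditionals(source, {"B"}))   # as if run with --cond=B
```

The same source text thus yields two different programs, which is the reason HAL/S-FC has to be compiled separately for PFS and BFS.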

Compiler Directive Type: `/%INCLUDE module %/`

This type of compiler directive inserts an entire XPL/I source-code file, module.xpl, at the current point. It is used for importing COMMON-block declarations or macros which are used identically by all source-code files. By default, the included module is taken from the folder ../HALINCL/. As far as I know this covers every use in HAL/S-FC source code. However, if necessary, XCOM-I has a command-line option (--include=folder) which can be used to change the folder containing the modules.

Compiler Directive Type: `/* ...comment... $%module */`

This is a variant of the /%INCLUDE module ...comment... %/ directive, which acts the same way, and for which my comments are otherwise the same.

Compiler Directive Type: `/**MERGE module procedure */`

This is yet another directive for including an XPL/I source-code file in the current XPL/I source-code file, but it differs from the other include-directives described above in that instead of importing definitions used in common by multiple source-code files, it instead is typically used for importing the source code for a single PROCEDURE. As above, module.xpl is the source-code file to include, while procedure is the name of the PROCEDURE contained in that file. In point of fact, XCOM-I simply ignores the procedure name.

Aside: Procedure names don't match the filenames, usually, because the naming conventions for System/360 files were severely limited vs identifiers in XPL. Thus the filenames were normalized, truncated forms of the procedure names.

Also, module.xpl is expected to be within the same folder as the source-code file being compiled; no other folders are searched for it, and there are no command-line switches to alter this behavior.
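
As a summary of the three file-inclusion directives, here is a Python sketch that works out which file a directive refers to. It is an illustration rather than XCOM-I's code. The function name resolve_directive and the sample module and procedure names are hypothetical, and I'm assuming that ../HALINCL/ is interpreted relative to the folder of the file being compiled.

```python
import posixpath
import re

_MERGE = re.compile(r"/\*\*MERGE\s+(\S+)\s+(\S+)\s*\*/")
_INCLUDE = re.compile(r"/%INCLUDE\s+(\S+).*?%/")
# A comment whose last item is $%module; the scan never crosses a "*/".
_DOLLAR = re.compile(r"/\*(?:(?!\*/).)*?\$%(\S+?)\s*\*/")


def resolve_directive(line, source_dir, include_dir="../HALINCL"):
    """Return the path of the file a directive pulls in, or None."""
    match = _MERGE.search(line)
    if match:
        # MERGE: always the folder of the current file; procedure name ignored.
        return posixpath.join(source_dir, match.group(1) + ".xpl")
    match = _INCLUDE.search(line) or _DOLLAR.search(line)
    if match:
        # INCLUDE and $%: the include folder (changeable via --include).
        path = posixpath.join(source_dir, include_dir, match.group(1) + ".xpl")
        return posixpath.normpath(path)
    return None


print(resolve_directive("/%INCLUDE COMMON %/", "HALS/PASS1"))
print(resolve_directive("/* SHARED DECLARATIONS $%COMMON */", "HALS/PASS1"))
print(resolve_directive("/**MERGE GETSUBSE GET_SUBSET */", "HALS/PASS1"))
```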

PROCEDUREs, RETURNs, and Their Peculiarities

In a view from a height, an XPL program consists of PROCEDURE definitions and of code that uses those definitions.

A procedure definition looks something like this:

label:
PROCEDURE(... parameter list ...) ReturnType;
	DECLARE ... for the parameters ...;
	DECLARE ... for local variables ...;

	... code ...;
END label;

A lot of this is optional. Thus while the initial label: is required (since it's the name of the procedure), the label at the end of the definition is optional, and is really there only for readability purposes. If the procedure needs no parameters, then the parameter list, including its enclosing parentheses, is omitted. If the procedure returns no value, then ReturnType is omitted; if present, it is one of the basic non-subscripted datatypes FIXED, BIT(n), or CHARACTER. Each parameter in the parameter list must have a declaration within the body of the procedure, and while those declarations don't technically have to precede the declarations of the local variables as shown above, it was apparently customary to do so.

PROCEDURE definitions in the source code can be nested, to any desired depth. PROCEDURE definitions and DO...END blocks provide the program with a hierarchical structure of "scopes". The hierarchical structure is provided by the parent/child relationships among the scopes. Variables are accessible within the scope in which they're declared, along with any descendant scopes. If a variable is declared in more than one scope of the hierarchy, the one in the innermost enclosing scope is the one that is applicable.

In standard XPL, variables can be declared only at the global level or in a PROCEDURE, and not within DO...END blocks.

In XPL/I, variable (and PROCEDURE) definitions can be declared anywhere, even within DO...END blocks. XCOM-I mimics the XPL/I usage, since it's compatible with standard XPL. But there's a trap! Or at least, it's a trap for me, and by projection, I assume it's a trap for you as well. The trap is that you'd suppose the scope of a variable or PROCEDURE was the DO...END block in which it was declared. Not so! The scope is the entire enclosing PROCEDURE and its descendants. So beware. Fortunately, such abominations are very rare, and only a handful appear in the entire HAL/S-FC source-code base. Here's an example from the DUMP_SDF PROCEDURE of PASS4 of the compiler, cleaned up a little for readability. Observe the declaration and usage of the variable SYM_DATA_CELL_ADDR within the two distinct DO...END blocks:

            IF NODE_B(3) > 0 THEN DO; 
 	       /* IN CASE OF SYMBOL XREF EXTENSION CELL, SAVE ADDR OF SYM DATA CELL */
               DECLARE SYM_DATA_CELL_ADDR FIXED; 
               SYM_DATA_CELL_ADDR= COREWORD(ADDR(NODE_F)); 
               CALL PRINT_XREF_DATA(SHR(NODE_B(3),1));
            END; 

            IF ASIP_FLAG & (SCLASS=1 | SCLASS=2 & NAME_FLAG) THEN DO;
 	       /* RESTORE NODE_F TO SYM_DATA_CELL_ADDR BEFORE GETTING AUXILIARY INFO. */
               COREWORD(ADDR(NODE_F))  = SYM_DATA_CELL_ADDR;
               IF STYPE=16 & ^NAME_FLAG THEN
                  CALL FORMAT_NAME_TERM_CELLS(I,NODE_F(-2));
               ELSE OUTPUT = X10||'INITIAL(NAME('|| 
                  FORMAT_VAR_REF_CELL(NODE_F(-2)) || '))';
            END;

A PROCEDURE may be invoked in two different ways. If it returns a value via a RETURN statement, it can be used in an arithmetical expression or a string expression. If it doesn't return a value, or if it does return a value and you simply want to ignore the value, a CALL statement can be used to invoke the PROCEDURE but to discard any returned value.

Important: All variables local to a PROCEDURE definition retain their values after the PROCEDURE returns. If the PROCEDURE is re-executed, those local variables retain the values they previously had in the prior invocation of the PROCEDURE. The values of those local variables, though retained, are inaccessible to code outside of the PROCEDURE, because the compiler enforces scopes of variables. (In C code, this would be the same thing as saying that every local variable of every function is automatically declared as static.)

Very important: Any (or even all) parameters at the trailing end of the calling list of a PROCEDURE can be omitted when calling the PROCEDURE, and if omitted, they retain the same values as the last time the PROCEDURE was invoked or the values previously assigned to those parameters from within the PROCEDURE itself! (In essence, this is like saying that parameters of a PROCEDURE are not passed to the PROCEDURE, but rather that they are just aliases for some set of global variables dedicated to the PROCEDURE.) That's so weird that we need to see an example. Consider the following XPL PROCEDURE definition, and CALLs to it:

weirdo: 
procedure(x, y, z);
    declare (x, y, z) fixed;		/* Declare x,y,z as integers */
    output = x || ' ' || y || ' ' || z; /* Print out x,y,z */
    x = 29;
    y = y + 1;
end weirdo;

call weirdo(1, 2, 3);
call weirdo(4, 5);
call weirdo(6);
call weirdo();
call weirdo;

The five calls successively print out the following:

1 2 3
4 5 3
6 6 3
29 7 3
29 8 3

PROCEDUREs cannot be recursive, either directly or indirectly.

Taking these facts altogether, XCOM-I implements both parameters and local variables of PROCEDUREs essentially as global variables in the way they are stored: i.e., each parameter and each local variable of each PROCEDURE has its own static address (in the global memory model), assigned at compile time and unchanging thereafter. The compiler enforces the logical scoping of these variables.
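
For readers more at home in Python, the "parameters are really statics" idea can be mimicked as follows. This is just an analogy I've constructed for the weirdo PROCEDURE shown earlier, not the C code that XCOM-I generates; the module-level names _x, _y, _z and the _UNSET marker are my own devices.

```python
_UNSET = object()        # marks an omitted trailing argument
_x = _y = _z = 0         # static storage dedicated to weirdo's parameters
printed = []             # record of the output, for inspection


def weirdo(x=_UNSET, y=_UNSET, z=_UNSET):
    global _x, _y, _z
    # Only the arguments actually supplied overwrite the stored values.
    if x is not _UNSET:
        _x = x
    if y is not _UNSET:
        _y = y
    if z is not _UNSET:
        _z = z
    line = f"{_x} {_y} {_z}"
    printed.append(line)
    print(line)
    # Assignments inside the procedure persist into the next call.
    _x = 29
    _y = _y + 1


weirdo(1, 2, 3)
weirdo(4, 5)
weirdo(6)
weirdo()
weirdo()
```

Each call sees whatever the static storage held after the previous call, except where a supplied argument replaces it.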

Regarding the RETURN statement, McKeeman explains that it is used to exit from a PROCEDURE and optionally to return a value. Furthermore, the calling code can either use that return value or else ignore it. Which makes perfect sense. But as usual, there are some documented and undocumented peculiarities to the RETURN statement as well:

McKeeman explains (p. 146) that calling code can use returned values from PROCEDUREs even if the PROCEDURE has no RETURN statement, or the RETURN statement specifies no value. In this case, we are told, the return value is simply some unpredictable value from some unspecified System/360 register. Which is rotten, of course, but so what? This is never going to happen, right? Wrong! Actual XPL code does this from time to time. XCOM-I, on the other hand, always returns a well-defined value from a PROCEDURE, whether or not there are any RETURN statements specifying a return value; the returned value in this case is 0 if FIXED, a BIT value of the appropriate width evaluating to 0, or else the empty string for a CHARACTER.
McKeeman (probably!) does not mention that RETURN statements may exist at the global level, outside of the scope of any procedure, and may return a value when they do. But they can. XCOM-I treats these as exits from the program back to the operating system, with the returned value being the program's exit code. It thus expects the return value to be a program status code.

Blocks and Loops

Compound statements in XPL are groupings of simple statements (such as assignments or if-then-else statements) enclosed within a DO ... END block:

DO ...;
    ... simple statements ...
END;

There are five different kinds of DO ... END blocks. First, there is a mere grouping:

DO;
    ... simple statements ...
END;

Then there are 3 different kinds of loops:

DO COUNTER = START TO END [BY STEP];
    ... simple statements ...
END;

DO WHILE CONDITION;
    ... simple statements ...
END;

DO UNTIL CONDITION;
    ... simple statements ...
END;

Note that DO UNTIL is new in XPL/I and is not present in standard XPL.

In these loops, COUNTER, START, END, and the optional STEP are all integers. STEP defaults to 1, but must be positive. START, END, and STEP may be expressions, but if so they are evaluated only a single time, at the start of the loop, and are not reevaluated thereafter. CONDITION, on the other hand, is an expression evaluated on each iteration of the loop; it is treated as "true" if its least-significant bit is 1, or "false" if its least-significant bit is 0. When I say they are "integers", I don't mean that they are necessarily FIXED; they could also be BIT(≤32), which evaluate to integers.
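
Two of the rules just described are easy to get wrong, so here is a small Python sketch of them: the truth test on the least-significant bit, and the single evaluation of the loop bounds. The helper names is_true and do_range are hypothetical, and the sketch doesn't model what happens if the loop body itself modifies the counter.

```python
def is_true(value):
    """An XPL condition is "true" when its least-significant bit is 1."""
    return (value & 1) == 1


def do_range(start, end, step=1):
    """Yield the counter values of DO COUNTER = START TO END BY STEP.

    The arguments are evaluated once, when the loop is entered.
    """
    if step <= 0:
        raise ValueError("STEP must be positive")
    counter = start
    while counter <= end:
        yield counter
        counter += step


limit = 3
seen = []
for i in do_range(1, limit):
    limit = 10          # no effect: END was already evaluated on entry
    seen.append(i)

print(seen, is_true(2), is_true(3))
```

Note in particular that a value like 2, which C or Python would treat as true, counts as false in XPL.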

The fifth kind of DO ... END block is:

DO CASE EXPRESSION;
    STATEMENT0;
    STATEMENT1;
    STATEMENT2;
    ...
END;

The EXPRESSION must also evaluate to an integer. If 0, then STATEMENT0 is executed; if 1, then STATEMENT1 is executed; and so on. At most, a single statement is executed, and there is no "fall through" from one statement to the next. If the EXPRESSION is negative or beyond the number of available statements, McKeeman tells us that "a random jump is executed". In XCOM-I, no statement is executed under those circumstances, and control passes to the next statement after the END.
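
The XCOM-I behavior for DO CASE, including the out-of-range case, can be sketched in Python like so. The function do_case and the list-of-callables representation are my own illustration, not the way XCOM-I actually generates code.

```python
def do_case(expression, statements):
    """Run statements[expression] if it exists; otherwise do nothing.

    Returns True if a statement was executed.
    """
    if 0 <= expression < len(statements):
        statements[expression]()
        return True
    return False        # negative or too large: fall through to after END


log = []
statements = [
    lambda: log.append("STATEMENT0"),
    lambda: log.append("STATEMENT1"),
    lambda: log.append("STATEMENT2"),
]

do_case(1, statements)     # runs STATEMENT1 only
do_case(7, statements)     # beyond the last statement: nothing runs
do_case(-1, statements)    # negative: nothing runs
print(log)
```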

`ESCAPE` and `REPEAT`

The ESCAPE and REPEAT keywords appear to be undocumented XPL/I features not present in standard XPL. Unfortunately, from the available material I can't think of any way to be sure what they do, so I can only speculate.

ESCAPE appears in two different forms:

ESCAPE;
ESCAPE LABEL;

It should be noted that the HAL/S language has the keyword EXIT, which also has these two forms. (See Ryer, p. 5-12.) EXIT has the following behavior in HAL/S:

EXIT; — Exits from the innermost enclosing DO ... END block. I.e., it's essentially a GO TO to just after the closest enclosing END.
EXIT LABEL; — Exits from an enclosing DO ... END that isn't necessarily the innermost one, but rather the one which instead has the specified LABEL attached to it. By "attached to it", I mean that they're directly adjacent, as in "LABEL: DO ...".

Until a more-plausible explanation comes along, my assumption is that ESCAPE in XPL/I has the same behavior as HAL/S's EXIT.

For example, consider the following XPL/I code:

...
MYBLOCK:
DO ...
    ...
    DO ...
        ...
        ESCAPE; /* Escape #1 */
        ...
        ESCAPE MYBLOCK; /* Escape #2 */
        ...
    END;
      /* Escape #1 comes here! */
    ...
END;
/* Escape #2 comes here! */
...
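
Python has no labeled exit, but the assumed behavior of the two forms of ESCAPE can be imitated for the case where both blocks are loops: break stands in for ESCAPE, and an exception caught just outside the labeled block stands in for ESCAPE MYBLOCK. This is only an analogy under my assumption about what ESCAPE does; the Escape class and the trace list are hypothetical.

```python
class Escape(Exception):
    """Stands in for ESCAPE LABEL; carries the label of the target block."""

    def __init__(self, label):
        super().__init__(label)
        self.label = label


trace = []
try:                                          # MYBLOCK: DO ...
    for outer in range(3):
        for inner in range(3):                #     DO ...
            if outer == 0 and inner == 1:
                break                         # ESCAPE;  (Escape #1)
            if outer == 1:
                raise Escape("MYBLOCK")       # ESCAPE MYBLOCK;  (Escape #2)
            trace.append((outer, inner))
        trace.append(("after inner END", outer))   # Escape #1 comes here
except Escape as escape:
    if escape.label != "MYBLOCK":
        raise                                 # not ours: keep unwinding
trace.append("after MYBLOCK END")             # Escape #2 comes here
print(trace)
```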

REPEAT also appears in those same two forms:

REPEAT;
REPEAT LABEL;

And again, there's a REPEAT keyword in HAL/S as well. In this case, though, from examining the actual usage in HAL/S-FC source code, I believe that while the keyword has a similar behavior in HAL/S and XPL/I, there is nevertheless a distinction between them. Here's what I believe the behavior is in XPL/I:

REPEAT; — "Repeats" the smallest enclosing DO ... END. In case the smallest enclosing DO ... END is a loop — DO WHILE or DO UNTIL or DO I = X TO Y — saying that it "repeats" has a pretty clear meaning: XPL/I REPEAT is like a Python or C continue statement. On the other hand, if the immediately-enclosing DO ... END is not a loop, the expected behavior is less clear; nor are there any instances of REPEAT in non-loops in legacy XPL/I code from which we might get a clue. XCOM-I implements REPEAT without a label simply as a jump to the beginning of the block, which means that you could form an infinite loop if there were no other code (like ESCAPE or GO TO) to exit the loop.
REPEAT LABEL; — Breaks out of inner loops as needed, until reaching an enclosing DO ... END loop that has the attached LABEL. That's the block that it repeats.

Aside: The way my XPL/I implementation is different from HAL/S is that in HAL/S, REPEAT (without a label) goes to the beginning of the innermost enclosing loop (DO WHILE or DO UNTIL or DO I = X TO Y) rather than the innermost enclosing DO ... END. Which makes sense, since that's what you'd normally want.

Note: Neither ESCAPE nor REPEAT accepts a label attached to an arbitrary statement. It must be a label attached to a DO ... END block, and it must at some level enclose the ESCAPE or REPEAT statement itself.

Counter Value After Normal Loop Termination

For XPL code such as

DO I = 1 to 100;
    ...
END;

it appears to be undocumented what value I holds after the loop ends. Not all computer languages handle this in the same way. My considered opinion is that the loop counter holds the value at which the condition for continuation fails: in this example, 101. Or, if the loop is broken prematurely (as with ESCAPE from the preceding section), then the value the counter held when the ESCAPE occurred is retained.

Aside: This would match the behavior of C. Whereas in Python, for example, the counter for an equivalent loop would have the value 100 rather than 101 after normal termination of the loop.
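
The difference can be seen by spelling out the XPL loop as an explicit test-and-increment in Python and comparing it with Python's native for loop. The function run_loop and its escape_at parameter are hypothetical names used only for this comparison, and the first part reflects my opinion about XPL's behavior as stated above rather than anything documented.

```python
def run_loop(start, end, escape_at=None):
    """Mimic DO I = START TO END; return I as it stands after the loop."""
    i = start
    while i <= end:          # continuation test
        if i == escape_at:
            break            # like ESCAPE: the counter keeps its value
        i += 1
    return i


print(run_loop(1, 100))                  # normal termination
print(run_loop(1, 100, escape_at=42))    # premature exit

for j in range(1, 101):                  # Python's native equivalent
    pass
print(j)
```

With normal termination the explicit loop leaves the counter at 101, the first value failing the test, whereas Python's for loop leaves it at 100, the last value actually used.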

Program Options

The topic of program options is a confusing one, because there are many different programs under discussion, and all or most of them have their own sets of program options, or sometimes even their own unique ways of specifying those options. There's XCOM-I, whose program options you can discover with the command "XCOM-I.py --help". Then too, when you use XCOM-I to compile an XPL or XPL/I program, XCOM-I endows the executables it produces with a uniform set of options; if (say) you had compiled an XPL program such as MyProgram.xpl, via a command like "XCOM-I.py MyProgram.xpl", you could find out what MyProgram's uniform options were with the command "MyProgram --help".

But that tells us only about options that are the same in all programs of the XCOM-verse. What about options that influence the XPL or XPL/I program itself in program-specific ways? What options specific to MyProgram.xpl are there, and how can those options be selected? What options did (say) HAL/S-FC have originally, and how are those options applied to it now?

Originally, programs run on the IBM 360 would have been associated with Job Control Language (JCL) cards that (among other things) would have set the program options via what's called the "PARM field". For example, in these JCL "EXEC" cards,

// XPL  EXEC  PGM=MONITOR,
//            PARM='SYTSIZE=1800,REFSIZE=2000,LISTING2,$I,$V,$U,$W'
...

the PARM field is 'SYTSIZE=1800,REFSIZE=2000,LISTING2,$I,$V,$U,$W'. Of course, we have no JCL in our XCOM-I framework, but the equivalent would be the --parm command-line option, à la

MyProgram --parm='SYTSIZE=1800,REFSIZE=2000,LISTING2,$I,$V,$U,$W'

Such a PARM field consists of individual options (or parameters), separated by commas (and no spaces surrounding the commas). So in this example, the parameters being selected for MyProgram are

SYTSIZE=1800
REFSIZE=2000
LISTING2
$I
$V
$U
$W
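
Splitting a PARM field this way takes only a couple of lines of Python. The sketch below is illustrative and the name split_parm is hypothetical; it also separates each parameter into a name and an optional value at the first equals sign, which is my own convenience rather than anything the PARM format itself requires.

```python
def split_parm(parm_field):
    """Split a PARM field into its comma-separated parameters."""
    return [param for param in parm_field.split(",") if param]


params = split_parm("SYTSIZE=1800,REFSIZE=2000,LISTING2,$I,$V,$U,$W")
print(params)

# Separate NAME=VALUE into (NAME, VALUE); VALUE is "" when there is no "=".
pairs = [param.partition("=")[::2] for param in params]
print(pairs)
```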

As for how these parameters get from the PARM field to the XPL or XPL/I program itself (i.e., how do you write XPL code to access the parameters?), there is unfortunately no standardized method. In the vision of A Compiler Generator, a vision which Intermetrics shared, there's an assembly-language program called the submonitor, separate from MyProgram, that parses the PARM field to find the individual parameters and decides what to do with them. Some of the parameters may influence the submonitor itself, and never even be made available to the XPL program at all. Others may be used to set the values of variables used by the XPL program. For example, McKeeman (p. 251) suggests some "possible" parameters, and McKeeman's sample submonitor program (XMON.bal) does various things with parameters of those suggested names. Indeed, I think all of McKeeman's suggested parameters affect only the submonitor, which doesn't exist at all in XCOM-I, so none of these suggested options are implemented in XCOM-I. Or at least, not supported in the same way, as you'll see some or all of them reappear in a little bit.

In other words, there's no meaningful support for program parameters for standard XPL in XCOM-I, because standard XPL itself had no standard method for dealing with them.

For XPL/I, on the other hand, Intermetrics had an elaborate set of parameters for HAL/S-FC, as well as a system for making those parameters available to the XPL/I program. XCOM-I fully supports passing these Intermetrics parameters to the XPL/I program. Take a look at the first page of a report from a typical run of HAL/S-FC:

  HAL/S REL32V0           T H E   V I R T U A L   A G C   P R O J E C T            MAY 17, 2024    16:43:5.39     PAGE 1


   HAL/S COMPILER PHASE 1 -- VERSION OF MAY 17, 2024.  CLOCK TIME = 9:15:17.00.


TODAY IS MAY 17, 2024.  CLOCK TIME = 16:43:5.39.
 
 PARM FIELD: TITLE=T H E   V I R T U A L   A G C   P R O J E C T


 COMPLETE LIST OF COMPILE-TIME OPTIONS IN EFFECT


       *** TYPE 1 OPTIONS ***
 
        NOADDRS
        NODECK
        NODUMP
        NOHALMAT
        NOHIGHOPT
        NOLFXI
        NOLIST
        NOLISTING2
        NOLSTALL
        NOPARSE
        NOREGOPT
          SCAL
        NOSDL
        NOSREF
        NOSRN
        NOTABDMP
          TABLES
        NOTABLST
        NOVARSYM
          ZCON


       *** TYPE 2 OPTIONS ***
 
       BLOCKSUM = 400
       CARDTYPE = 
       COMPUNIT = 0
            DSR = 1
      LABELSIZE = 1200
         LINECT = 59
      LITSTRING = 2500
      MACROSIZE = 500
          PAGES = 2500
        SYMBOLS = 200
          TITLE = T H E   V I R T U A L   A G C   P R O J E C T
       XREFSIZE = 2000
 

 *** NO LANGUAGE SUBSET IN EFFECT ***

As you can see, the first thing such a typical run does is to print the PARM field as is, and then to print a list of the individual parameters in effect. Parameters are categorized as either "Type 1", which means that they are simple on/off settings, or else as "Type 2", which means that they accept numerical or string values. For example, the option allowing for production of HALMAT by HAL/S-FC's first pass is of Type 1: It is turned "on" with the parameter HALMAT or turned "off" with the parameter NOHALMAT; its default is "off" (NOHALMAT). (Don't worry about the fact that you don't know what "HALMAT" is, or why one might want to produce it.) Whereas the option that overrides the company name in the page heading is of Type 2: Its default value is an empty string, meaning "don't override" the default company name, but in this case I've overridden it via a PARM field of --parm='TITLE=T H E V I R T U A L A G C P R O J E C T'. Notice that parameters whose values are strings are not enclosed in quotes, which might mean that a string parameter such as this one cannot contain a comma.
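The Type 1 vs Type 2 distinction can be sketched in C roughly as follows. This is only an illustration and not XCOM-I's actual parser: the names classifyParm and parmKind_t are hypothetical, and the sketch assumes that the individual parameters have already been split out of the PARM field.

```c
#include <string.h>

/* Hypothetical classification of a single PARM-field parameter. */
typedef enum { PARM_TYPE1_ON, PARM_TYPE1_OFF, PARM_TYPE2 } parmKind_t;

/*
 * Classify one parameter, e.g. "HALMAT", "NOHALMAT", or "LINECT=59".
 * For a Type 2 parameter, *value is set to point at the text after '=';
 * otherwise it is set to NULL.
 * Assumption: a leading "NO" always means negation, which would misfire
 * for a parameter whose own name happens to begin with "NO".
 */
parmKind_t classifyParm(const char *parm, const char **value)
{
    const char *eq = strchr(parm, '=');
    if (eq != NULL) {
        *value = eq + 1;          /* Type 2: NAME=VALUE */
        return PARM_TYPE2;
    }
    *value = NULL;
    if (strncmp(parm, "NO", 2) == 0 && parm[2] != '\0')
        return PARM_TYPE1_OFF;    /* Type 1, negated form */
    return PARM_TYPE1_ON;         /* Type 1, enabling form */
}
```

A real options processor would of course look each name up in its table rather than trusting the "NO" prefix blindly.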

In addition to Type 1 and Type 2, parameters are also categorized as printable or unprintable. The only distinction is that "printable" ones are printed out on the first page of the output listing, and the "unprintable" ones are not. In other words, there are lots of other parameters whose values aren't shown in the sample printout above. So there are actually 3 separate categories of parameters: Type 1 printable, Type 1 unprintable, and Type 2 printable. (While in principle there are Type 2 unprintable parameters also, there are none presently defined in any existing options processor.)

Aside: There are several exceptions too. The parameter LFXI / NOLFXI is categorized as "non-printable" and yet special code was apparently inserted into HAL/S-FC to make it print anyway; the comments in the code explicitly say so. The same section of code, on the other hand, explicitly prevents printing of the parameters QUASI and TRACE, even though they're categorized as printable.

I should note that not all of the parameters are documented ... though I suppose by now you probably shouldn't be very surprised by that. There's pretty good documentation of quite a few of them in the HAL/S-FC User's Manual (see Section 5.0), and some less-reliable documentation in IR-182 Sections 13.4 and 3.3, which I assume you'll be able to read without my predigesting it for you. (That's a lie, by the way; I just don't understand all of them well enough to regurgitate them, or else I'm sure I would!)

Aside: I will comment on one of the parameters, though, namely the TITLE parameter. As I mentioned above, it's used to change the company name in the page headings of printouts. The default company name is 'T I T A N S Y S T E M S C O R P .' The program comments indicate that it was changed in March of 2002 from "Intermetrics". Why? According to Wikipedia, Intermetrics went through a sequence of mergers, acquisitions, and name changes, one of which was a merger with the Titan Corporation in March of 2000. So I guess that a couple of years after that, somebody was finally instructed to enshrine this change in the HAL/S-FC source code. Naturally, that only lasted a couple of years before Titan was acquired by L-3 Communications. I wonder if the name was ever updated after that? Our copy of HAL/S-FC seems to be from mid-2005, so presumably the final name change wouldn't have had a chance to percolate into the source code yet by that point.

Parameters are delivered to the XPL/I program via the MONITOR(13) function. (See here to read about the precise mechanism.) In brief, there are several "options processors" available to parse the PARM field and provide the parsed contents to the XPL/I program. These options processors are specific to the XPL/I program you're compiling. The existing options processors built into the runtime library go by the names

COMPOPT (default). [Note that this is the options processor appropriate for PASS1 of HAL/S-FC. Subsequent passes of HAL/S-FC prior to PASS4 do not use MONITOR(13), but instead receive command-line parameters from PASS1 via COMMON memory.]
LISTOPT. [Note that this is the options processor appropriate for PASS4 of HAL/S-FC.]
MONOPT

With MONITOR(13), one of those options processors is selected; there's no mixing-and-matching allowed. We'll get to the topic of what to do if none of these meets your needs in a moment.

If the XPL/I program being compiled is HAL/S-FC, these existing options processors can behave differently depending on whether you're compiling the Primary Flight Software (PFS), the Backup Flight Software (BFS), or for the IBM 360 ... by which I mean that different options are supported in these cases. The following table gives a very brief summary of the parameters supported by each existing options processor.

Note: In the Type 1 sections of the table, what's given are the parameters to enable the settings. The parameters to disable the settings are the same, but prefixed by "NO"; for example, LISTING2 vs NOLISTING2. Many of the parameters (both Type 1 and Type 2) also allow you to use an abbreviated nickname; for example, if you're too lazy to type LISTING2, you can instead use the nickname L2. If you want to negate a parameter nickname, you can prefix it with "N" (not "NO"); for example L2 vs NL2. In other words, LISTING2=L2 and NOLISTING2=NL2. Moreover, each parameter has a default value that's used if you don't explicitly specify a new one. In the table below, the parentheses after the parameter names give both the default values and the nicknames (if any).
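To make the naming rules in the note above concrete, here is a minimal C sketch of matching a token against a Type 1 parameter's full name, its nickname, and the negations of both. The structure type1Name_t and the function matchType1 are hypothetical names of my own, not anything taken from the XCOM-I runtime library.

```c
#include <string.h>

/* Hypothetical table entry: a Type 1 parameter and its nickname (or NULL). */
typedef struct {
    const char *name;
    const char *nickname;
} type1Name_t;

/*
 * Returns +1 if token enables the parameter (e.g. LISTING2 or L2),
 * -1 if it disables it (e.g. NOLISTING2 or NL2), and 0 if no match.
 */
int matchType1(const type1Name_t *p, const char *token)
{
    if (strcmp(token, p->name) == 0)
        return 1;
    if (p->nickname != NULL && strcmp(token, p->nickname) == 0)
        return 1;
    /* Full names are negated with the prefix "NO" ... */
    if (strncmp(token, "NO", 2) == 0 && strcmp(token + 2, p->name) == 0)
        return -1;
    /* ... but nicknames are negated with just "N". */
    if (p->nickname != NULL && token[0] == 'N'
            && strcmp(token + 1, p->nickname) == 0)
        return -1;
    return 0;
}
```

With an entry of { "LISTING2", "L2" }, the tokens LISTING2 and L2 enable the setting, NOLISTING2 and NL2 disable it, and NOL2 matches nothing.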

**PARM-Field Options**
COMPOPT PFS	COMPOPT BFS	COMPOPT 360	LISTOPT	MONOPT
Type 1, Printable
DUMP (off, DP)	DUMP (off, DP)	DUMP (off, DP)	TABLST (off, TL)	DUMP (off, DP)
LISTING2 (off, L2)	LISTING2 (off, L2)	LISTING2 (off, L2)	TABDMP (off, TBD)	LISTING2 (off, L2)
LIST (off, L)	LIST (off, L)	LIST (off, L)	ALL (off)	ALTER (off)
TRACE (on, TR)	TRACE (on, TR)	TRACE (on, TR)	BRIEF (off)
VARSYM (off, VS)	VARSYM (off, VS)	DECK (off, D)
DECK (off, D)	DECK (off, D)	TABLES (on, TBL)
TABLES (on, TBL)	TABLES (on, TBL)	TABLST (off, TL)
TABLST (off, TL)	TABLST (off, TL)	ADDRS (off, A)
ADDRS (off, A)	ADDRS (off, A)	SRN (off)
SRN (off)	SRN (off)	SDL (off)
SDL (off)	SDL (off)	TABDMP (off, TBD)
TABDMP (off, TBD)	TABDMP (off, TBD)	ZCON (on, Z)
ZCON (on, Z)	ZCON (on, Z)	FCDATA (off, FD)
HALMAT (off, HM)	HALMAT (off, HM)	SCAL (on, SC)
REGOPT (off, R)	SCAL (on, SC)	MICROCODE (on, MC)
MICROCODE (on, MC)	MICROCODE (on, MC)	SREF (off, SR)
SREF (off, SR)	SREF (off, SR)	QUASI (off, Q)
QUASI (off, Q)	QUASI (off, Q)
TEMPLATE (off, TP)	REGOPT (off, R)
HIGHOPT (off, HO)	TEMPLATE (off, TP)
	HIGHOPT (off, HO)
Type 1, Unprintable
PARSE (off, P)	PARSE (off, P)	PARSE (off, P)	X3 (off)
LSTALL (off, LA)	LSTALL (off, LA)	LSTALL (off, LA)	X4 (off)
LFXI (on)	LFXI (on)	LFXI (on)	X5 (off)
X1 (off)	X1 (off)	X0 (off)	X6 (off)
X4 (off)	X4 (off)	X1 (off)	X7 (off)
X5 (off)	X5 (off)	X2 (off)	X8 (off)
XA (off)	XA (off)	X3 (off)	X9 (off)
X6 (off)	X6 (off)	X4 (off)	XA (off)
XB (off)	XB (off)	XA (off)	XD (off)
XC (off)	XC (off)	X6 (off)	XE (off)
XE (off)	XF (off)	XB (off)	XF (off)
XF (off)		XC (off)	XG (off)
		XE (off)	XH (off)
		XF (off)	XI (off)
			XJ (off)
			XK (off)
			XL (off)
			XM (off)
			XN (off)
			XO (off)
			XP (off)
Type 2, Printable
TITLE ("", T)	TITLE ("", T)	TITLE ("", T)	TITLE ("", T)	LINECT (59, LC)
LINECT (59, LC)	LINECT (59, LC)	LINECT (59, LC)	LINECT (59, LC)	PAGES (250, P)
PAGES (2500, P)	PAGES (2500, P)	PAGES (250, P)	PAGES (10000, P)	MIN (50000)
SYMBOLS (200, SYM)	SYMBOLS (200, SYM)	SYMBOLS (200, SYM)	LIST (1, L)	MAX (5000000)
MACROSIZE (500, MS)	MACROSIZE (500, MS)	MACROSIZE (500, MS)		FREE (14336)
LITSTRINGS (2000, LITS)	LITSTRINGS (2000, LITS)	LITSTRINGS (2000, LITS)
COMPUNIT (0, CU)	COMPUNIT (0, CU)	COMPUNIT (0, CU)
XREFSIZE (2000, XS)	XREFSIZE (2000, XS)	XREFSIZE (2000, XS)
CARDTYPE ("", CT)	CARDTYPE ("", CT)	CARDTYPE ("", CT)
LABELSIZE (1200, LBLS)	LABELSIZE (1200, LBLS)	LABELSIZE (1200, LBLS)
DSR (1)	DSR (1)	DSR (1)
BLOCKSUM (400, BS)	BLOCKSUM (400, BS)	BLOCKSUM (400, BS)
MFID ("")	OLDTPL ("", 0)

If you've been eagerly absorbing every detail I've discussed (and I fail to see any reason you wouldn't have!), then you'll recall that HAL/S-FC has seven passes (PASS1, FLO, OPT, and so on), each of which is a stand-alone program. So you'll be wondering why (or more likely, thanking your lucky stars that) I've listed a bunch of "HAL/S-FC options", without instead listing seven separate tables of PASS1's options, FLO's options, etc. That's because all of these PARM-field options were fed only into PASS1. The other passes accessed the options only via information passed to them by PASS1 in COMMON memory, in some pre-digested form.

In addition to these named parameters, early documentation indicates that control toggles can also appear in the PARM field.

Control toggles are two-character strings which normally would be embedded in program comments. Generally speaking, program comments are ignored by compilers. But when these particular two-character strings are found within a program comment of an XPL program, they are used to enable certain program features from the point they appear until a countermanding control toggle is found. The control toggles I'm aware of that were supported by one or another version of XCOM are listed below. Not all of them are necessarily suitable for command-line usage anyway; for example, the action of "$|" depends on the particular punch-card column in which it appears. The list below has been merged from McKeeman (p. 148), IR-182-1 (section 13.4), and SUNY XCOM 4.5 source code:

$B — Interlist code bytes in hexadecimal.
$D — Print compilation statistics and symbol table at the end of compilation (initially enabled).
$E — Interlist emitted code (assembly format) and data.
$I — Print Impact summary, indicating variables outside the scope of any procedure which were referenced, plus procedures called. (Default = Off.)
$L — List the compiled program (listing is initially enabled).
$M — List program without auxiliary information (speeds compilation by minimizing string storage usage).
$N — Produce a warning message if a procedure is called with fewer actual than formal parameters.
$Q — This toggle seems to have been available for a while and then discontinued. It may have caused compilation to terminate after processing the library file.
$R — Collect cross-reference data for each symbol (based on statement numbers) and print with symbol table. (Default = On.)
$S — Dump symbol table at the end of each procedure, if any local data is declared. (Default = Off.)
$T — Begin tracing execution of XCOM at this point, during compilation.
$U — Terminate tracing of XCOM.
$V — Expand variable cross reference to include names of procedures referencing data and names of procedures calling other procedures. (Default = Off.)
$X — Do not abort compilation when ceiling on count of severe errors has been exceeded.
$Y — Use '|' as margin marker rather than 'I'.
$Z — Allow the compiled program to execute in spite of severe errors.
$| — Set margin. The portion of succeeding cards starting from the column containing the | will be ignored. Note that this can only be used to make the card width shorter than 80 columns.

If control toggles were able to appear in the PARM field (as well as inside of program comments), then in effect they'd act like unnamed Type 1 parameters. Indeed, you can see examples of these in the sample JCL cards I showed you at the top of this section. But I am unclear as to whether control toggles survived into the latest versions of XPL/I or not, and whether they're acceptable in PARM fields. Currently, parameters with a leading '$' are accepted (in the sense that there's no error message or abend) in PARM fields, but ignored. However, in XCOM-I, there is a command-line option that can be made to serve the same purpose as control toggles in the PARM field: The --extra=N,'STRING' option causes the STRING to be returned to the XPL program upon the first invocation of INPUT(N); subsequent invocations of INPUT(N) get data from the file attached to input N. So for example, if you wanted to use the control toggle $E, but didn't want to modify your XPL source code to add it into a comment, you could use the command-line option --extra=0,'/* $E */'.
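The behavior of --extra=N,'STRING' described above (the STRING is delivered by the first INPUT(N), and the attached file supplies everything afterward) can be modeled by a sketch like the following. The type inputDevice_t and the function inputLine are hypothetical illustrations of the idea, not the runtime library's real implementation of INPUT.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical model of one sequential input device N. */
typedef struct {
    const char *extra;  /* STRING from --extra=N,'STRING', or NULL if none */
    FILE *fp;           /* file attached to input N, or NULL if none */
} inputDevice_t;

/* Returns 1 and fills line on success, or 0 when no more input exists. */
int inputLine(inputDevice_t *dev, char *line, size_t size)
{
    if (dev->extra != NULL) {
        /* First invocation: hand back the extra string, exactly once. */
        snprintf(line, size, "%s", dev->extra);
        dev->extra = NULL;
        return 1;
    }
    /* Subsequent invocations: read from the attached file. */
    if (dev->fp == NULL || fgets(line, (int) size, dev->fp) == NULL)
        return 0;
    line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline */
    return 1;
}
```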

Aside: Again, to be clear, XCOM-I itself has no cognizance whatever of control toggles. Control toggles, if available, are provided to and used by legacy versions of XCOM which XCOM-I may have compiled for you.

If you're writing new XPL or XPL/I programs, of course, then these specific options processors aren't going to satisfy your needs. There are still ways you can process options for your own custom purposes. I don't want to go into great detail, since it's unclear who (if anybody) will actually be producing any new XPL programs, so here I'll just give it the once-over-lightly. There are three different approaches of greater or lesser difficulty and (in)convenience.

Aside: Several of the possible approaches involve providing your own C code to supplement the C code output by XCOM-I when it translates your XPL or XPL/I source code. That can be done via CALL INLINE statements in your XPL/I source code, but it's rather cumbersome to do so. A more-elegant way is to instead create an entire C source-code file, and then just place it into the folder of C code created by XCOM-I. Such a file should have an '#include "runtimeC.h"' in it, and proceed from there. If it involves any active code, versus compile-time initialization of variables, you'll still have to have a CALL INLINE in your XPL/I code to actually call the new C function you've defined.

Very important: Any custom file you put into XCOM-I's output folder for any given project will be deleted the next time XCOM-I builds that project. So make sure it's not your only copy! In this sense, using a filesystem link to your custom C file is safer than using the file itself.

The built-in library function PARM_FIELD returns a string containing the value of the --parm command-line switch as-is. You could therefore just write XPL code to parse PARM_FIELD. This will be your easiest course of action.
Or if you would prefer to do things "the Intermetrics way", you can devise an options processor that takes advantage of the machinery already in the XCOM-I runtime library. This lets you feed in your parameters using the --parm command-line switch, using the same "PARM"/"NOPARM" syntax as the other options processors. If you look at the source-code file XCOM-I/runtimeC.c, you'll find several instances of a datatype called optionsProcessor_t, which is a structure whose definition you can examine in the file runtimeC.h. Basically, it's just a structure listing all of the Type 1 and Type 2 parameter names, their abbreviated nicknames, the negations of both types of names, the default value of the parameter, and a "mask". I'll explain what a mask is in a moment. I hope it will be reasonably clear from looking at those examples just how to make your own custom optionsProcessor_t. If you can do that, you'll be ready for XPL/I calls to MONITOR of the form "CALL MONITOR(13, 'USEROPT');". Here, 'USEROPT' is literal; you don't get a choice of what to call it. The C code you need to create to do all of that doesn't require any CALL INLINE statements, and looks something like the following:

#include "runtimeC.h"
optionsProcessor_t myOptionsProcessor = {
	... 
};
USEROPT = &myOptionsProcessor;

Or you could provide C code to directly process argv[] and argc. This may be the hardest approach of all, since as you'll recall, XPL variables are not modeled as C variables. To get any parsed parameters into XPL variables requires numerically knowing the addresses at which the variables reside in the simulated IBM 360 memory (C array memory), as well as knowing the names of the C functions to convert the values into IBM 360 formats. While there's only a handful of such functions (putFIXED, putCHARACTER, ...), I have not documented them as of yet, nor indeed any of the functions used privately by XCOM-I's runtime library.

As far as the "mask" is concerned, among the items returned by the MONITOR(13) function is a 32-bit (FIXED) value referred to as the "options code" or the "option bits". Each Type 1 parameter sets or resets (or can be made to set or reset) one bit position in the options code. That's what the "mask" is: A specification of the bit (or combination of bits) the parameter sets. For example, the LISTING2 parameter is associated with mask 0x00000002, and therefore will bitwise OR the options code with 2; whereas NOLISTING2 would bitwise AND the options code with ~2 (0xFFFFFFFD). Or you can just use a mask of 0x00000000, and your new parameter(s) will leave the options code untouched.
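The mask arithmetic just described amounts to the following minimal sketch. The function name applyOptionMask is hypothetical; it merely restates the OR/AND rule above and is not taken from runtimeC.c.

```c
#include <stdint.h>

/*
 * Apply a Type 1 parameter's mask to the 32-bit options code.
 * enable != 0 corresponds to e.g. LISTING2, enable == 0 to NOLISTING2.
 */
uint32_t applyOptionMask(uint32_t optionsCode, uint32_t mask, int enable)
{
    if (enable)
        return optionsCode | mask;   /* set the bit(s) named by the mask */
    return optionsCode & ~mask;      /* clear the bit(s) named by the mask */
}
```

Note that a mask of 0 leaves the options code unchanged in either direction, which is why it's a safe choice for parameters that shouldn't affect the option bits.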

Built-In Runtime-Library Functions

Standard XPL has a variety of so-called "built-ins", comprising runtime-library functions callable from XPL code. Some of these built-ins can appear on either the right-hand or left-hand side of assignments, and some have to be CALL'd like user-defined PROCEDUREs. XPL/I has roughly the same built-ins, plus-or-minus a few, mostly (but not entirely) defined to have the same functionality. The compiler recognizes these built-ins, and there is no need for them to be declared in any way prior to use. Since these built-ins were mostly written originally in IBM System/360 basic assembly language, the runtime library supplied with XCOM-I has been entirely written in C, without any reference to the original runtime-library source code.

The list below is from McKeeman (p. 140-142), with some alterations due to XPL/I, and some hopefully-helpful notes from me. The parameter descriptions in the list below identify the datatypes of parameters by the following convention:

FIXED — NE_x
CHARACTER descriptor — DE_x
Symbolic name of a variable, with or without a subscript — V

I suppose I should make it clear that the XCOM-I environment is not precisely like that envisaged in the original XPL language as confined to an IBM 360 runtime environment, and as such, built-in functions don't work exactly the same way either. What's described here is how the XCOM-I runtime library's functions corresponding to the original built-in functions work.

Library Function Description From Original Documentation Additional Notes

ABS(NE) This function returns the absolute value of NE. (Note: "80000000", the maximum negative number, has no representable absolute value and returns "7FFFFFFF", the maximum positive number.) (XPL/I only.)
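The special case in the ABS description can be sketched in C as below. The name xplAbs is hypothetical, and this is just my illustration of the documented behavior rather than the runtime library's own code.

```c
#include <stdint.h>

/* Absolute value of a FIXED (32-bit signed) quantity, per the rule above. */
int32_t xplAbs(int32_t ne)
{
    if (ne == INT32_MIN)      /* "80000000": no representable absolute value */
        return INT32_MAX;     /* "7FFFFFFF": the maximum positive number */
    return ne < 0 ? -ne : ne;
}
```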

ADDR(V)

A function with numeric value which is the (at most) 24-bit absolute address of the variable (subscripted or not) V. Mostly used in conjunction with COREWORD or COREBYTE.

If the variable has a subscript, then the returned address is that of a specific array element, rather than of the beginning of the array. For a CHARACTER or long BIT string (i.e., for BIT(33) through BIT(2048) datatypes), the returned address is that of the descriptor for the variable, and if you want to find the actual data, you then must first fetch the value of the descriptor and then massage it further. For XPL/I's BASED variables (see IR-182-1 p. 13-3), using the unsubscripted name of the variable for V will return the address in memory where the pointer to the BASED variable's data is stored; whereas adding the subscript (0) to V returns the address of the variable's data itself.

There are two exceptional cases, due to the fact that unlike the original XPL compilers, program code in XCOM-I exists in a separate memory space outside of the simulated XPL memory in which variables exist. Therefore, attempts to use ADDR(V) to find addresses in code space, such as addresses of XPL built-ins or of program labels, have to be treated very differently than addresses of variables, if sensible results are to be obtained.

Insofar as addresses of runtime-library functions are concerned, the only case I'm aware of is ADDR(DESCRIPTOR). In this case, ADDR returns the address immediately following COMMON memory. The reason for this ad hoc behavior is that the XPL/I memory-management system (SPACELIB) uses the location of the built-in variable DESCRIPTOR (see below) to determine the cutoff between COMMON (in low memory) and non-COMMON (in higher memory). But I realized this fact too late in the development process, after I had made decisions about implementation and memory layout that were incompatible with the interpretation of DESCRIPTOR as a normal variable with the desired properties.

Aside: It isn't terribly difficult to undo the incompatible design decisions I mentioned, and sometime in the future I may do so. Technically, I think it's mainly a matter of swapping memory regions 2 and 3. Again technically, it is necessary for the dope vectors of all BASED variables to reside in the lowest 64K of memory, so it would have the effect of reducing the possible number of non-COMMON BASED variables; fortunately, if existing XPL/I code is any guide, that's a pretty-negligible problem.

Insofar as program labels are concerned, as far as I know, there's no mechanism in XPL itself for using the memory address of a program label. Consequently, the only reason for finding the address of an XPL program label is that assembly language (in a CALL INLINE) wants either to branch to that address, or else to somehow use the data in another CALL INLINE at that address. But this notion is based on various assumptions that are false in XCOM-I. Nevertheless, we still need some uniform method of handling such "addresses" that allows ADDR and CALL INLINE to interact smoothly.

XCOM-I handles addresses of XPL labels as follows: The labels in any PROCEDURE are assigned a predictable sequence of unique negative addresses. Thus the first label encountered has the address -1, the second has address -2, and so on. The addresses are local to the PROCEDURE, and restart at -1 in each PROCEDURE. ADDR returns these negative addresses. C implementations of the branch instructions in CALL INLINE statements can exploit that by having a jump table of a form similar to:

switch (address) {
  case -1: goto LABEL1;
  case -2: goto LABEL2;
  ...
}

Admittedly, this is not a terribly-efficient solution, but C has no practical computed-goto other than this, as far as I'm aware.

BYTE(DE,NE)

A function with numeric value given by the NE^th 8 bits of the string described by DE.

By "string", McKeeman means either CHARACTER data or BIT data. As far as BIT data is concerned, this function works for any data-width; i.e., for BIT(1) through BIT(2048), and not just for "long" BIT strings. However, there's some difference between how CHARACTER and BIT data are treated. BIT data is retrieved from memory as-is, whereas CHARACTER data is transparently translated between EBCDIC encoding (in memory) vs ASCII encoding (for manipulation) by the software. The latter behavior is specific to XCOM-I, and wouldn't have been needed by the original XCOM, since back then, all CHARACTER data was encoded in EBCDIC all the time, whether or not it resided in "memory". Moreover, if you try to do so, it is possible to fool XCOM-I's BYTE function into thinking an area of memory is a different datatype than it really is, thus defeating the translation mechanism ... so please don't try to do that.

Note that BYTE can appear either in an expression (such as on the right-hand side of an assignment), in which case it returns a value as just described, or else on the left-hand side of an assignment, in which case it modifies the value stored in memory. McKeeman recommends not using BYTE on the left-hand side of an assignment, due to the possibility of unintended consequences.

The documentation does not explain what's supposed to happen if NE<0 or NE>=len(DE). Alas, that's not a theoretical question, because such uses of BYTE really appear in legacy XPL code. Given the sloppy subscripting accepted by XPL, my guess is that BYTE just grabs whatever happens to reside wherever NE leads it. Unlike the usual sloppy subscripting in XPL code, this would almost certainly be a mistake by the programmer, because the location of string data in memory is dynamically assigned and liable to change during the course of execution, so the programmer can't really know reliably what lies beyond the bounds of a string. Thus my guess is that when NE is out of bounds for the string size, it's probably a mistake on the part of the program that was never detected because it never produced any error messages.

Aside: For example, it happens in the SCAN procedure of XCOM, in the vicinity of line 835 at this writing,
CP = 1;                                                          
DO WHILE BYTE(TEXT, CP) = BYTE(' ') & CP <= TEXT_LIMIT;          
   CP = CP + 1;                                                  
END;                                                             
where TEXT happens to be a string with LEN(TEXT)=1 containing a single blank space, and TEXT_LIMIT=0. Here, BYTE(TEXT,CP) is out of bounds on the very first iteration of the loop. In C, if the operands of the & operator were reversed, then BYTE(TEXT,CP) would never be executed, and the problem would be avoided. But if that kind of optimization of conditionals is a feature of XPL, it's not a feature that's mentioned anywhere to my knowledge, and XCOM-I does not attempt it.

Whether or not that's a true interpretation, in XCOM-I the BYTE function returns a value of 0 (corresponding to an EBCDIC NUL) if NE is out of bounds, or else silently does nothing at all if on the left-hand side of an assignment.
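As a rough illustration of that out-of-bounds policy, here's a C sketch. The names byteRead and byteWrite are hypothetical, and the sketch deliberately ignores the EBCDIC/ASCII translation discussed earlier; it just treats the string as raw bytes of a known length.

```c
#include <stddef.h>
#include <stdint.h>

/* BYTE in an expression: returns 0 (EBCDIC NUL) if ne is out of bounds. */
uint8_t byteRead(const uint8_t *data, size_t length, int32_t ne)
{
    if (ne < 0 || (size_t) ne >= length)
        return 0;
    return data[ne];
}

/* BYTE on the left-hand side: silently does nothing if out of bounds. */
void byteWrite(uint8_t *data, size_t length, int32_t ne, uint8_t value)
{
    if (ne < 0 || (size_t) ne >= length)
        return;
    data[ne] = value;
}
```

In the SCAN example, where the string is a single blank and CP is 1, a read of this kind yields 0 rather than a blank (0x40 in EBCDIC), so the loop exits immediately.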

BYTE(DE) Same as BYTE(DE,0)

CLOCK_TRAP Not supported in XCOM. And it's not supported in XCOM-I either.

COMPACTIFY

A procedure called automatically to repack the free storage area for string data when it is exhausted. Calls can be triggered by ||, INPUT, number-to-string conversions, or an explicit call:

CALL COMPACTIFY;

The variables FREEBASE, FREELIMIT, FREEPOINT, DESCRIPTOR, and NDESCRIPT are used by COMPACTIFY.

The "free-storage area" is where the data for CHARACTER and long BIT-string variables is stored. As a program which involves string variables executes, the strings tend to change size and move around in the free-storage area, thus causing unused "holes" in memory to develop. COMPACTIFY repacks the area so as to remove the holes.

Aside: While long BIT-strings (>32 bits) are technically similar to CHARACTER strings, they differ in the fact that the amount of storage they require is determined at compile-time and does not change after that. The original McKeeman and Intermetrics compilers therefore stored them outside of the free-storage area (or at least in the very lowest addresses of the free-storage area) in order to save any execution time that would otherwise have been wasted during allocation of memory or garbage collection. XCOM-I is not so picky, since even the most feeble modern computer will be approximately 14.3926 gazillion times faster than an IBM 360 was. XCOM-I thus saves (me) a little effort by grouping the long bit strings with the character strings in memory, at an immeasurably-small cost in execution time.

XPL/I's BASED variables are stored in a separate memory region and do not participate in garbage collection by COMPACTIFY. See the earlier discussion of this.

COREBYTE(NE) A byte array identical to the IBM System/360 memory. The subscript is the absolute byte address of the byte selected. COREBYTE may be used on either side of the replacement operator (i.e., =). The function returns a FIXED value, if used in an expression, but only the least-significant 8 bits contain the value. Similarly, if used on the left-hand side of an assignment, it receives a FIXED value from the right-hand side, but only stores the least-significant 8 bits at the specified memory address. The XCOM-I version of this function performs no EBCDIC translation as BYTE (see above) does, so it does not expect the data in memory to be CHARACTER data. Whether this will turn out to be an issue remains to be seen.

COREWORD(NE)

Like COREBYTE, except the subscript corresponds to the word address in memory. Thus an assignment to COREBYTE(4) can change COREWORD(1).

Since XPL has only a 24-bit address space, only the least-significant 3 bytes of the "4-byte word address" NE are used. According to IR-182-1 (p. 13-3), this function differs in XPL/I as follows:

According to "A COMPILER GENERATOR", NE is a word index, or word-aligned address. However, in the Intermetrics version, NE must be a byte address, and the user must himself guarantee that the lower-most two bits are 0's (full word aligned).

I think that what's being implied by this cryptic comment is that McKeeman's COREWORD in XPL worked like this:

COREWORD(0) consists of COREBYTE(0) through COREBYTE(3).
COREWORD(1) consists of COREBYTE(4) through COREBYTE(7).
COREWORD(2) consists of COREBYTE(8) through COREBYTE(11).
and so on.

whereas Intermetrics's COREWORD in XPL/I worked like this:

COREWORD(0) consists of COREBYTE(0) through COREBYTE(3).
COREWORD(1) is illegal.
COREWORD(2) is illegal.
COREWORD(3) is illegal.
COREWORD(4) consists of COREBYTE(4) through COREBYTE(7).
COREWORD(5) is illegal.
COREWORD(6) is illegal.
COREWORD(7) is illegal.
COREWORD(8) consists of COREBYTE(8) through COREBYTE(11).
COREWORD(9) is illegal.
and so on.

To my way of thinking, McKeeman's description doesn't say what IR-182-1 says it says, and unfortunately, McKeeman's clarification that "an assignment to COREBYTE(4) can change COREWORD(1)", does not in fact clarify it very much either.

XCOM-I conforms to the latter (Intermetrics) usage, with the exception that addresses like 1, 2, 3, 5, 6, 7, 9, ... are perfectly fine: COREWORD(1) consists of COREBYTE(1) through COREBYTE(4), and so on. Which conforms just fine to the description in McKeeman as well, if not necessarily to McKeeman's actual usage.
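Here's a minimal sketch of byte-addressed COREWORD access without any alignment requirement. The function names and the memory parameter are hypothetical stand-ins for the simulated IBM 360 memory, and the sketch relies on the fact that the IBM 360 stores words most-significant byte first. The caller is assumed to guarantee that all four bytes lie inside the array.

```c
#include <stdint.h>

#define ADDRESS_MASK 0x00FFFFFFu   /* only the low 24 bits of NE are used */

/* COREWORD(ne) in an expression: COREBYTE(ne) through COREBYTE(ne+3). */
uint32_t coreWordRead(const uint8_t *memory, uint32_t ne)
{
    uint32_t a = ne & ADDRESS_MASK;
    return ((uint32_t) memory[a] << 24)
         | ((uint32_t) memory[a + 1] << 16)
         | ((uint32_t) memory[a + 2] << 8)
         |  (uint32_t) memory[a + 3];
}

/* COREWORD(ne) on the left-hand side of an assignment. */
void coreWordWrite(uint8_t *memory, uint32_t ne, uint32_t value)
{
    uint32_t a = ne & ADDRESS_MASK;
    memory[a]     = (uint8_t) (value >> 24);
    memory[a + 1] = (uint8_t) (value >> 16);
    memory[a + 2] = (uint8_t) (value >> 8);
    memory[a + 3] = (uint8_t) value;
}
```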

Aside: In point of fact, COREWORD is not used in any software written in standard XPL that's available to me, so I cannot determine empirically which of these choices (if either) is correct. If it is somehow discovered later that McKeeman's XCOM really does need to behave in the manner Intermetrics claimed that it did, then XCOM-I's --xpl command-line switch will be extended to select between the two behaviors. But I don't intend to waste any effort implementing that until/unless I have enough evidence to justify it. For now, it's a moot point.

DATE

A function with the numeric value of the date, coded as

(day of year) + 1000 * (year - 1900)

McKeeman doesn't inform us of anything so mundane as the time zone to which the DATE and TIME built-ins relate, at least not in the text of the book.

I have been told that McKeeman would have used local time here. The submonitor function supplying this value would appear to be called GETIME, and identical versions of GETIME are used by the XCOM3, XCOM45, and HAL/S-FC submonitors, so if McKeeman used local time, then all other legacy compilers available at present did so as well. Exactly how GETIME works isn't clear to me. But if what I understand from looking at the IBM System/360 Operating System MFT Supervisor manual is correct, the IBM 360 seems to have had no conception of UTC at all, or for that matter, of local timezones either; it simply knew whatever time and date the operator manually input into the system at each power-up, and what the operator was supposed to input was the local time and date.

Consequently, the DATE and TIME built-ins return the local date and time, according to whatever locale your computer is set for.
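As a rough illustration of the two encodings, here is a Python sketch. The function names are hypothetical; the DATE formula is the one given above, and the TIME encoding (centiseconds since midnight) is taken from the TIME entry later in this list. This is not XCOM-I's actual runtime code.

```python
import datetime

def xpl_date(now):
    """(day of year) + 1000 * (year - 1900), per the DATE entry."""
    return now.timetuple().tm_yday + 1000 * (now.year - 1900)

def xpl_time(now):
    """Centiseconds since midnight, per the TIME entry."""
    seconds = (now.hour * 60 + now.minute) * 60 + now.second
    return seconds * 100 + now.microsecond // 10000

# A fixed example moment; a real program would use datetime.datetime.now(),
# which reports local time according to the computer's locale settings.
example = datetime.datetime(2024, 2, 1, 12, 0, 0)
print(xpl_date(example), xpl_time(example))
```

For the example moment, February 1 is day 32 of the year, so the DATE value is 32 + 1000 * 124.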

DATE_OF_GENERATION A word variable initialized with the value of DATE during compilation of the program being run. See DATE above.

DESCRIPTOR(NE) The description of the NE^th string as a numeric value. This cryptic description is referring to the fact that in the way memory is allocated for the variables declared by XCOM, the 32-bit string descriptors for all CHARACTER and BIT(n) (n>32) variables appear consecutively in memory. By symbolically labeling the very first of those descriptors as DESCRIPTOR, and treating DESCRIPTOR as of the FIXED datatype, DESCRIPTOR becomes an easily-accessible array of all the string descriptors. This is useful, for example, for memory-management procedures like COMPACTIFY. The number of elements of the array is given by the built-in NDESCRIPT (see below).

EXIT

A procedure

CALL EXIT;

which causes an abnormal exit from XPL execution.

In fact, XCOM-I models this as a C-language exit(1).

FILE(NE₁,NE₂)

An array-valued pseudovariable for manipulation of random-access bulk storage. Examples of its use are

DECLARE BUFF(3600) BIT(8);
BUFF = FILE(I, J);
FILE(I-1, J) = BUFF;

One record is transferred into or out of the buffer array by the assignments shown above. FILE cannot appear on both sides of the same assignment.

I've actually truncated McKeeman's description of FILE, because it is so long and so much of it is specific to IBM System/360, while being quite irrelevant to us. The key facts are these:

"Random access files" may be attached to the running program.
Once attached, they are identified by a "device number" that ranges from 1 to 9.
Each attached random-access file may be for input-only, output-only, or both.
Each of the attached files consists of records of a fixed size.
Different files may have different record sizes.
These random-access files, and the associated FILE operations, are completely distinct from the "sequential files" (numbered 0 through 9) accessed via the INPUT and OUTPUT functions (see below).
Record sizes typically do not change during program execution. (But see the continued comments below.)

By default, no random-access files are attached to programs. Originally (back in 1970), they were attached at runtime via Job Control Language (JCL) DD cards. But with programs compiled via XCOM-I, files are attached at program startup by using the program's command-line switches. The relevant switch is --raf, and multiple instances can be used on the same command line.

--raf=I,R,N,F

The parameters of the --raf switch are:

I is either a literal I (for "input"), a literal O (for "output"), or a literal B (for "both input and output"). Note that for O, an empty file will always be created by the program; for B, if there's an existing file of the specified name, then it will be used, but an empty file will be created otherwise.
R is the record size associated with the file. 3600 and 7200 are typical values, but not the only ones in use. They must match the expectations of the program's XPL code for using the data.
N is the device number, from 1 through 9, to which the file should be attached.
F is the relative or absolute path to the file.
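To make the parameter layout concrete, here is a hedged Python sketch that takes a --raf switch apart. It is a hypothetical parser written for illustration only, not the code XCOM-I-compiled programs actually use, and the file name in the example is made up.

```python
from typing import NamedTuple

class RafSpec(NamedTuple):
    mode: str          # "I", "O", or "B"
    record_size: int   # R, in bytes
    device: int        # N, 1 through 9
    path: str          # F

def parse_raf(option):
    """Split a switch of the form --raf=I,R,N,F into its four parameters."""
    prefix = "--raf="
    if not option.startswith(prefix):
        raise ValueError("not a --raf switch: " + option)
    # Split at most three times, so that a comma inside the path survives
    # (an assumption; the text above does not say how such paths are handled).
    fields = option[len(prefix):].split(",", 3)
    if len(fields) != 4:
        raise ValueError("expected I,R,N,F: " + option)
    mode, record_size, device, path = fields
    if mode not in ("I", "O", "B"):
        raise ValueError("mode must be I, O, or B: " + mode)
    device_number = int(device)
    if not 1 <= device_number <= 9:
        raise ValueError("device number must be 1 through 9: " + device)
    return RafSpec(mode, int(record_size), device_number, path)

print(parse_raf("--raf=B,7200,2,scratch.bin"))
```

Since multiple --raf switches may appear on one command line, a caller would simply apply such a function to each of them in turn.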

In spite of what I said above, the XPL/I documentation (IR-182-1, p. 13-5) does describe one way to change the record size of a random-access file once the program has begun operation, and that is via a call to MONITOR(4,R,N) (see below), where R and N have the same meanings as for --raf. It also tells us that such a call must precede the first use of FILE(N). XCOM-I relaxes this restriction, though it seems that the original restriction should be treated as very good advice to follow. Note, by the way, that the call to MONITOR(4,R,N) has no way to distinguish between input files and output files, so its record-size manipulation is applied simultaneously to input file N and output file N, if they differ.

FREEBASE A word variable containing the absolute address of the top of constant strings and thus the bottom of the repackable area. In XCOM-I, this is implemented as a function rather than a variable, and is not directly modifiable by user code.

What the description from the original documentation is trying to say is that most kinds of variables — such as FIXED or BIT(1) types, or descriptors for CHARACTER or long BIT-string types, or arrays thereof, or pointers to BASED variables — are stored in memory at low addresses determined at compile time, and remain unchanged in address thereafter. In contrast, data for CHARACTER variables can grow or shrink or move at runtime. FREEBASE is the boundary between those regions of memory.

In XCOM-I, data allocated at runtime via the MONITOR(6,...) or MONITOR(19,...) function (see below), sometimes for storing data for BASED variables, also appears above the FREEBASE boundary.

FREELIMIT A word variable containing the absolute address of the last usable byte in the string-data area. I believe that the description lies, and that FREELIMIT is the first unusable byte beyond the string-data area, rather than the last usable byte in it.

In XPL, all memory beyond FREELIMIT is unused by the program. In XPL/I, the data for BASED variables is stored at FREELIMIT and above. However, the memory-management technique is to push FREELIMIT downward in order to make room for BASED variables. So for either XPL or XPL/I, the initial value of FREELIMIT is the highest memory location used by the program. XCOM-I sets the initial value of FREELIMIT at "FFFC00" (i.e., 1024 bytes below the top of physical memory), in order to leave a small amount of space for COMPACTIFY to use as scratch memory.

FREEPOINT A word variable containing the absolute address of the next free byte in the string-data area. When FREEPOINT passes FREELIMIT, COMPACTIFY must be called. In other words, at any given time, only the memory region between FREEPOINT and FREELIMIT is available for dynamically-allocating new free memory for BASED variables or CHARACTER variables. As the program continues to execute, dynamic memory allocation and subsequent memory-freeing may cause FREEPOINT to steadily creep upward, and the area from FREEBASE to FREEPOINT to become increasingly full of unusable holes. Eventually, COMPACTIFY will be automatically called, repacking the data to eliminate the holes, and moving FREEPOINT downward in memory once again.

INLINE(NE₁,...)

Inserts arbitrary IBM System/360 machine code directly into the instruction stream.

The applicability of the original functionality has changed drastically in the 55 years (as of this writing) since A Compiler Generator was written. XCOM-I provides three different mechanisms for supporting direct insertion of inline non-XPL code, none of which are identical to those in McKeeman. Thankfully, no available legacy standard XPL program I've encountered contains any CALL INLINE statements, so the difficulties presented by them are limited to XPL/I code such as HAL/S-FC.

Support mechanism 1: insertion of C code

You can use

CALL INLINE('C-language source code');

to insert an arbitrary string of C-language source code directly into the instruction stream. If the string contains newline characters ('\n'), then a single INLINE can insert multiple lines of C code. But recall that an XPL string can be no longer than 256 characters, so the inserted C code can be no longer than 256 characters in any one CALL INLINE statement.

Support mechanism 2: insertion of IBM 360 machine code

XCOM-I will automatically translate IBM 360 machine code to C. Full-blown IBM 360 emulation is not provided, but support is available for those IBM 360 instructions used in XPL code I have personally encountered, which includes the entire source-code base of HAL/S-FC. Instructions that are not (yet) supported cause compilation to abort.

This feature is not normally enabled. You must use the XCOM-I command-line switch --auto-inline to enable it.

Support mechanism 3: other

If --auto-inline is not enabled (see support mechanism 2 above), any CALL INLINE statements not caught by support mechanism 1 are commented out by default, but "patched" with C code if a "patch file" is available. This important but far from automatic mechanism is the topic of a later section.

INPUT(NE)

A function with descriptor value specifying the next record on input file NE.

I've truncated McKeeman's rather roundabout description, because I don't find it very informative as-is to more-modern readers.

What McKeeman is trying to say is that any given program can have several files attached to it for so-called sequential input. You can think of these files as consisting of lines of text, and you can read them, one text-line at a time, by using the INPUT built-in function. For the original McKeeman XCOM (or the Intermetrics version of XCOM), the text in the sequential files would have been encoded in EBCDIC, but for use with XCOM-I they are encoded in ASCII.

Because XPL comes from the era and computing environment it did, it expects input to be supplied on computer punch cards. As such, it expects lines of input to actually be 80 columns wide, even if they need to be padded by spaces to do so. Not all XPL programs have a problem with lines of a different length, but some do. Notably, the original XCOM is one of the programs that behaves incorrectly unless it gets its way. Because of that, XCOM-I's built-in INPUT function transparently truncates lines to 80 columns or pads them to 80 columns as necessary.
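The truncate-or-pad behavior is easy to picture with a short Python sketch. The function name `card_image` is hypothetical, and this is only a model of the behavior described above, not XCOM-I's implementation.

```python
def card_image(line, width=80):
    """Return the line as a fixed-width 'punch card' of the given width.

    Any trailing line terminator is removed first; longer lines are
    truncated, and shorter ones are padded on the right with spaces.
    """
    return line.rstrip("\r\n")[:width].ljust(width)

print(repr(card_image("DECLARE X FIXED;\n")))
```

Every string returned by this function is exactly 80 characters long, which is what a card-oriented program such as the original XCOM expects.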

There are up to 9 input files attached to the program, numbered 0 through 9 (yes, I know that's 10 files, but just keep reading), and you access them via those "device numbers". For example,

DECLARE C CHARACTER;
C = INPUT(5);

reads a single line from sequential file number 5 into the variable C.

Important: These "sequential files" and the INPUT mechanism are completely separate from the "random-access files" described in FILE's notes earlier!

By default, there is a single sequential file attached to the program, though it is attached to both device number 0 and device number 1. The default attached file is stdin, and thus any text piped into the program via a redirector (<) on the program's command line will be available on both INPUT(0) and INPUT(1).

Aside: In case you're wondering, I don't think that the reason two separate devices are associated with stdin is very deep or interesting. I imagine it's just kind of symmetric to the fact that OUTPUT(0) and OUTPUT(1) (see below) are both attached by default to stdout. In situations like this I'm tempted to quote Emerson — "a foolish consistency is the hobgoblin of little minds" — but I suspect the decision was pragmatic rather than foolish.

There's no explanation or obvious provision in the documentation for what happens when an end-of-file is reached, and XCOM-I makes no particular provision for it either. I suppose it must depend on the particular program. For example, some XPL programs assume that a blank line terminates a file. Other programs may look for special patterns, such as the string "EOF". XCOM-I returns an empty string for reads past the end of the file.

Additional input files can be attached via the program's command-line switch --ddi:

--ddi=N,F

Here, F is the pathname to the file, and N is the device number to which it should be attached. By the way, if you want to debug your program once it has been translated by XCOM-I to C and then the C has been compiled to an executable, I've sometimes found piping input into the program via a command-line redirector (<) to be somewhat problematic. It's handy in those situations to override the default attachment of stdin to device 0 by using --ddi=0,F instead of piping.

There is no way to change the attachments once the program is running, although there are MONITOR calls (see below) which can close attached files at runtime.

In XPL/I, there's an additional alternative, in which Partitioned Data Set (PDS) files can be attached for use by INPUT instead of sequential files. A PDS is partitioned into sections, each with its own 8-character identifying name (right-padded with blanks if necessary). An attempted read past the end of a partition returns an empty string.

Only one partition of the PDS is available via INPUT at any given time, but

F = MONITOR(2, N, NAME);

(see below) can be used at any time to switch device N to the partition whose name is contained in the string called NAME. The value returned, F, is 0 if the switch was successful, or 1 if there was no such partition.

A PDS is thus similar to a folder of sequential files, with the partition names corresponding to filenames within the folder.

By default, there is no PDS attached to programs. However, a PDS can be attached for input via the program's command-line switch --pdsi:

--pdsi=N,F

Because of PDS similarity to folders, that's how they're implemented in XCOM-I. So F should be the pathname of a folder, and the partitions should be ASCII text files within the folder. N, of course, is the device number on which the attachment is to be made. However, no specific partition is selected for reading until the necessary MONITOR(2, N, NAME) call has been made.
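The folder-based model of a PDS can be sketched in Python as follows. The class and method names are hypothetical, and stripping the blank padding from the partition name to obtain a file name is an assumption; the return codes and the empty string past end-of-partition follow the description above. This is a model, not XCOM-I's runtime code.

```python
import os

class PdsInput:
    """A PDS attached for input, modeled as a folder of ASCII text files."""

    def __init__(self, folder):
        self.folder = folder
        self.lines = []      # no partition is selected until select() is called
        self.position = 0

    def select(self, name):
        """Mimic F = MONITOR(2, N, NAME): 0 on success, 1 if no such partition."""
        # Assumption: the 8-character name's blank padding is not part of
        # the file name on disk.
        path = os.path.join(self.folder, name.rstrip())
        if not os.path.isfile(path):
            return 1
        with open(path, "r", encoding="ascii") as partition:
            self.lines = partition.read().splitlines()
        self.position = 0
        return 0

    def read(self):
        """Mimic INPUT(N): the next line, or '' past the end of the partition."""
        if self.position >= len(self.lines):
            return ""
        line = self.lines[self.position]
        self.position += 1
        return line
```

A program would call `select` once per partition it wants to read, then call `read` repeatedly until it sees whatever end-of-data convention that program uses.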

INPUT Same as INPUT(0)

INTERRUPT_TRAP Not supported in XCOM. Not supported in XCOM-I.

LENGTH(DE) A function with the numeric value equal to the number of characters in the string denoted by the parameter.

LINE_COUNT This function returns the number of lines which have been printed on the SYSPRINT file since the last page eject. (XPL/I only.) "SYSPRINT" refers to OUTPUT(0) and OUTPUT(1) (see below).

LINK This procedure performs the functions necessary to exit the current program phase and pass control to the next phase on the PROGRAM DD sequence, preserving COMMON data and any other dynamically allocated space which has not been deallocated. (XPL/I only.) This refers to the notion that a sequence of XPL/I programs is being run via Job Control Language (JCL), with each program passing data to the next program in sequence. However, this LINK built-in is specific to the original computing environment, and performs no function in XCOM-I, which has a different mechanism for sharing data between program passes. Refer to the extensive discussion concerning COMMON memory.

MONITOR(NE₁,...) Calls the "submonitor". The "submonitor" is a separate program from whatever XPL or XPL/I program is being compiled and run, providing some kind of system-specific functionality. In so far as McKeeman is concerned, the submonitor's functionality and even its calling sequence was unknowable, and thus the McKeeman XCOM system simply discarded all calls to the submonitor.

Not so with XPL/I: There is an extensive set of functions provided by the submonitor, and the XPL/I program HAL/S-FC uses them extensively as well, so XCOM-I needs to support them. Most (I hope!) but not all submonitor functions are known from their descriptions in section 13.3 of IR-182-1. Besides which, HAL/S-FC source code contains the source code for MONITOR (written in IBM 360 Basic Assembly Language), from which additional information can be obtained.

Because of the complexity of MONITOR, an entire separate section is devoted to explaining it.

MONITOR_LINK A fixed array which can be used for transmission of information between a program and the submonitor. (See Table 8.4.6 in McKeeman.) I'm not presently aware of any need to implement this in XCOM-I. However, there are MONITOR functions (see above) whose operations remain a mystery, so there's a possibility that that could change.

NDESCRIPT(NE) A word variable containing the upper bound in the array DESCRIPTOR (see above) of the descriptions. See DESCRIPTOR above.

OUTPUT(NE)

This is the analog of INPUT (see above) but for outputting textual data rather than inputting it.

Most of the comments concerning INPUT (see above) are directly applicable to OUTPUT, or else are analogous in a very obvious way, so I'll try to confine my remarks to the clear differences between the two.

The most common use of OUTPUT is to "print" to the "SYSPRINT" device. Unlike the original XCOM computing environment, in which SYSPRINT was an actual printer, in XCOM-I SYSPRINT by default refers to stdout. And by default, stdout is attached to both OUTPUT(0) and OUTPUT(1). But there are differences between the two! Use of OUTPUT(0) is simpler, and therefore more common, but OUTPUT(1) is more flexible. Let's start with OUTPUT(0).

Output statements via OUTPUT(0) might look something like this:

OUTPUT(0) = 'Hello, world!';
OUTPUT(0) = 'Hello' || ',' || ' ' || 'world' || '!';
OUTPUT(0) = 'This is OUTPUT statement number ' || 3;
OUTPUT(0) = 25;

Thus we can print any character strings or numbers we like, and can concatenate them using the string-concatenation operator (||), with XCOM-I doing the work of automatically converting numbers to strings where necessary.

Each use of OUTPUT(0) results in another physical line being printed. When printing messages with OUTPUT, since it is expected that the lines are being output to a physical printer, the system keeps track of the number of lines being printed, and automatically inserts page breaks in the output once a page has been filled up. At the tops of pages, a page number and optional headings and subheadings are also printed. The headings and subheadings, though, are simply whatever has been previously set up, and you can't change them or otherwise influence them using OUTPUT(0).

With OUTPUT(1), on the other hand, there's quite a bit of additional functionality. For example, it isn't necessarily true that each use of OUTPUT(1) prints another line. The first character of each OUTPUT(1) isn't directly printed, but instead is a "carriage-control character" (or "ANSI control character"), as follows:

' ' (i.e., a space) — Single-space the line and print. This is the same behavior as OUTPUT(0): i.e., any line you print to OUTPUT(0), you could instead prefix it with a space character and send it to OUTPUT(1) instead.
'0' — Double-space the line and print.
'-' — Triple-space the line and print.
'+' — Overstrike the preceding line. In other words, go back to column 1, without advancing to the next line, and then print the remainder of the string. Unfortunately, that's behavior suitable for a physical printer, but not really for outputting text to stdout on most computers today. Yes, we may be able to emulate it using strange maneuvers which most computers aren't set up for, but not in any universal, easily-set-up way. So in XCOM-I, the string is instead printed on a new line, almost as if the ANSI control character had been ' ' (see above) rather than '+'. Almost! You see, what these overstruck lines are mostly used for is underlining text. In HAL/S-FC, for example, text which has undergone replacement via macro expansion is underlined in compiler reports. The temptation is to mimic underlining by using underscore characters (_), but when underscore characters are printed on a separate line from the text they're underlining, they don't really look like underlining, but rather like overlining on the succeeding line ... which while it may be better than nothing, is absolutely not what you want it to look like! XCOM-I thus takes a slightly-different approach: It replaces such underscores by carets (^), which in my opinion are more-effective visually. For example, where one would have gotten something like

... MYTEXT ...

on a line printer back in the olden times, today with XCOM-I you'll instead see the uglier but still effective

... MYTEXT ...
    ^^^^^^

'1' — Form feed. I.e., advance to the top of the next page, regardless of how many left-over lines remain on the current page. In XCOM-I, this is accomplished by embedding the ASCII form-feed character ('\f') in the output.
'H' — Heading line. This doesn't print anything, but takes the remainder of the line and sets it as the heading for subsequent pages.
'2' — Subheading line. This doesn't print anything, but takes the remainder of the line and sets it as the subheading for subsequent pages.

Note: What this explanation of OUTPUT(1) fails to explain is what happens if the string doesn't have a first character; i.e., what if it's empty? Well, the obvious response would be that that would be an illegal use. Unfortunately for us, that obvious fact didn't stop some of the coders of HAL/S-FC from strewing their code — perhaps an exaggeration — with assignments OUTPUT(1) = '', thus forcing us to deal with it. Lacking any particular rationale for what such a thing might mean, XCOM-I treats it as OUTPUT(0) = ''; i.e., it prints a newline.

If you experiment with OUTPUT(1), and even to a certain extent with OUTPUT(0), you may become very confused, because various things won't work as you expect. Or at least, I was very confused at first. To get past this confusion, you need to grasp the following

Very important point: For the computer systems most people are familiar with today, when we print a line of text, we expect that line of text to end with an implicit advance to the next line; i.e., with a "newline" character, often signified in programming languages by '\n'. Whereas in System/360 (and therefore in XPL or XPL/I), the implicit newline occurs prior to the text being printed by the OUTPUT command. I.e., today we typically ask ourselves "what happens at the end of the line?", whereas for XPL you need to ask "what happens at the beginning of the line?".
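The carriage-control rules and the newline-before-text convention can be modeled with a short Python sketch. The function name is hypothetical, pagination and the printing of headings at page tops are deliberately left out, and printing the remainder of a '1' record right after the form feed is an assumption. It illustrates the rules listed above and is not XCOM-I's actual code.

```python
def render_output1(records):
    """Turn a list of OUTPUT(1) strings into (text, heading, subheading)."""
    # The line advance happens BEFORE the text, not after it.
    line_advance = {" ": "\n", "0": "\n\n", "-": "\n\n\n"}
    pieces = []
    heading = ""
    subheading = ""
    for record in records:
        if record == "":
            pieces.append("\n")            # treated like OUTPUT(0) = ''
            continue
        control, rest = record[0], record[1:]
        if control in line_advance:
            pieces.append(line_advance[control] + rest)
        elif control == "+":
            # No true overstrike: new line, with underscores shown as carets.
            pieces.append("\n" + rest.replace("_", "^"))
        elif control == "1":
            pieces.append("\f" + rest)     # assumption: text follows the form feed
        elif control == "H":
            heading = rest                 # nothing is printed
        elif control == "2":
            subheading = rest              # nothing is printed
        else:
            # Behavior for other control characters is not described above.
            raise ValueError("unknown carriage-control character: " + control)
    return "".join(pieces), heading, subheading

text, heading, subheading = render_output1([
    "HMY HEADING",
    " ... MYTEXT ...",
    "+    ______",
])
print(repr(text))
```

Note that the rendered text begins with a newline and does not end with one, which is exactly the reversal of expectations described in the "very important point".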

Similarly to INPUT, you can attach additional sequential output files (F) on device numbers (N),

--ddo=N,F

Or for Partitioned Data Sets,

--pdso=N,F

For selecting partitions of a PDS, you don't use the same MONITOR call as for INPUT, but instead use:

F = MONITOR(1, N, NAME);

OUTPUT commands targeting a PDS don't immediately write data to the physical PDS. Rather, the data being output is buffered in memory until the MONITOR(1, N, NAME) call occurs, and the data is then written out to the selected partition in its entirety. The return value (F) is 0 if the partition is new (i.e., if it didn't previously exist in the PDS), while it is 1 if the partition previously existed but has now been overwritten with entirely new contents.

OUTPUT The same as OUTPUT(0).

PARM_FIELD

This function returns a character string which contains the entire parameter specification coded on the PARM= option on the EXEC card. If no PARM is specified, a null string will be returned.

(XPL/I only.) A program of any complexity generally has a number of options selectable at runtime. For XPL/I programs like HAL/S-FC, the mechanism for selecting such options was originally a Job Control Language (JCL) card such as:

PARM='SYTSIZE=1800,REFSIZE=20000,LISTING2,$I,$V,$U,$W'

Thus, PARM_FIELD would have had the value 'SYTSIZE=1800,REFSIZE=20000,LISTING2,$I,$V,$U,$W' in such a case.

In XCOM-I, PARM_FIELD is implemented as a function rather than a variable, but nevertheless returns data of the kind described. Of course, there is no JCL supplying such parameters, but the compiled program nevertheless has a command-line option that does the same job:

--parm='SYTSIZE=1800,REFSIZE=20000,LISTING2,$I,$V,$U,$W'

Aside: Depending on your operating system, some trickery may be involved in correctly forming such a command-line option. In Linux or (I suppose) Mac OS, the command shell expects the dollar sign ($) to indicate that the value of an environment variable is desired. In other words, if the string in the command-line option shown above had no quotes, then (for example) $I would be replaced by the value of the environment-variable I ... probably a blank! And the same thing would happen if the option were enclosed in double-quotes. But by using single-quotes, we defeat that substitution, and $I is reported in PARM_FIELD literally as $I. Alternate tricks to defeating substitution could include using the backslash ('\') escape character in front of all dollar signs ($).
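If the command line is being assembled by a script rather than typed by hand, the quoting can be delegated to a library. The following Python sketch uses the standard `shlex` module to produce a POSIX-shell-safe form of the option; it is only an illustration of the quoting issue, and the parameter string is the example from above.

```python
import shlex

parm = "SYTSIZE=1800,REFSIZE=20000,LISTING2,$I,$V,$U,$W"

# shlex.quote wraps the argument in single quotes when it contains
# characters (such as $) that a POSIX shell would treat specially.
argument = shlex.quote("--parm=" + parm)
print(argument)

# shlex.split parses the quoted form back into the literal argument.
# (It performs no variable expansion itself, so this only checks quoting.)
print(shlex.split(argument))
```

Quoting the whole option this way is equivalent, as far as the shell is concerned, to single-quoting just the part after the equals sign.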

Aside: It has been decades since I used Windows in any serious way, so I don't really know whether there's any similar problem with it. Probably not.

RECORD_WIDTH(V) (XPL/I only.) IR-182-1 doesn't mention it, but it appears to me that RECORD_WIDTH is a new built-in in XPL/I which when given the symbolic name of a BASED variable returns the variable's record size (in bytes). In HAL/S-FC's XPL/I source code (the only available example of use of RECORD_WIDTH), I find it used only for a couple of BASED RECORD variables, which makes sense according to the naming, but I have no way to tell just from those few examples what the actual applicability was. In XCOM-I, it will work for any BASED variable, RECORD or not.

SET_LINELIM(NE) This procedure establishes the number of lines which will be printed on the SYSPRINT file before an automatic page eject and header line will be printed. (XPL/I only.) See OUTPUT(NE).

SHL(NE₁,NE₂) A function with the numeric value given by shifting the value of NE₁ left (logical shift, zeroes appear in the least significant bit position) the number of positions indicated by the value of NE₂. One point I found confusing for quite a while is that you'd suppose that logical shifts would operate on BIT variables, and particularly on long-BIT variables, for which XPL provides no conveniences at all for accessing individual bits. Not so! SHL operates only on FIXED values, and can only shift by up to 32 positions. BIT variables are converted to FIXED if shift operations on them are needed.

SHR(NE₁,NE₂) Logical shift right. SHL and SHR are used in conjunction with &, |, and ¬ for masking and data packing. But recall that it is preferable to use ~ rather than ¬ with XCOM-I.
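Here is a hedged Python model of the two shifts and of their use in packing. Python integers are unbounded, so the sketch masks to 32 bits to imitate a FIXED value's bit pattern; the function names are hypothetical, and the result for a shift of exactly 32 positions (zero) is an assumption of this model.

```python
MASK32 = 0xFFFFFFFF

def shl(value, count):
    """Logical shift left of a 32-bit pattern; zeroes enter from the right."""
    return (value << count) & MASK32

def shr(value, count):
    """Logical shift right of a 32-bit pattern; zeroes enter from the left."""
    return (value & MASK32) >> count

# Packing two 16-bit fields into one word, then unpacking them again
# with shifts and the & masking operator.
packed = shl(0x1234, 16) | 0xABCD
high = shr(packed, 16) & 0xFFFF
low = packed & 0xFFFF
print(hex(packed), hex(high), hex(low))
```

Because the right shift is logical rather than arithmetic, a negative FIXED value does not drag copies of its sign bit in from the left.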

STRING(IDENTIFIER)

STRING(INTEGER)



STRINGD(INTEGER)

This function transforms the variable NE (which should be FIXED for proper usage) into a CHARACTER descriptor. NE should have the form:

Length - 1	Data Address
8 bits	24 bits

(XPL/I only.) I admit that the description above is confusing. The whole concept is confusing. Here's my take on it: I believe that STRING is used in one of two situations:

You have an IDENTIFIER for a CHARACTER variable, and you want the string descriptor; or else
By some kind of arithmetical process, a string descriptor (an integer) is available, and you want to make sure the compiler knows it's a string descriptor rather than some arbitrary integer.

It is best to think of STRING as a hint to the immediate consumer of the data that it represents a string rather than an integer. Whatever hint is being supplied does not persist beyond the very next operation. XCOM-I implements STRING using these conversions at compile time, and there is no runtime library function as such that's called STRING.
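The descriptor layout given above (length minus 1 in the top 8 bits, data address in the low 24 bits) can be illustrated with a small Python sketch. The function names are hypothetical, and how an empty string is represented is not covered by the layout above, so the sketch handles only lengths 1 through 256.

```python
def make_descriptor(length, address):
    """Pack a string length and a 24-bit data address into one 32-bit value."""
    if not 1 <= length <= 256:
        raise ValueError("length must be 1 through 256")
    if not 0 <= address <= 0xFFFFFF:
        raise ValueError("address must fit in 24 bits")
    return ((length - 1) << 24) | address

def split_descriptor(descriptor):
    """Unpack a descriptor into (length, address).

    Not meaningful for an empty string, whose encoding is outside
    the scope of this sketch.
    """
    return ((descriptor >> 24) & 0xFF) + 1, descriptor & 0xFFFFFF

descriptor = make_descriptor(5, 0x001234)
print(hex(descriptor), split_descriptor(descriptor))
```

Storing the length minus 1 in 8 bits is what gives the 256-character limit on XPL strings mentioned earlier.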

To make things a bit more confusing, STRING is the one built-in I'm aware of that is also commonly used as a name for a variable. If names of built-ins are declared as variables (or procedures), the user's declaration overrides the built-in for the scope of the declaration.

STRING_GT(DE₁,DE₂)

This function returns a TRUE value if the contents of string DE₁ is greater than the contents of string DE₂, based on the collating sequence of the characters, irrespective of the lengths of DE₁ and DE₂. Otherwise, the value is FALSE. This is functionally equivalent to padding the shorter of DE₁ or DE₂ with blanks and then comparing the strings.

(XPL/I only.) If this seems confusing, it may be helpful to recall that when the XPL relational operators <, >, <=, and >= compare two CHARACTER values, they look first at the lengths of the values, and only if the lengths are equal do they compare the actual character data. For example,

'ZZZ' < 'AAAA'

because 'ZZZ' has only 3 characters, while 'AAAA' has 4. Presumably STRING_GT was introduced because somebody at Intermetrics didn't think that kind of behavior was great, and thus STRING_GT('ZZZ', 'AAAA') will report instead that 'ZZZ' is greater than 'AAAA'.

The collating sequence in either case is EBCDIC rather than the more-usual ASCII. The primary visually-obvious consequence of this is that digits come after letters rather than before them.
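The difference between the two kinds of comparison can be sketched in Python. The function names are hypothetical, and using Python's `cp037` codec as the EBCDIC code page is an assumption made for illustration; the text above says only that the collating sequence is EBCDIC.

```python
def ebcdic(text):
    """Encode text as EBCDIC bytes (code page 037 assumed)."""
    return text.encode("cp037")

def xpl_less_than(a, b):
    """Model of the XPL '<' operator on CHARACTER values.

    Lengths are compared first; the character data is compared
    only when the lengths are equal.
    """
    if len(a) != len(b):
        return len(a) < len(b)
    return ebcdic(a) < ebcdic(b)

def string_gt(a, b):
    """Model of STRING_GT: pad the shorter string with blanks, then compare."""
    width = max(len(a), len(b))
    return ebcdic(a.ljust(width)) > ebcdic(b.ljust(width))

print(xpl_less_than("ZZZ", "AAAA"))   # the shorter string counts as smaller
print(string_gt("ZZZ", "AAAA"))       # character data decides instead
print(string_gt("A1", "AB"))          # in EBCDIC, digits collate after letters
```

The last call would give the opposite answer under ASCII ordering, where the digit '1' sorts before the letter 'B'.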

SUBSTR(DE,NE₁,NE₂) A function with descriptor value specifying the substring of the string specified by DE, starting at position NE₁ with length NE₂. I.e., it allocates and returns a new string created by taking a substring of length NE₂, starting at position NE₁ in string DE.

SUBSTR(DE,NE) Like the above except that all characters from NE to the end of the string are taken.

TIME A function with numeric value given by the time-of-day coded as centiseconds since midnight. See the notes for the DATE function above.

TIME_OF_GENERATION A word variable initialized with the value of TIME during compilation of the program. TIME (see above) isn't literally available during compilation. Moreover, XCOM-I implements this as a function rather than as a variable, which should be transparent to the programmer.

TRACE

A procedure,

CALL TRACE;

which causes activation of the instruction-by-instruction trace at runtime.

This is described in quite a lot of detail in McKeeman's Appendix 2. At present, it is accepted during compilation but does nothing at runtime in code generated by XCOM-I.

UNTRACE

A procedure,

CALL UNTRACE;

which turns off run-time trace.

See TRACE.

XPL_COMPILER_VERSION An array of length 2 — actually a function in the XCOM-I runtime library — in which the first element is the major version of the XPL compiler (XCOM-I in this case), and the second element is the minor version, both in the range 0-99. This is just speculation on my part, based on how an undeclared variable with this name is used when it appears a couple of times in PASS2 of HAL/S-FC. Nearby code also provides the interesting information that the XPL version name is "TITAN--XPL". You may recall my earlier observation that Intermetrics merged with Titan Corporation in March 2000, which is obviously the origin of the "TITAN" portion of the name. The other interesting point is that no effort is made to distinguish it in any way from standard XPL.

Of course, I have no inkling whatever of how the versions for the Intermetrics/Titan XPL compiler were numbered. I was tempted to set the major version to 32 and the minor version to 0, but that would only serve to obscure the fact that XCOM-I is in fact not TITAN--XPL, nor even related to it other than that I've tried to imitate it. Instead, XCOM-I will always return major and minor versions appropriate to itself: major = 0 for preliminary versions, 1 for the first release, and so on.

As an ad hoc measure, XCOM-I also transparently performs string substitutions replacing 'TITAN--XPL', wherever it may be found, by 'RSB-XCOM-I'.

The MONITOR Built-In Runtime-Library Function

As mentioned before, an XPL/I program obtained various services outside what the XPL/I language proper or runtime library could provide, by instead making requests to the separate "submonitor" program. The mechanism was a call of the MONITOR procedure. For XCOM-I, on the other hand, there is no separate submonitor program, and we may as well think of MONITOR as being just another built-in runtime-library function. Well, not just any runtime-library function. A big difference is that it provides a very large number of functions, each one of which can require its own unique syntax, thus necessitating a somewhat more-flexible discussion of how to use it.

The only uniform feature among the many aspects of MONITOR usage is that each separate function it provides is identified by a number, and such a function number is passed to MONITOR as its first parameter. My explanations in the table below are mostly pulled from Chapter 13 of IR-182-1, and then altered according to my understanding (or lack thereof). Functions 24 through 32 are deduced, poorly, from the HAL/S-FC BAL source-code file for the submonitor program (which happens to be called "MONITOR").

A number of the MONITOR(...) functions work with what's called "IBM hexadecimal floating-point" format, and specifically with the 64-bit (double-precision) version of that format, as opposed to the 32-bit (single-precision) version of it. To make the discussion more concise, I'll just refer to it as "DP floating point".

To be perfectly clear, there is no floating-point datatype in XPL/I, there are no floating-point literal constants, and there is no provision whatever to make it convenient for you (the programmer!) to hard-code such constants into your XPL source code, nor to interpret any such hexadecimal constant you find within legacy source code. Rather, you must somehow obtain the hexadecimal equivalents for whatever floating-point constants you wish to use, and then hard-code those hexadecimals into your code. For your convenience — or more accurately, for mine — I've included a little utility called ibmHex.py that you can use to convert back-and-forth between human-readable floating-point numbers and DP floating point. Just run ibmHex.py --help for instructions. This little utility can either be run in a stand-alone fashion, or else imported as a Python module. But I digress!

To understand DP floating point, imagine 8 groups of 8 bits each:

SEEEEEEE FFFFFFFF FFFFFFFF ... FFFFFFFF

where S is the sign, E is the exponent, and F is the fraction. (SP floating point is the same, but with 3 FFFFFFFF-groups rather than 7 of them.) The exponent is a power of 16, biased by 64, and thus represents 16⁻⁶⁴ through 16⁶³. The fraction is an unsigned number, of which the leftmost bit represents 1/2, the next bit represents 1/4, and so on. As a special case, 0 is encoded as all zeroes.

For example, the 64-bit hexadecimal pair 0x42640000 0x00000000 parses as:

Sign = 0 (i.e., positive)
Exponent = 16^(0x42-0x40) = 16² = 2⁸.
Fraction = 0.0110 0100 ... (binary)

or in total, 1100100 (binary), or 100 decimal.
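The layout just described can be sketched in a few lines of Python. This is only an illustration of the format, not the code of ibmHex.py or of the XCOM-I runtime library; the function names decode_dp and encode_dp are hypothetical, and the encoder simply truncates any fraction bits that do not fit.

```python
def decode_dp(high, low):
    """Decode a DP value given as two 32-bit words (most-significant word first)."""
    bits = ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)
    sign = -1.0 if (bits >> 63) & 1 else 1.0
    exponent = ((bits >> 56) & 0x7F) - 64             # power of 16, biased by 64
    fraction = (bits & 0x00FFFFFFFFFFFFFF) / float(1 << 56)
    return sign * fraction * 16.0 ** exponent


def encode_dp(value):
    """Encode a Python float as a (high, low) pair of 32-bit words (truncating)."""
    if value == 0.0:
        return (0, 0)                                  # special case: all zeroes
    sign = 1 if value < 0 else 0
    mantissa = abs(value)
    exponent = 0
    while mantissa >= 1.0:                             # normalize into [1/16, 1)
        mantissa /= 16.0
        exponent += 1
    while mantissa < 1.0 / 16.0:
        mantissa *= 16.0
        exponent -= 1
    if not -64 <= exponent <= 63:
        raise OverflowError("exponent out of range for DP floating point")
    fraction = int(mantissa * (1 << 56))               # 56 fraction bits
    bits = (sign << 63) | ((exponent + 64) << 56) | fraction
    return (bits >> 32, bits & 0xFFFFFFFF)
```

With these definitions, decode_dp(0x42640000, 0x00000000) reproduces the worked example above, and encode_dp(100.0) gives back the same pair of words.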

As in the preceding section, I want to make it clear that the descriptions given here are how the XCOM-I runtime library's MONITOR functions work, and not how the original MONITOR functions as confined to an IBM 360 runtime environment worked!

MONITOR Call Description From Original Documentation Additional Notes

CALL MONITOR(0,n); Closes output file n. I believe this refers to the sequential file or PDS attached for use with OUTPUT(n), and not the random-access file (if any) attached for use with FILE(n).

F=MONITOR(1,n,name); Assumes that a PDS is attached to output device n. Physically writes any data previously buffered in memory by OUTPUT(n) operations into the PDS's member named name. The name parameter is a string 8 characters long, padded with blanks as necessary. Returns 0 if the member hadn't existed previously in the PDS, or 1 if the contents of an existing member of that name was replaced.

F=MONITOR(2,n,name); Assumes that a PDS is attached to input device n. Sets future INPUT(n) operations to pull data from the PDS's member named name. The name parameter is a string 8 characters long, padded with blanks as necessary. Returns 0 if the member was found, or 1 if it was not found. The end-of-data for the member is detected when an input string of 0 length is encountered.

IR-182-1 asserts that devices 4 and 7 have the following abnormal ad hoc behavior very specific to HAL/S-FC. A PDS called "INCLUDE" is normally attached to input device 4, while either "INCLUDE" or "OUTPUT6" is normally attached to input device 7. But when a member name is sought on device 4 or 7, it is first sought in "INCLUDE" and, upon failure, is then sought in "OUTPUT6".

CALL MONITOR(3,n); Closes input file n. I believe this refers to the sequential file or PDS attached for use with INPUT(n), and not the random-access file (if any) attached for use with FILE(n).

CALL MONITOR(4,n,b); Changes the record size of random-access file n to b. Must precede the first use of FILE(n).

CALL MONITOR(5,ADDR(DW)); Sets the location of the double-word work area (DW) for subsequent use by MONITOR(9,...) and MONITOR(10,...). DW is meant to be literal here, at least for HAL/S-FC, and should not be replaced by anything else.

DW refers to an array of FIXED variables used in pairs, to hold 64-bit values. In particular, these pairs are often used to hold double-precision floating-point numbers in IBM System/360 format. MONITOR(9,op) (see below) is then capable of performing various arithmetical operations on those numbers. Since XPL/I itself has no facilities for floating-point variables or operations on them, this workaround is the only available way to employ floating-point numbers in XPL/I programs.

In legacy HAL/S code, I've found that at least 14 words of memory are needed for this work area. In the XCOM-I implementation, the MONITOR(5) call is rejected if the address given is within 14 words of the end of memory.

F=MONITOR(6,ADDR(basedVariable),n);

Allocates n bytes of storage in free memory, clearing that memory to 0, and assigns the pointer of the basedVariable to point to that newly-allocated space. Returns 0 upon success, or 1 upon failure.

It would be natural to suppose that MONITOR(6) is the mechanism used to allocate memory for BASED variables declared in the XPL/I code. In general, this is not true. Memory for BASED variables declared in user XPL code comes from a large pre-allocated contiguous block of memory, and the sub-allocations of individual BASED variables within that large block are managed using means other than MONITOR(6) and MONITOR(7).

Aside: It appears to me that once upon a time, the intention may indeed have been for BASED variables to be managed by MONITOR(6). The evidence for this is a procedure called DESCRIPTOR_MONITOR in SPACELIB, which does try to allocate or deallocate space for BASED variables in just that fashion. However, DESCRIPTOR_MONITOR is never called, and some of the variables upon which it depends are never declared, making DESCRIPTOR_MONITOR uncompilable. (XCOM-I transparently deletes it, and other functions which are never called.) But in spite of the historical evidence of DESCRIPTOR_MONITOR, I see no such behavior in any of the active code, with a single exception: The memory-management system itself (i.e., SPACELIB) does have a BASED variable, DX, which it continues to manage using MONITOR(6) and MONITOR(7).

My observation is that MONITOR(6) must update more of the basedVariable than just its pointer to its data. Recall that a BASED variable is associated with a 28-byte structure known as a "dope vector", of which the pointer to the data comprises only bytes 0-3. Beyond that, bytes 4-5 give the size (in bytes) of each record, while bytes 8-11 give the total number of records allocated so far. (Not to be confused with the total number of records in use; for growth, more records may have been allocated than were strictly needed at that moment.) While MONITOR(6) may not need an entire dope vector to be present at the address it has been given, it needs the first 12 bytes to be present, so that it can use/update the fields just mentioned. I don't see evidence of that in the submonitor source code, but then I don't fully understand that source code; at any rate, this is how XCOM-I treats it.
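To make the byte offsets concrete, here is a minimal Python sketch that pulls those three fields out of the first 12 bytes of a dope vector. It assumes big-endian byte order (as on the IBM System/360) and treats bytes 6-7 as unexamined padding, since they aren't described above; the name parse_dope_prefix is hypothetical and is not part of XCOM-I.

```python
import struct

# Bytes 0-3: pointer to data; bytes 4-5: record size; bytes 6-7: skipped
# (an assumption); bytes 8-11: number of records allocated so far.
DOPE_PREFIX = struct.Struct(">IH2xI")


def parse_dope_prefix(raw):
    """Return (pointer, record_size, records_allocated) from a dope vector."""
    if len(raw) < DOPE_PREFIX.size:
        raise ValueError("need at least the first 12 bytes of the dope vector")
    return DOPE_PREFIX.unpack_from(raw, 0)
```

For instance, a 28-byte dope vector whose first 12 bytes are 00 00 10 00, 00 1C, 00 00, 00 00 00 05 would parse as a pointer of 0x1000, a record size of 28 bytes, and 5 allocated records.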

Note that MONITOR(6) increases the size of basedVariable by n bytes; it does not set the size to n bytes.

F=MONITOR(7,ADDR(basedVariable),n); Frees memory previously allocated via MONITOR(6,ADDR(basedVariable),n). However, basedVariable's pointer is not changed, and thus will continue to point to the freed area until explicitly changed. See the comments for MONITOR(6).

If n is less than the total number of bytes allocated for basedVariable, then the number of records available in basedVariable has shrunk, but basedVariable continues to be usable, with a reduced number of records.

Although described as having the address of a BASED variable as a parameter, this isn't entirely true. It is also sometimes the address of a FIXED variable.

CALL MONITOR(8, dev, filenum);

Set PDS DDNAME.

Apparently, the use of MONITOR(8) changed after the only documentation of it (in IR-182-1) was produced, leaving us to try to infer its usage from context in the HAL/S-FC and MONITOR source code. The purpose seems to be to change the association of files to device numbers, à la DD's in JCL, but to do so dynamically during execution rather than statically at program startup.

The dev parameter, an integer, would appear to be the logical device number as used in an XPL INPUT(dev) or OUTPUT(dev) (or in a HAL/S READ(dev) or WRITE(dev)).

The filenum parameter is trickier for me to understand, because I'm unclear how DD in JCL associates a dataset name with a file number, so take what I say with a grain of salt. My inference, subject to change, is that if you have a dataset identified with the DD name "INPUTn" (where n is a digit), then that is filenum=n; whereas a dataset named "OUTPUTn" is associated with filenum="80000000"|n. Recall that "80000000" is XPL-speak for 0x80000000. So the filenum is implicit in the dataset name, and the most-significant bit is used to indicate the direction of data flow.

For example, OUTPUT(4) invocations might normally output data to OUTPUT4, but you could associate it instead with OUTPUT2 via

CALL MONITOR(8, 4, "80000000" | 2);

and then reassociate it later with OUTPUT4 via

CALL MONITOR(8, 4, "80000000" | 4);
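The inferred mapping from DD names to filenum values can be sketched as follows. This is Python rather than XPL, purely for illustration; the function name filenum_for is hypothetical, and the mapping itself is only my inference as described above.

```python
OUTPUT_FLAG = 0x80000000  # most-significant bit marks an output dataset


def filenum_for(ddname):
    """Map a DD name such as 'INPUT4' or 'OUTPUT2' to a MONITOR(8) filenum."""
    for prefix, flag in (("OUTPUT", OUTPUT_FLAG), ("INPUT", 0)):
        digits = ddname[len(prefix):]
        if ddname.startswith(prefix) and digits.isdigit():
            return flag | int(digits)
    raise ValueError("expected a name of the form INPUTn or OUTPUTn")
```

Under that inference, filenum_for("OUTPUT2") yields 0x80000002, matching the third parameter of the first call above.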

The description "Set PDS DDNAME" given above comes from the associated program comment in MONITOR.bal, the source code for the MONITOR procedure. It would seem to imply that the function has something specifically to do with Partitioned Data Sets (PDS). It's difficult to see why that would be so; I think it merely reflects the specific usage in HAL/S-FC, which indeed relates to PDS. The XCOM-I runtime library allows it to be used for any PDS or sequential dataset. It is not applicable to random-access files.

Of course, there is no JCL, and consequently no DD statements, associated with an XPL or XPL/I file compiled with XCOM-I, nor with a HAL/S file compiled with HAL/S-FC. The description above has to be applied instead to command-line switches in a hopefully-obvious way.

F=MONITOR(9,op);

Performs floating point evaluation as specified by value of op. Operands are obtained from work area whose address was set up via a MONITOR(5) call. The first operand is taken from the first double word of the work area and the second operand from the second double word. The result is placed in the first double word of the work area. The return code is 0 if the operation succeeds, or 1 if the operation fails (under or overflow).

The values of op are:

OP	Function
1	arg1 + arg2
2	arg1 - arg2
3	arg1 * arg2
4	arg1 / arg2
5	arg1^arg2
6	sin(arg1)
7	cos(arg1)
8	tan(arg1)
9	exp(arg1)
10	log(arg1)
11	sqrt(arg1)

"arg1" refers to the DP floating-point number stored in the pair of work-area values DW(0),DW(1).

"arg2" refers to the DP floating-point number stored in the pair of work-area values DW(2),DW(3).

The DP floating-point result of the operation is stored back into DW(0),DW(1), although not all of the operations can necessarily produce results of full DP accuracy from DP operands.

The angular unit for trigonometric operations is the radian.

Note that the values stored in these registers of the work-area are in the binary format of "IBM hexadecimal floating-point", which is not used anywhere else in XPL memory (unless copied from the working area).
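The calling convention can be modeled loosely in Python. This is a simplified sketch and not the XCOM-I implementation: the work area is represented as a list of two ordinary Python floats standing in for the pairs DW(0),DW(1) and DW(2),DW(3), rather than as IBM-format words. Treating log as the natural logarithm, treating every kind of failure as return code 1, and leaving the work area untouched on failure are all assumptions of mine.

```python
import math

# Operation table indexed by op; single-operand functions ignore arg2.
OPS = {
    1: lambda a, b: a + b,
    2: lambda a, b: a - b,
    3: lambda a, b: a * b,
    4: lambda a, b: a / b,
    5: lambda a, b: math.pow(a, b),
    6: lambda a, b: math.sin(a),
    7: lambda a, b: math.cos(a),
    8: lambda a, b: math.tan(a),
    9: lambda a, b: math.exp(a),
    10: lambda a, b: math.log(a),   # assumed to be the natural logarithm
    11: lambda a, b: math.sqrt(a),
}


def monitor9(work_area, op):
    """Apply op to work_area[0] and work_area[1]; store the result in work_area[0]."""
    try:
        result = OPS[op](work_area[0], work_area[1])
    except (KeyError, ArithmeticError, ValueError):
        return 1                    # unknown op, overflow, or domain error
    if not math.isfinite(result):
        return 1
    work_area[0] = result
    return 0
```

For example, with work_area = [2.0, 3.0], the call monitor9(work_area, 1) returns 0 and leaves 5.0 in work_area[0].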

F=MONITOR(10,string); Performs character to DP floating-point conversion upon characters in string. The return code is 0 if the result is valid, or 1 if conversion was not possible. The result is placed in the first double word of the work area provided by the MONITOR(5) call. In other words, string is interpreted as a DP floating-point number and stored in DW(0),DW(1). Any of the usual representations for decimal numbers are accepted in the string, including the usual E notation for exponents.

CALL MONITOR(11); No-op.

string=MONITOR(12,p);

Converts floating-point number in the first double word of the work area to standard HAL/S character form. Value of p indicates whether the operand is SP (p=0) or DP (p=8).

In other words, this is the inverse of MONITOR(10,string) (see above).

As far as the "standard HAL/S character form" is concerned, it's described in HAL/S documentation (Programming in HAL/S, p. 3) as follows:

0.0: Printed as " 0.0" (notice the leading space).
Positive: Printed as " d.ddd...E±ee"
Negative: Printed as "-d.ddd...E±ee"

Except for the special case 0.0, the number of printed fractional digits is always the same, although we're not told exactly how many that is: merely that it is "implementation dependent". I believe, without any basis — and therefore XCOM-I implements — that the number of fractional digits is chosen to provide the maximum accuracy, which is 6 for single precision and 15 for double precision.
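As a rough illustration of that output format, here is a Python sketch using ordinary Python floats. The function name hal_char_form is hypothetical, the digit counts (6 and 15) are the ones I assumed above, and the two-digit exponent comes from Python's own E formatting rather than from any HAL/S specification.

```python
def hal_char_form(value, p):
    """Format value in the assumed 'standard HAL/S character form'.

    p=0 selects single precision (6 fractional digits) and p=8 selects
    double precision (15 fractional digits).
    """
    if value == 0.0:
        return " 0.0"                       # special case, with leading space
    digits = 15 if p == 8 else 6
    body = "{:.{}E}".format(abs(value), digits)
    return ("-" if value < 0 else " ") + body
```

So hal_char_form(100.0, 0) gives " 1.000000E+02", with the leading space taking the place of the sign.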

address=MONITOR(13,name);

Performs DELETE of current option processor and then LOADs an option processor specified by name. The option processor loaded is called and passed a pointer to the PARM field in effect at the time of compiler invocation. The option processor passes the PARM field and establishes an options table (see Chapter 9 [of document IR-182-1]) whose address is passed back as a return value. If name is a null string, the pointer to the existing options table is returned.

See the earlier discussion concerning program options. In brief, each XPL or XPL/I program has some defined set of "Type 1" options and "Type 2" options. The former are binary (on/off) options, whereas the latter are options with integer or string values.

MONITOR(13,name) returns an address which is used by the XPL program to find options settings for it that have been set up by the run-time library. In C terms, it's the equivalent of argv[] but with more behind-the-scenes interpretation of the parameters. Those program options would originally have been provided by the Job Control Language (JCL) which initiated execution of the program, by means of a "PARM field" given on one of the JCL cards. For example, in the JCL

//XPL EXEC PGM=MONITOR,
//         PARM='SYTSIZE=1800,REFSIZE=20000,LISTING2,$I,$V,$U,$W'

the PARM field is the string 'SYTSIZE=1800,REFSIZE=20000,LISTING2,$I,$V,$U,$W', and the comma-delimited items are the individual options selected. Presumably, the operating system wouldn't have been able to directly interpret the meaning of this PARM field on its own, since the types of options would have been specific to the program being run. So an "option processor" specific to the allowed parameters must be loaded to perform that interpretation. In the case of the Intermetrics version of HAL/S-FC, the particular options processor for compiler pass 1 was name='COMPOPT ', whereas it was an empty string for passes 2 and 3, and 'LISTOPT ' for pass 4. (Note the trailing spaces to make the non-empty strings come out to 8 characters. These trailing spaces are not required in XCOM-I.)
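The first step any such option processor must perform, splitting the PARM field into individual options, can be sketched in Python. This is a deliberately naive illustration: it assumes that any item containing "=" is a type 2 option and everything else is a type 1 option, and the name split_parm is hypothetical. The real option processors (COMPOPT and so on) also know the permitted option names, defaults, and negated forms, none of which is modeled here.

```python
def split_parm(parm):
    """Split a PARM field into (type1_options, type2_options)."""
    type1 = []   # on/off options, in the order given
    type2 = {}   # name -> value (kept as a string)
    for item in parm.split(","):
        item = item.strip()
        if not item:
            continue
        if "=" in item:
            name, value = item.split("=", 1)
            type2[name] = value
        else:
            type1.append(item)
    return type1, type2
```

Applied to the PARM field in the JCL above, this yields the type 1 list LISTING2, $I, $V, $U, $W and the type 2 settings SYTSIZE=1800 and REFSIZE=20000.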

Again, just to be doubly clear, MONITOR(13) provides access to options settings not for XCOM or XCOM-I, but rather for the XPL program being compiled by XCOM-I, and specifically for HAL/S-FC.

XCOM-I supports the pre-existing built-in options processors 'COMPOPT', 'LISTOPT' and 'MONOPT'. (MONOPT provides the parameters described in McKeeman, but they are passed to the XPL/I program in the Intermetrics manner rather than being confined to the submonitor as in McKeeman.) Plus, it is possible (as described in the discussion of program options) for the user to define their own custom options processor called 'USEROPT'. Whenever MONITOR(13) is called, it saves the name and the PARM field to a file called "monitor13.parms". If called with an empty name, or more typically as MONITOR(13,0), it simply reloads the contents of monitor13.parms and reuses those.

The address returned by MONITOR(13) is a pointer in memory to the beginning of a block of 6 consecutive FIXED values:

The first is a FIXED value (called OPTIONS_CODE or OPTION_BITS) that collects 32 bit flags, each of them (in principle) itself representing a type 1 (on/off) option. Refer to the source-code files COMPOPT.bal, LISTOPT.bal, and MONOPT.bal (in the MONITOR.ASM folder of the HAL/S-FC source code) for the detailed list of which options correspond to which of the 32 bits.
The remaining five values are the pointer addresses for BASED FIXED arrays: CON, PRO, TYPE2, VALS, NPVALS (or MONVALS), each of which the options processor has already allocated memory for and equipped with data:

CON: This is an array of string descriptors for the selected type 1 options. HAL/S-FC uses it just for printing a report in pass 1, while using OPTIONS_CODE in place of it for any more-significant purpose. Our earlier section on program options lists the order in which the parameters appear in CON.
PRO: In principle, this is like CON, but instead contains the opposites of the selected options. For example, if "LISTING2" appears in CON, then "NOLISTING2" appears in PRO.
TYPE2: This array contains the names (as string descriptors) of options of type 2 (i.e., options which can be assigned values other than on/off). They are presented in the order listed in the discussion of program options.
VALS: This array contains the values of the options of type 2, in the same order as the names in the TYPE2 array. Once again, see the discussion of program options.
NPVALS/MONVALS: This is like CON/PRO, except that in principle it contains values of options which are not among the list printed by HAL/S-FC, which roughly speaking, is half of them.

F=MONITOR(14,n,a);

For creating Simulation Data Files. The value of n selects a function; the value of a supplies supplementary data:

`n`	Function	`a`
0	Open	option flags
1	Write	area address
2	Stow & Close	member name

Unfortunately, the only documentation of this is from 1976, and it may be somewhat preliminary in nature.

For example, from the source code of HAL/S-FC (see the OUTPUT_SDF procedure in PASS3), the function numbers (n) appear to be 0, 4, and 8 (rather than 0, 1, and 2 as it says in the neighboring column).

It appears to me that it's used somewhat like this:

First you "open" the SDF file. It apparently consists of "pages", numbered from 0 upward. The only "option" you can give it is the maximum page number.
Then you "write" to the file a page at a time, starting at page 0. Each "write" transfers data from the "area address" you specify to the next higher page in the buffer.
Finally, you "stow&close" the SDF file. This is the step which writes the memory buffer out to the actual SDF file (if it hasn't been doing so already), which is a partitioned data set (PDS). The member name of the PDS is assigned at this point. Note that the member name can be one that already exists in the PDS, which will cause the existing member to be overwritten.

Thus you start with a presumably-empty PDS (which happens to be attached to output device 5), and end up with a PDS populated with lots of named members, each one of which consists of some integral number of "pages".

I=MONITOR(15);

Returns Revision Level and Catenation Number from last MONITOR(2) call. Catenation number is obtained from PDS directory data and Revision Level from user data field as specified in the HAL/SDL ICD. The values are returned in the left and right halfwords of the result.

I'm not fully up-to-speed as to what the "revision level" and "catenation number" are, or how to obtain them. Both terms refer to the properties of Partitioned Data Sets (PDS). At present, the XCOM-I runtime library merely returns the hexadecimal value F0F00000. As far as I know, the values returned by this function are only for printing on compiler reports, so the value returned may not be terribly significant.

What follows is my (poor!) attempt to understand what is supposed to be going on. For this discussion, think of PDS as being a directory on a disk drive, and think of the members of the PDS as being files within that directory.

"Revision level": Apparently, whenever a member of a PDS is overwritten, the preceding revisions of that member are not discarded, but rather are persistent, and each is assigned a "revision level". Or at least, whether or not the data is persistent, the system is cognizant of the number of times the data has been overwritten. Logically, the revision levels are 0 through 99, but those are represented in the MONITOR(15) return value as pairs of EBCDIC digits. The revision increments with each overwrite. In other words: F0F0 ("00"), F0F1 ("01"), and so on. What happens after revision 99 isn't clear to me.
"Catenation number": Also apparently, it was possible within JCL to attach a list of PDS to a given device number, versus just a single PDS as currently allowed by XCOM-I. Then, presumably, when searching for a PDS member using MONITOR(2), those PDS would be searched in sequence until the desired member was found. The "catenation number" was the index within that list of PDS at which the member was found.
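If my reading is right, the return value could be assembled as in the Python sketch below. The placement of the revision level in the left halfword and of the catenation number, as a plain binary integer, in the right halfword is an assumption based only on the F0F00000 value mentioned above; the names are hypothetical and this is not code from the XCOM-I runtime library.

```python
def ebcdic_digit_pair(level):
    """Encode 0..99 as two EBCDIC digit characters packed into a halfword."""
    if not 0 <= level <= 99:
        raise ValueError("revision level must be in the range 0-99")
    tens, ones = divmod(level, 10)
    return ((0xF0 | tens) << 8) | (0xF0 | ones)   # EBCDIC '0'..'9' = F0..F9


def monitor15_value(revision, catenation):
    """Pack revision (left halfword) and catenation (right halfword)."""
    return (ebcdic_digit_pair(revision) << 16) | (catenation & 0xFFFF)
```

With revision 0 and catenation 0, this produces the same F0F00000 that XCOM-I currently returns.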

Regarding potential future implementation of these concepts in the XCOM-I runtime library, I think it would involve prefixing the revision level to the filenames in which PDS member data is stored.

CALL MONITOR(16,n); Sets flags in byte to be returned as high order byte of return code at end of compilation. Flags are passed as rightmost byte of full word n. If high order bit of n is zero, flags are OR'ed into existing flags. If high order bit of n is one, flags replace existing flags. The flags are indeed accumulated in a global variable called flags16 internal to the XCOM-I runtime library, but are not currently used as an exit code.
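The OR-versus-replace rule for MONITOR(16) can be written out as a small Python model. The class and method names are hypothetical; only the flag arithmetic follows the description above.

```python
class Flags16:
    """Toy model of the flag byte accumulated by MONITOR(16,n)."""

    def __init__(self):
        self.flags = 0

    def monitor16(self, n):
        n &= 0xFFFFFFFF                 # treat n as an unsigned full word
        new_flags = n & 0xFF            # flags are the rightmost byte of n
        if n & 0x80000000:              # high-order bit set: replace
            self.flags = new_flags
        else:                           # high-order bit clear: OR in
            self.flags |= new_flags
        return self.flags
```

Calling monitor16(0x01) and then monitor16(0x02) on a fresh object leaves the flags at 0x03, whereas a subsequent monitor16(0x80000004) resets them to 0x04.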

CALL MONITOR(17,name); Causes name to be copied to third parm field (if any) passed to MONITOR by the program that invoked the compiler. See HAL/SDL ICD. The name is presently saved in a variable called programNamePassedToMonitor within the XCOM-I runtime library, but is not used for anything.

T=MONITOR(18); Returns elapsed CPU time since beginning of run in units of .01 seconds.

F=MONITOR(19,addressList,sizeList); This is an extended form of MONITOR(6) (see above), accepting arrays of addresses of BASED variables and their block sizes. Unlike MONITOR(6), the newly-allocated blocks of memory are not cleared to 0. XCOM-I does not implement this function. Besides, it's unclear how to represent a "list".

CALL MONITOR(20,addressList,sizeList); This is an extended form of MONITOR(7) (see above), accepting arrays of addresses of BASED variables and their block sizes. XCOM-I does not implement this function.

I=MONITOR(21); Returns remaining amount of memory (in bytes) that's free for allocation via MONITOR(6). For Intermetrics XCOM, I believe, MONITOR(6) could use any contiguous block of free memory, including those that were candidates for garbage collection. Whereas XCOM-I only allocates memory within the contiguous block at the end of the free-memory space. Consequently, this function would originally have found the largest free block wherever it was located, whereas XCOM-I merely reports the size of the final block of free memory.

F=MONITOR(22,0,n₂);

F=MONITOR(22,n₁);

Calls SDF access package.

Note: This function is not yet implemented in the XCOM-I runtime library. However, unlike other unimplemented functions (whose use results in a runtime abend), this function can be used without negative effect other than printing an informative message on stderr.

SDF stands for Simulation Data File. The "SDF access package" (SDFPKG) was a separate program, similar to the submonitor itself, with which the XPL/I program could communicate via MONITOR(22). And like the submonitor, in the XCOM-I framework, SDFPKG equivalents are built into the runtime library rather than being provided by a separate SDFPKG program.

Documentation can be found in TBD. There is not space for a full description here, so I'm only giving a brief summary; the documentation should be consulted for more details.

The first form, F=MONITOR(22,0,n₂), initializes the SDFPKG functionality. The n₂ parameter points to an array of 30 FIXED values which serves as a "communications area" holding data for subsequent calls.

For calls of the second form, F=MONITOR(22,n₁), the n₁ parameter specifies a function number ("mode"):

1. Terminate SDFPKG.
2. Augment paging area.
3. Rescind paging area augments.
4. Select an SDF.
5. Locate pointer.
6. Set disposition parameters.
7. Locate directory root cell.
8. Locate block data cell given block number.
9. Locate symbol data cell given symbol number.
10. Locate statement data cell given statement number.
11. Locate block data cell given block name.
12. Locate symbol data cell given block name and symbol name.
13. Locate symbol data cell given only symbol name.
14. Locate statement data cell given SRN.
15. Locate block node given block number.
16. Locate symbol node given symbol number.
17. Locate statement node given statement number.
18. Locate initialization data given symbol number.
19. Deselect an SDF.

There's some confusion in the available software/documentation, in that sometimes function #19 is missing; and when #19 is missing, #18 may either be as shown or else may be deselection of the SDF. I have no explanation for that discrepancy. However, the functional lineup listed above is consistent with the version of HAL/S-FC source code we have.

Additionally, the function number n₁ may have high-order bits set, such as 0x10000006 or 0x80000007. The significance of that is TBD.

The return code, F, for all calls is 0 upon success, or non-zero for various failures. See section 11 of the documentation.

descriptor=MONITOR(23); Returns the descriptor of the 10-character string obtained from the ID field of the File Control Block of the first phase of the compiler. The ID field is maintained by the XPLZAP program and contains the identifying string printed on the header of each page of the HAL listing. I think that what the original documentation was trying to get at is that HAL/S-FC's version code, printed on each of the report pages printed by XCOM (or now, XCOM-I), is given by this ID string, and that the ID string is provided to each pass of HAL/S-FC via a call to MONITOR(23). Of course, in the modern reimplementation, there is no XPLZAP program, and it wouldn't be compatible with XCOM-I if there were one. In XCOM-I, the ID string itself is by default 'REL32V0 ', but it can be changed via the XCOM-I command-line parameter --identifier.

As for what a "descriptor" is, see the description of the CHARACTER datatype. In brief, the return value is not itself the ID string, but an integer value (i.e., of the FIXED datatype) that provides an index for finding the ID string in memory.

MONITOR(24) Read a block of a load module. TBD

MONITOR(25) Read a mass-memory load block. TBD

MONITOR(26) Read a MAF (memory analysis file) block TBD

MONITOR(27) Write a MAF block TBD

MONITOR(28) Link to dump analysis service routine TBD

MONITOR(29) Return current page number TBD

MONITOR(30) Return JFCB as string TBD

CALL MONITOR(31,n,recnum);

Virtual-memory lookahead service.

This function is seemingly only used internally by the virtual-memory system.

In looking at the usage in actual XPL/I code, as well as the assembly-language source code for the submonitor (see VMEMSRV in MONITOR.bal), which admittedly I can only follow very imperfectly, it appears to me that this function is used in three distinct ways:

MONITOR(31,n,-1), n > 0, sets the default file number for future MONITOR(31,...) calls to n. The default file number is initially 2.
MONITOR(31,bufferAddress,recnum) , recnum ≥ 0, immediately initiates an operation that reads record recnum from the default file, into a memory buffer at address bufferAddress. In IBM 360 terms, that's an "XDAP" operation. I think it's like what we'd call a DMA operation now, so this read operation may take some time to complete, but it's a background operation while it's doing so.
MONITOR(31,0,-1) waits until a previously-initiated read operation on the default file has completed.

There's an additional subtlety — really, that's astounding?! —, in that the bufferAddress in MONITOR(31,bufferAddress,recnum) may sometimes have its most-significant bit set to 1, thus turning it into a negative number. I think in that case what happens is that the function waits for any background read-operation already in progress to complete, and then clears the sign bit of bufferAddress and commences a new read operation.

The XCOM-I runtime library doesn't really need any of these subtleties, so implements this function basically as immediate FILE operations, as appropriate.
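To summarize my reading of the three usages, plus the sign-bit subtlety, here is a Python sketch that merely classifies a MONITOR(31,arg,recnum) call. It performs no I/O, the function name and the action labels are hypothetical, and the classification is only as reliable as my understanding of VMEMSRV.

```python
SIGN_BIT = 0x80000000


def classify_monitor31(arg, recnum):
    """Return (action, value) for a MONITOR(31, arg, recnum) call."""
    arg &= 0xFFFFFFFF                       # view arg as an unsigned full word
    if recnum >= 0:
        if arg & SIGN_BIT:
            # Wait for any read in progress, then read into the address
            # obtained by clearing the sign bit.
            return ("wait-then-read", arg & ~SIGN_BIT & 0xFFFFFFFF)
        return ("read", arg)                # arg is the buffer address
    if recnum == -1:
        if arg == 0:
            return ("wait", 0)
        if not arg & SIGN_BIT:
            return ("set-default-file", arg)
    raise ValueError("combination not covered by the description")
```

For example, classify_monitor31(5, -1) reports that the default file number is being set to 5, while classify_monitor31(0, -1) reports a wait for a pending read.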

F=MONITOR(32) Find out subpool minimum size The memory-allocation and -deallocation functions (MONITOR 6, 7, 19, and 20) are supposed to get their memory from something called "SUBPOOL 22". SUBPOOLs apparently have block-sizes which are either 2048 or 4096 bytes, depending on the operating system. MONITOR(32) returns the appropriate block size for the operating system. Given that the block size is bigger in later versions of the operating system, I infer that bigger is better, within certain limits, and hence XCOM-I arbitrarily returns 4096.

MONITOR(33) Find out FILE max REC# and BLKSIZ TBD

Debugging XPL Programs

It is admittedly unlikely that many people will be writing new XPL or XPL/I programs nowadays, and will instead be compiling only legacy XPL or XPL/I programs ... assuming that I haven't already compiled all of them first! (Which is a distinct possibility.) Since such legacy programs will presumably all have been debugged decades before, there's not as much need for a debugger as there is for computer languages in which there are many active developers. Nevertheless, XPL or XPL/I programs can be run under a debugger to a certain extent.

The trick is to realize that while we have no debugger for the XPL language, we do have debuggers for C, and once XCOM-I has translated XPL or XPL/I software into C, such C debuggers can be used. It is merely necessary when compiling the C code to include the necessary command-line switches that cause the C compiler to incorporate debugging information into the executable. For the gcc C compiler, that command-line switch is -ggdb, and incorporating it can be done simply by adding it to the make command line via

make EXTRA=-ggdb ...

Having done this, any debugger which can accept gdb style debugging info, such as gdb itself or full-screen wrappers for gdb, can now be used for debugging. For other C compilers or other debuggers, perhaps different switches would be needed.

But debugging XPL code via its C translation is not as convenient as a native XPL debugger would be.

The principal difficulty in debugging the code in C vs in XPL is that XCOM-I doesn't model XPL variables as C variables, but rather as numerical locations in an array called memory. Moreover, the format of this data in "memory" is that of the IBM System/360 rather than the native format of the computer on which debugging is being performed. While in my opinion, these design choices for XCOM-I were all necessary and unavoidable, there's no denying that they make it trickier to use a debugger to see how the values of variables change during execution.

I have, however, provided a few C functions in the runtime library to make it a little less painful to examine XPL variables within a debugger. You can also use the XPL CALL INLINE feature to directly embed calls to these functions within your program, without running a debugger at all, if it's more convenient for you to do so.

The debugging functions are:

void printMemoryMap(char *heading) — This function prints out the entire current state of the memory map. I.e., the addresses of all variables, including dynamically-allocated ones, and the contents of all of those variables. Of course, for an XPL program of any complexity, the printout is quite long, so this function is presumably used sparingly. The heading parameter is simply a message printed at the top, which can be helpful if you call printMemoryMap several times (perhaps via CALL INLINE) in the same program run. In a gdb console, you could run it via the command call printMemoryMap("..."), whereas you could instead embed it in your XPL source code via CALL INLINE('printMemoryMap("...")');.
char *getXPL(char *identifier) — Returns (as a C string) the value of a single XPL variable whose name is given by the identifier. The identifier string can be any identifier expression that's syntactically correct in XPL/I, provided that subscripts consist entirely of decimal digits, possibly with a leading minus sign. This includes expressions like "V", "V(5)", "B(3).V", or "B(3).V(-5)". Recall that in XPL/I, subscripts can be applied to scalar variables. If you are querying a BASED RECORD, then be sure to include the desired field, since while getXPL knows how to print an individual field of a RECORD, it does not know how to print a collection of fields such as a RECORD. In a gdb console, you could use getXPL with a command like print getXPL(...). For example, print getXPL("C1(4)").
void printXPL(char *identifier) — This provides the same functionality as getXPL, except that it prints its output to stdout rather than returning it as a string. In a gdb console, you could run it via the command call printXPL("..."), or you could instead embed it in your XPL source code via CALL INLINE('printXPL("...")');.
int bitBits — By default, getXPL and printXPL print the data of a BIT variable in hexadecimal notation. In contrast, legacy XPL source code has often logically partitioned the data in BIT strings into subgroupings of 1, 2, or 3 bits (rather than 4 as for hexadecimal), thus using literal constants (such as initializers in declarations) that are in binary, base-4, or octal notation. In those cases, it's difficult to relate hexadecimal strings returned by getXPL with the literals shown in the XPL source code. The global variable bitBits addresses this by allowing you to change the radix used for the BIT data. By default bitBits is 4 (hexadecimal), but can be changed to 1 (binary), 2 (base-4), or 3 (octal). In a gdb console, you could change via a command like set bitBits=2.
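To make the effect of bitBits concrete, here is a minimal sketch of rendering the same BIT data with 1, 2, 3, or 4 bits per digit. The function formatBits and its parameters are hypothetical and written only for illustration; the text above does not show how getXPL formats its output internally, so none of this should be read as the runtime library's code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the XCOM-I runtime library): render the
 * low `width` bits of `value` using digits of `bitsPerDigit` bits each
 * (1 = binary, 2 = base-4, 3 = octal, 4 = hexadecimal), most-significant
 * digit first.  Returns the number of digits written, or 0 on bad input.
 */
size_t
formatBits(uint32_t value, int width, int bitsPerDigit, char *out,
           size_t outSize)
{
  static const char digits[] = "0123456789ABCDEF";
  if (width < 1 || width > 32 || bitsPerDigit < 1 || bitsPerDigit > 4)
    return 0;
  size_t n = (size_t) ((width + bitsPerDigit - 1) / bitsPerDigit);
  if (outSize < n + 1)
    return 0;
  if (width < 32)
    value &= (UINT32_C(1) << width) - 1;  /* Discard bits beyond the width. */
  uint32_t mask = (UINT32_C(1) << bitsPerDigit) - 1;
  for (size_t i = 0; i < n; i++)
    {
      out[n - 1 - i] = digits[value & mask];  /* Fill from the right. */
      value >>= bitsPerDigit;
    }
  out[n] = '\0';
  return n;
}
```

With a 6-bit value of binary 101101, this sketch yields "2D" at 4 bits per digit but "55" at 3 bits per digit, and the latter is the form that would match an octal literal in the XPL source.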

In using these debugging functions, note that they all require mangled forms of variable names and parameters of PROCEDUREs. Mangled names consist of the names of the variables or parameters as DECLAREd in the XPL source code, but prefixed by the names of all of the parent PROCEDUREs. Perhaps an example would make this clearer. Suppose your XPL source code looked like the following:

DECLARE X FIXED, Y FIXED, Z FIXED;

PROC1:
PROCEDURE(X, Y);
   DECLARE X FIXED, Y FIXED, Z FIXED;

   PROC2:
   PROCEDURE(X, Y);
      DECLARE X FIXED, Y FIXED, Z FIXED;
      ...
   END PROC2;

END PROC1;

Then the mangled variable and parameter names we'd find in our memory map, and would use in our debugging functions, are as follows (with the PROCEDURE names included just for reference):

X
Y
Z
PROC1
PROC1xX
PROC1xY
PROC1xZ
PROC1xPROC2
PROC1xPROC2xX
PROC1xPROC2xY
PROC1xPROC2xZ

Thus while we have lots of global and local variables and function parameters named X, Y, and Z, their mangled names are all distinct.

Aside: By the way, the lower-case 'x' characters appearing in the mangled names have nothing to do with the fact that one of our identifiers is 'X'. That's a coincidence. Rather, they're just convenient separators XCOM-I conventionally uses between scope names and the variable names. Recall that XPL identifiers are case-insensitive. XCOM-I translates them all internally to upper case. Hence, lower-case 'x' is not a character that can appear in unmangled identifiers or names of scopes.
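The mangling rule is simple enough to sketch in a few lines of C. The function mangleName below is hypothetical (it is not an XCOM-I function); it merely joins the enclosing PROCEDURE names, outermost first, and the identifier with lower-case 'x' separators, upper-casing everything else as described above.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: build a mangled name from the names of the enclosing
 * PROCEDUREs (outermost first) and the identifier as DECLAREd.
 * Returns 0 on success, or -1 if the output buffer is too small.
 */
int
mangleName(const char *scopes[], int numScopes, const char *identifier,
           char *out, size_t outSize)
{
  size_t used = 0;
  for (int i = 0; i <= numScopes; i++)
    {
      const char *part = (i < numScopes) ? scopes[i] : identifier;
      size_t len = strlen(part);
      size_t sep = (i < numScopes) ? 1 : 0;  /* An 'x' follows each scope. */
      if (used + len + sep + 1 > outSize)
        return -1;
      for (size_t j = 0; j < len; j++)
        out[used++] = (char) toupper((unsigned char) part[j]);
      if (sep)
        out[used++] = 'x';
    }
  out[used] = '\0';
  return 0;
}
```

For the scopes PROC1 and PROC2 and the identifier Z, this produces PROC1xPROC2xZ, which is the string you would pass to a function like getXPL.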

Patches for Insertion of Arbitrary Inline Code

A serious difficulty in working with XPL or XPL/I code is that they are often not written entirely in XPL/I, but also have insertions of inline machine code at various junctures.

For example, XCOM45 and HAL/S-FC source code both have such inlined code. However good XCOM-I may (or may not) be at translating XPL or XPL/I to C, it is not capable of inferring the intent of arbitrary assembly language, nor of translating such code into C with the level of reliability required for automating it. While it can help with such translation (as described in the next section), some amount of manual intervention is still required on the part of the coder. The basic principles underlying the framework provided for that are discussed in this section.

Aside: In reality, such inlined code is numerical in nature, and consists of IBM 360 machine code ... whereas the term "assembly language" implies that the insertions are symbolic in nature rather than just numbers. I'll continue to call the insertions BAL, because they undoubtedly began as BAL before somebody assembled them to determine which numerical values to use. But it is certainly technically incorrect to do so.

Of course, one possible approach would be to replace offending inline code directly by C code in the XPL source-code files of the program being compiled, using the property of the XCOM-I built-in INLINE function,

CALL INLINE('...arbitrary C code of no more than 256 characters...');

In your use case, whatever that may be, perhaps that's adequate. However, the Virtual AGC Project aims to provide a way to work with legacy code (i.e., code from the past, rather than newly-written code), and that implies being able to compile such source code as-is, without syntactical or functional modification. Which means that from our point of view, irreversible modification of legacy source code is unacceptable.

Aside: It is, of course, arguable the extent to which modifications are acceptable. The Virtual AGC Project allows insertion of modern comments, as long as modern comments are clearly distinguishable from original program comments. It has occasionally allowed clearly-commented simple changes of a line or two of code, when it was thought impossible for modern compilers to duplicate some undocumented behavior of the original compilers ... although improvements to the modern compilers almost always eventually allowed reversal of such modifications. My personal judgment is that replacement of dozens or hundreds of lines of code by code that's not obviously functionally identical, written in another language, is taking it farther than I'd like. Better to leave the source code the same, so that it's verifiable by the user.

Don't worry! XCOM-I lets us satisfy these criteria. But to understand how, you need to know a little more about this IBM Basic Assembly Language (BAL) code that's causing the problem.

Let's let a real-life example guide the discussion. The XPL source-code for SUNY's XCOM45 program — which as you'll recall is an XPL compiler from the State University of New York in the mid-1970's — contains a single block of 19 consecutive CALL INLINE statements that we need to work around if we want to compile and run a version of XCOM45 with XCOM-I:

SCAN_FINDS_END_OF:
   PROCEDURE(TABLE) BIT(1);
      DECLARE TABLE FIXED;
      DECLARE TRT(1) FIXED INITIAL("DD003000", "10000000");

      /* THIS PROCEDURE ASSUMES CP = 0 WHEN IT IS CALLED */
      CALL INLINE("58", 3, 0, TEXT);     /* LOAD STRING DESCRIPTOR */
      CALL INLINE("1B", 2, 2);           /* CLEAR REG. 2 */
      CALL INLINE("19", 3, 2);           /* CHECK FOR TEXT = NULL STRING */
      CALL INLINE("07", 8, 12);          /* RETURN FALSE IF TEXT IS NULL */
      CALL INLINE("8D", 2, 0, 0, 8);     /* SHIFT LENGTH FIELD TO REG. 2 */
      CALL INLINE("88", 3, 0, 0, 8);     /* RESTORE STRING ADDRESS  */
      CALL INLINE("58", 1, 0, TABLE);    /* LOAD BASE ADDRESS OF TRANS_TABLE */
      CALL INLINE("41", 12, 0, TRT);     /* LOAD ADDRESS OF TRT INSTRUCTION */
      CALL INLINE("44", 2, 0, 12, 0);    /* EXECUTE TRT INSTRUCTION */
      CALL INLINE("05", 12, 0);          /* LOAD BRANCH REGISTER  */
      CALL INLINE("47", 7, 0, 12, 18);   /* BRANCH ON CC~=0  */
      CALL INLINE("18", 1, 2);           /* LOAD REG. 1 WITH LENGTH(\) - 1 */
      CALL INLINE("41", 0, 0, 0, 1);     /* LOAD REG. 0 WITH 1  */
      CALL INLINE("1A", 1, 0);           /* ADD TO OBTAIN IDENTIFIER LENGTH */
      CALL INLINE("1B", 3, 3);           /* PUT 0 INTO RETURN REGISTER */
      CALL INLINE("47", 15, 0, 12, 24);  /* BRANCH AROUND NEXT TWO INSTR*/
      CALL INLINE("1B", 1, 3);        /* SUBTRACT TO OBTAIN IDENTIFIER LENGTH */
      CALL INLINE("41", 3, 0, 0, 1);     /* LOAD 1 INTO RETURN REGISTER */
      CALL INLINE("50", 1, 0, CP);       /* STORE IDENTIFIER LENGTH */
   END SCAN_FINDS_END_OF;

Actually, to be clear, XCOM-I will compile XCOM45 just fine, whether we work around this embedded assembly-language stuff or not; the compiled program simply won't run correctly, because vital IBM 360 instructions will be missing in the midst of the program. As it happens, though, part of the process of creating the workaround depends on us compiling XCOM45 in advance, without any workarounds in place. So let's just do that. In the compilation process, the XPL procedure SCAN_FINDS_END_OF turns into a C file called SCANxSCAN_FINDS_END_OF.c that contains the following harmless but useless C representation of it:

/*
  File SCANxSCAN_FINDS_END_OF.c generated by XCOM-I, 2024-05-19 12:14:14.
*/

#include "runtimeC.h"
#include "procedures.h"

bit_t *
SCANxSCAN_FINDS_END_OF(void)
{
  ; // (0) CALL INLINE( 88, 3, 0, TEXT);
  ; // (1) CALL INLINE( 27, 2, 2);
  ; // (2) CALL INLINE( 25, 3, 2);
  ; // (3) CALL INLINE( 7, 8, 12);
  ; // (4) CALL INLINE( 141, 2, 0, 0, 8);
  ; // (5) CALL INLINE( 136, 3, 0, 0, 8);
  ; // (6) CALL INLINE( 88, 1, 0, TABLE);
  ; // (7) CALL INLINE( 65, 12, 0, TRT);
  ; // (8) CALL INLINE( 68, 2, 0, 12, 0);
  ; // (9) CALL INLINE( 5, 12, 0);
  ; // (10) CALL INLINE( 71, 7, 0, 12, 18);
  ; // (11) CALL INLINE( 24, 1, 2);
  ; // (12) CALL INLINE( 65, 0, 0, 0, 1);
  ; // (13) CALL INLINE( 26, 1, 0);
  ; // (14) CALL INLINE( 27, 3, 3);
  ; // (15) CALL INLINE( 71, 15, 0, 12, 24);
  ; // (16) CALL INLINE( 27, 1, 3);
  ; // (17) CALL INLINE( 65, 3, 0, 0, 1);
  ; // (18) CALL INLINE( 80, 1, 0, CP);
  return 0;
}

Basically, XCOM-I has simply commented out each CALL INLINE statement, while replacing it with an empty statement. So why did we bother compiling it at all? The answer is that we want the parenthesized numbers that have been added to the comments! The parenthesized numbers will be the same every time we compile XCOM45, unless you take it upon yourself to mess us up by removing some of the CALL INLINE statements or adding new ones in the meantime. Pro tip: Don't do that!

Aside: There is even one (!) instance in HAL/S-FC in which there is a RETURN INLINE statement as well, and it is treated by XCOM-I the same way as CALL INLINEs are treated: namely, it is numbered and commented out. As it happens, McKeeman does not give even the slightest hint that INLINE can return a value. In point of fact, I don't think the value returned by the RETURN INLINE statement in the case I mention even comes from the INLINE at all, but is instead generated by other means. Technically, I think that if you have a block of CALL INLINEs terminated by a RETURN INLINE, then whatever value happens to remain in the CPU's R0 register will be returned, regardless of where in the block of INLINEs R0 was set up. But regardless of the underlying details, it demonstrates that syntactically, both CALL INLINE and RETURN INLINE are possible in XPL/I source code.

Important note: If you must write a C-language patch-file for a block of INLINEs that include an XPL RETURN INLINE(...), you should use the C macro RETURN() rather than simply return, or RETURN(something) rather than return something. That's because the RETURN(...) macro takes into account extra code which XCOM-I may need to insert in conjunction with XPL RETURN statements. For example, if XCOM-I is run with its --reentry-guard switch, then XCOM-I inserts code at the entry point and exit points of each XPL PROCEDURE that allows the runtime library to detect illegal reentries. (Recall that recursion is illegal in XPL.) The RETURN(...) macro handles this for you transparently, and if a simple return is used instead, then the runtime library will falsely detect recursion the second time any individual PROCEDURE is executed.
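The actual definition of RETURN(...) and of the guard code inserted for --reentry-guard is not reproduced here, so the following is only a guess at the general idea, using made-up names (GUARD_ENTER, falseAlarms, guard). It is meant to show why a bare return makes the second call to a PROCEDURE look like recursion.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real XCOM-I macro and guard code may differ. */
int falseAlarms = 0;  /* Counts apparent illegal reentries. */

/* On entry: complain if the flag is still set, then set it. */
#define GUARD_ENTER(flag) \
  do { if (flag) falseAlarms++; (flag) = true; } while (0)

/* On exit: clear the enclosing function's flag, then really return. */
#define RETURN(value) \
  do { guard = false; return (value); } while (0)

int
goodProcedure(void)
{
  static bool guard = false;
  GUARD_ENTER(guard);
  RETURN(1);  /* Clears the guard before leaving. */
}

int
badProcedure(void)
{
  static bool guard = false;
  GUARD_ENTER(guard);
  return 1;   /* Guard stays set, so the next call looks like recursion. */
}
```

Calling goodProcedure any number of times leaves falseAlarms at 0 in this sketch, whereas the second call to badProcedure increments it.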

In this particular example, since XCOM45 has a single block of CALL INLINE statements, we were guaranteed in advance that the block would start with number 0. In a more-complex example like HAL/S-FC, there would be a number of blocks of inlines, strewn throughout the source code, and in lieu of XCOM-I printing them all out for us the numbering would be unpredictable. But because the numbers never change, we can use them as unique, persistent identifiers for the CALL INLINE statements.

Why is that important? The way we're going to work around the CALL INLINE statements is to create a series of numbered "patch files", written in C: patch0.c, patch1.c, patch2.c, and so on. The patch files are placed in the same folder that contains the source code of the XPL program being compiled, though XCOM-I does have a command-line option (--patch=PATHNAME) that allows a different folder to be specified. When XCOM-I encounters a CALL INLINE statement, it looks for a patch file with the same number, and if it finds one, it directly inserts the contents of the patch file into the object code. If there's no matching patch file, then the CALL INLINE simply remains a comment. In principle, you could have a different patch file for each CALL INLINE, but it's usually more practical and more aesthetically pleasing to have a single patch file for each contiguous block of CALL INLINE statements.

Note: In working with the source code for HAL/S-FC specifically, it's a little more complex than what I just indicated. As I've mentioned previously, there are two different methods of compiling HAL/S-FC source code: One to create a compiler intended to compile Primary Flight Software (PFS) via XCOM-I's --cond=P command-line switch, and one to create a compiler intended to compile Backup Flight Software (BFS) via XCOM-I's --cond=B command-line switch. Unfortunately, the numbering of the CALL INLINE statements is a bit different in these two cases, which makes it unwieldy to rely on the generic naming patchN.c for the patch files. My description above is what happens when neither --cond=P nor --cond=B is used. But if --cond=P is used, then XCOM-I actually looks for patch files with names like patchNp.c first, falling back to patchN.c if patchNp.c isn't found. And similarly, if --cond=B is used, it prefers patch files named patchNb.c, falling back to patchN.c. This method allows the generic names patchN.c to be used up to the point at which the numbering of the INLINEs begins to differ, switching to the patchNp.c/patchNb.c naming thereafter, without any ambiguity or conflicts between the two.
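The lookup order just described can be sketched as follows. XCOM-I's own implementation is not shown in this text, so the function choosePatchFile, its parameters, and the demo predicate demoExists are all hypothetical; the existence check is passed in as a function so that the logic can be exercised without touching the filesystem.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical sketch of the patch-file lookup order.  `cond` is 'P', 'B',
 * or 0 (neither --cond=P nor --cond=B was given).  Writes the chosen
 * filename to `out` and returns 1, or returns 0 if no patch file applies
 * (in which case the CALL INLINE simply remains a comment).
 */
int
choosePatchFile(int n, char cond, int (*exists)(const char *), char *out,
                size_t outSize)
{
  if (cond == 'P' || cond == 'B')
    {
      /* Prefer the variant-specific name, patchNp.c or patchNb.c. */
      snprintf(out, outSize, "patch%d%c.c", n, cond == 'P' ? 'p' : 'b');
      if (exists(out))
        return 1;
    }
  /* Fall back to the generic name, patchN.c. */
  snprintf(out, outSize, "patch%d.c", n);
  if (exists(out))
    return 1;
  if (outSize > 0)
    out[0] = '\0';
  return 0;
}

/* Demo predicate: pretend that only these three patch files exist. */
int
demoExists(const char *name)
{
  return strcmp(name, "patch0.c") == 0 || strcmp(name, "patch5p.c") == 0
      || strcmp(name, "patch5b.c") == 0;
}
```

With that demo predicate, a request for patch 0 under --cond=P falls back to patch0.c, while a request for patch 5 resolves to patch5p.c or patch5b.c depending on the switch, and finds nothing at all if neither switch is given.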

In this real-life example, since there's a single block of CALL INLINE statements, we need a single patch file, patch0.c.

/*
 * This is a C-language "patch" for CALL INLINE #0 in xcom4.xpl, as compiled
 * by XCOM-I.py.
 *
 * The patch spans the entirety of the `SCAN_FINDS` procedure embedded in
 * the `SCAN` procedure.  This comprises every `CALL INLINE` statement in
 * the entire xcom4 program.
 *
 * What the procedure does is this:  It's given the address of a "translation
 * table" of one of the following types:
 *      `BLANKTABLE`  (Blanks)
 *      `ALPHATABLE`  (Identifier)
 *      `STRINGTABLE` (Quoted string)
 *      `COMMENTABLE` (Comment)
 * Starting at position `CP=0` (already set up upon entry so that we know the
 * first character is of the asked-for type) in the string `TEXT`, searches
 * for the end of the pattern of the selected type, and updates `CP` to the
 * length of the found string.  It returns 0 on failure and 1 on success.  An
 * empty string counts as a failure.
 *
 * This was apparently coded in BAL because the IBM 360 has a translation-table
 * instruction, TRT, that does this efficiently.  But from my perspective,
 * considering that there's no other embedded BAL code in the entire program,
 * it seems as though it's a big price to pay, maintenance-wise and
 * portability-wise, for a pretty small gain in efficiency.  Yes, you can
 * always make assembly-language go faster than a high-level language; this is
 * news?
 */

static memoryMapEntry_t *mapCP = NULL, *mapTEXT = NULL, *mapTABLE,
      *mapBLANKTABLE, *mapALPHATABLE, *mapSTRINGTABLE, *mapCOMMENTABLE;
char *s, *TEXT;
uint32_t TABLE;

if (mapCP == NULL)
  {
    mapCP = lookupVariable("CP");
    mapTEXT = lookupVariable("TEXT");
    mapTABLE = lookupVariable("SCANxSCAN_FINDS_END_OFxTABLE");
    mapBLANKTABLE = lookupVariable("BLANKTABLE");
    mapALPHATABLE = lookupVariable("ALPHATABLE");
    mapSTRINGTABLE = lookupVariable("STRINGTABLE");
    mapCOMMENTABLE = lookupVariable("COMMENTABLE");
  }

TABLE = getFIXED(mapTABLE->address);
TEXT = descriptorToAscii(getCHARACTER(mapTEXT->address));

if (*TEXT == 0)
  return fixedToBit(1, 0);
if (TABLE == mapBLANKTABLE->address)
  {
    for (s = TEXT; *s == ' ' || *s == '\t'; s++);
  }
else if (TABLE == mapALPHATABLE->address)
  {
    for (s = TEXT;
         isalnum(*s) || *s == '_' || *s == '@' || *s == '#' || *s == '$';
         s++);
  }
else if (TABLE == mapSTRINGTABLE->address)
  {
    for (s = TEXT; *s && *s != '\''; s++);

  }
else if (TABLE == mapCOMMENTABLE->address)
  {
    for (s = TEXT; *s && *s != '*' && *s != '$'; s++);
  }
else
  abend("Requested translation table for SCAN_FINDS_END_OF not found");
if (*s == 0)
  {
    putFIXED(mapCP->address, s + 1 - TEXT);
    return fixedToBit(1, 0);
  }
putFIXED(mapCP->address, s - TEXT);
return fixedToBit(1, 1);

Now that we actually have a patch file, when XCOM-I again compiles XCOM45, the SCAN_FINDS_END_OF procedure is now translated to the following C code:

/*
  File SCANxSCAN_FINDS_END_OF.c generated by XCOM-I, 2024-05-25 14:05:21.
*/

#include "runtimeC.h"

descriptor_t *
SCANxSCAN_FINDS_END_OF(void)
{

  { // (0) CALL INLINE ( 88, 3, 0,TEXT);
    /*
     * This is a C-language "patch" for CALL INLINE #0 in xcom4.xpl, as compiled
     * by XCOM-I.py.
     *
     * The patch spans the entirety of the `SCAN_FINDS` procedure embedded in
     * the `SCAN` procedure.  This comprises every `CALL INLINE` statement in
     * the entire xcom4 program.
     *
     * What the procedure does is this:  It's given the address of a "translation
     * table" of one of the following types:
     *      `BLANKTABLE`  (Blanks)
     *      `ALPHATABLE`  (Identifier)
     *      `STRINGTABLE` (Quoted string)
     *      `COMMENTABLE` (Comment)
     * Starting at position `CP=0` (already set up upon entry so that we know the
     * first character is of the asked-for type) in the string `TEXT`, searches
     * for the end of the pattern of the selected type, and updates `CP` to the
     * length of the found string.  It returns 0 on failure and 1 on success.  An
     * empty string counts as a failure.
     *
     * This was apparently coded in BAL because the IBM 360 has a translation-table
     * instruction, TRT, that does this efficiently.  But from my perspective,
     * considering that there's no other embedded BAL code in the entire program,
     * it seems as though it's a big price to pay, maintenance-wise and
     * portability-wise, for a pretty small gain in efficiency.  Yes, you can
     * always make assembly-language go faster than a high-level language; this is
     * news?
     */
    
    static memoryMapEntry_t *mapCP = NULL, *mapTEXT = NULL, *mapTABLE,
          *mapBLANKTABLE, *mapALPHATABLE, *mapSTRINGTABLE, *mapCOMMENTABLE;
    char *s, *TEXT;
    uint32_t TABLE;
    
    if (mapCP == NULL)
      {
        mapCP = lookupVariable("CP");
        mapTEXT = lookupVariable("TEXT");
        mapTABLE = lookupVariable("SCANxSCAN_FINDS_END_OFxTABLE");
        mapBLANKTABLE = lookupVariable("BLANKTABLE");
        mapALPHATABLE = lookupVariable("ALPHATABLE");
        mapSTRINGTABLE = lookupVariable("STRINGTABLE");
        mapCOMMENTABLE = lookupVariable("COMMENTABLE");
      }
    
    TABLE = getFIXED(mapTABLE->address);
    TEXT = descriptorToAscii(getCHARACTER(mapTEXT->address));
    
    if (*TEXT == 0)
      return fixedToBit(1, 0);
    if (TABLE == mapBLANKTABLE->address)
      {
        for (s = TEXT; *s == ' ' || *s == '\t'; s++);
      }
    else if (TABLE == mapALPHATABLE->address)
      {
        for (s = TEXT;
             isalnum(*s) || *s == '_' || *s == '@' || *s == '#' || *s == '$';
             s++);
      }
    else if (TABLE == mapSTRINGTABLE->address)
      {
        for (s = TEXT; *s && *s != '\''; s++);
    
      }
    else if (TABLE == mapCOMMENTABLE->address)
      {
        for (s = TEXT; *s && *s != '*' && *s != '$'; s++);
      }
    else
      abend("Requested translation table for SCAN_FINDS_END_OF not found");
    if (*s == 0)
      {
        putFIXED(mapCP->address, s + 1 - TEXT);
        return fixedToBit(1, 0);
      }
    putFIXED(mapCP->address, s - TEXT);
    return fixedToBit(1, 1);
  }
  ; // (1) CALL INLINE ( 27, 2, 2);
  ; // (2) CALL INLINE ( 25, 3, 2);
  ; // (3) CALL INLINE ( 7, 8, 12);
  ; // (4) CALL INLINE ( 141, 2, 0,0,8);
  ; // (5) CALL INLINE ( 136, 3, 0,0,8);
  ; // (6) CALL INLINE ( 88, 1, 0,TABLE);
  ; // (7) CALL INLINE ( 65, 12, 0,TRT);
  ; // (8) CALL INLINE ( 68, 2, 0,12,0);
  ; // (9) CALL INLINE ( 5, 12, 0);
  ; // (10) CALL INLINE ( 71, 7, 0,12,18);
  ; // (11) CALL INLINE ( 24, 1, 2);
  ; // (12) CALL INLINE ( 65, 0, 0,0,1);
  ; // (13) CALL INLINE ( 26, 1, 0);
  ; // (14) CALL INLINE ( 27, 3, 3);
  ; // (15) CALL INLINE ( 71, 15, 0,12,24);
  ; // (16) CALL INLINE ( 27, 1, 3);
  ; // (17) CALL INLINE ( 65, 3, 0,0,1);
  ; // (18) CALL INLINE ( 80, 1, 0,CP);
  return 0;
}

Notice that the patch is automatically placed within a C {...} block, which is important if the CALL INLINE(...) happens to have been preceded by something like IF ... THEN or ELSE.

IBM 360 Inline Code

Let's start with the executive summary:

In this section, we revisit the problem of inline machine code discussed in the preceding section, describing a tool that may be of assistance in writing patch files for IBM 360 inline code. In brief, if you run XCOM-I with the command-line switch --guess=-1, it will create a set of files (guess*.c) that are XCOM-I's best guess of what the patch-files (patch*.c) should be like. (You can also use --guess=N₁,N₂,N₃,... to generate just the files for specific patch numbers, whereas -1 generates every patch.) You can then manually tweak these "guessed" files with (hopefully!) much-reduced effort to get patch-files you're satisfied with. There are three ways in which this guessing facility is known to fail, and thus to require manual fixes:

Only the IBM 360 instructions I've actually encountered in legacy XPL programs (specifically, XCOM3, XCOM4, and HAL/S-FC) are supported. At this writing, that's about 10% of the full IBM 360 instruction set.
Self-modifying code, or pseudo-self-modifying code due to the use of the IBM 360 instruction EX, will not work properly. Translations of the EX instruction are specifically detected and marked with a comment of "***FIXME***". In my experience, most manual tweaks are related to these EX instructions, and specifically to the use of EX to execute modified MVC instructions; this is covered in more detail later.
Code which accesses or modifies CPU registers associated with subroutine linkage requires attention, because XCOM-I does not set up these registers upon entry to a procedure, nor does it use the contents of these registers upon exit from a procedure. Or more succinctly, C translations of XPL procedures do not follow IBM 360 calling conventions. This is a rare problem in my experience.

That's the summary, which may be everything you actually need to know. Here's the longer explanation.

Having now gone through the process of manually writing many patch files for IBM 360 inline code, I've come to realize that I had overestimated the difficulty of translating that inline code to C code. With experience, I've now observed that:

The number of different IBM 360 instructions used in these is very limited, with just a handful of mostly-simple instructions making up the bulk of them.
All those instructions do is to read and write to memory and to a subset of the CPU's general registers, floating-point registers, and condition-code register.
And, for the particular CPU registers touched by the inline code in legacy XPL or XPL/I programs, you can mostly assume that reading or writing them has no side effects.

As part of a systematic framework for exploiting this, the XCOM-I runtime library defines the IBM 360 CPU registers as the following global variables:

int32_t GR[16]; // CPU general-purpose registers.
double FR[16];  // CPU floating-point registers.
uint8_t CC; // CPU condition code register. Only the least two bits are significant.

Aside: If unlike me you're actually familiar with IBM 360 programming, I'm sure you'll call the CPU's general-purpose registers R0, R1, R2, and so on. IBM documentation calls them that. It also calls the floating-point registers, which are completely separate and independent of the general-purpose registers, by those very same names: R0, R1, R2, .... Perhaps it's just me, but I found that confusing. And because I was so stupid as to be confused by this apparently obvious and trivial point, it has cost me a considerable amount of time. Consequently, I now make a point of calling the general-purpose registers GRn and the floating-point registers FRn. Yes, it's a departure from tradition. No, in this particular case I don't care that I've departed from tradition. A foolish consistency is the hobgoblin of little minds.

Meanwhile, memory can always be accessed and/or converted from IBM 360 formats to native C formats, also via XCOM-I runtime-library features:

uint8_t memory[0x1000000] — for reading and writing 8-bit bytes from/to memory.
int32_t COREWORD(uint32_t address) and void COREWORD2(uint32_t address, int32_t value) — for reading and writing 32-bit words from/to memory, respectively.
int16_t COREHALFWORD(uint32_t address) and void COREHALFWORD2(uint32_t address, int16_t value) — for reading and writing 16-bit half-words from/to memory, respectively.
double fromFloatIBM(uint32_t msw, uint32_t lsw) and void toFloatIBM(uint32_t *msw, uint32_t *lsw, double value) — for converting pairs of 32-bit words to/from C double-precision floats, respectively.
void setCC(void) and void setCCd(void) — for adjusting the condition-code register after integer operations or floating-point operations, respectively.
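Since the data in memory is in IBM System/360 format, multi-byte values are stored big-endian, most-significant byte first. The sketch below shows what word accessors of this kind have to do on a little-endian host. The names demoMemory, readWord, and writeWord are stand-ins invented for this illustration and are not the runtime library's COREWORD and COREWORD2 themselves.

```c
#include <stdint.h>

/* Small stand-in for the runtime library's much larger `memory` array. */
uint8_t demoMemory[16];

/*
 * Hypothetical equivalent of a COREWORD-style read: assemble a 32-bit word
 * from four bytes stored most-significant byte first.
 */
int32_t
readWord(uint32_t address)
{
  uint32_t w = ((uint32_t) demoMemory[address] << 24)
             | ((uint32_t) demoMemory[address + 1] << 16)
             | ((uint32_t) demoMemory[address + 2] << 8)
             | (uint32_t) demoMemory[address + 3];
  return (int32_t) w;  /* Assumes the usual 2's-complement conversion. */
}

/* Hypothetical equivalent of a COREWORD2-style write. */
void
writeWord(uint32_t address, int32_t value)
{
  uint32_t w = (uint32_t) value;
  demoMemory[address] = (uint8_t) (w >> 24);
  demoMemory[address + 1] = (uint8_t) (w >> 16);
  demoMemory[address + 2] = (uint8_t) (w >> 8);
  demoMemory[address + 3] = (uint8_t) w;
}
```

Writing byte by byte like this, rather than copying a native int32_t into the array, is what keeps the stored layout independent of the byte order of the host computer.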

Additional globals are provided by the runtime library for holding commonly-needed temporary results:

int32_t address360A, address360B, dsw360, lsw360;
int64_t scratch;   // Results of integer operations; input to setCC (see above).
double scratchd;   // Results of floating-point operations; input to setCCd (see above).
int64_t dummy360;  // Use for anything you like.
double dummy360d;  // Use for anything you like.
double epsilon360; // A small number, used for tolerance in some floating-point comparisons.
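As an illustration of how the register, scratch, and condition-code globals fit together, here is a sketch of one way the register-to-register add AR (opcode 0x1A, as in CALL INLINE("1A", 1, 0) from the XCOM45 example) might be rendered in C. Everything prefixed demo is a local stand-in defined just for this example, and demoSetCC is only an assumption about what a setCC-like helper does, based on the IBM 360's documented condition codes for AR (0 for zero, 1 for negative, 2 for positive, 3 for overflow). The runtime library's actual code may differ.

```c
#include <stdint.h>

/* Local stand-ins for the runtime-library globals GR, CC, and scratch. */
int32_t demoGR[16];
uint8_t demoCC;
int64_t demoScratch;

/* Assumed behavior of a setCC-like helper after an integer operation. */
void
demoSetCC(void)
{
  if (demoScratch > INT32_MAX || demoScratch < INT32_MIN)
    demoCC = 3;  /* Overflow: result doesn't fit in 32 bits. */
  else if (demoScratch == 0)
    demoCC = 0;
  else if (demoScratch < 0)
    demoCC = 1;
  else
    demoCC = 2;
}

/* One possible C rendering of AR r1,r2: GR[r1] = GR[r1] + GR[r2]. */
void
demoAR(int r1, int r2)
{
  /* Add in 64 bits so that a 32-bit overflow is detectable. */
  demoScratch = (int64_t) demoGR[r1] + (int64_t) demoGR[r2];
  demoSetCC();
  /* Keep only the low 32 bits, as the hardware would. */
  demoGR[r1] = (int32_t) (uint32_t) demoScratch;
}
```

Holding the result in a 64-bit scratch value is what makes the overflow case visible; the narrowing back to 32 bits happens only after the condition code has been derived.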

For floating-point operations, native C formats and arithmetical operators are used. Conversion from/to IBM hexadecimal format occurs only when floating-point values are read from the memory array or written into the memory array. This is why the CPU's floating-point registers are modeled as the C double type. The alternative would have been to model them as pairs of uint32_t values, and to provide C functions that implemented exact IBM 360 arithmetic (like addition, subtraction, and so on) directly upon such pairs. I admit that the wisdom of my decision is arguable, but have no present plans to change the implementation.

Aside: Actually, there is one case in which this decision causes problems, and that involves the IBM 360 instructions which perform "unnormalized" floating-point arithmetic. Recall that floating-point numbers have two parts, the exponent and the significand. For example, in 3.14156×10¹⁰, the exponent is 10 and the significand is 3.14156. Almost all floating-point arithmetic on the IBM 360 is in "normalized" form, which means that the exponent and significand are adjusted so as to produce the maximum precision (number of bits) in the significand, given the number of bytes of storage allocated for the number in memory and the nature of the format used for storage. Thus in principle, 3.14156×10¹⁰ is more precise than 0.03142×10¹², and is less precise than 314.15654×10⁸. In the case of IBM hexadecimal floating-point, the exponent field is 7 bits wide and represents a power of 16 (rather than 10), and is offset by 64. I.e., if the value of the exponent field is (say) E, then the power represented is 16^(E-64), so the powers range from 16⁻⁶⁴ through 16⁺⁶³. The significand, meanwhile, is 56 bits wide and always non-negative, since there's a separate sign bit. So in principle, an IBM hexadecimal double-precision floating-point number could represent exact positive or negative 56-bit integers. That'll be important in a minute.
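For readers who prefer code to prose, the format can be sketched as a decoder. This assumes the standard IBM hexadecimal double-precision layout (1 sign bit, a 7-bit excess-64 exponent of 16, and a 56-bit fraction). The function name ibmHexToDouble is made up for this example; the runtime library's own fromFloatIBM serves this purpose in practice, and its details may differ.

```c
#include <stdint.h>

/*
 * Illustrative decoder for an IBM hexadecimal double held in two 32-bit
 * words.  The 56-bit fraction is treated as an integer, so that
 *     value = fraction * 16^(E - 64 - 14)
 * where E is the exponent field and 16^-14 = 2^-56 scales the fraction.
 */
double
ibmHexToDouble(uint32_t msw, uint32_t lsw)
{
  int sign = (int) ((msw >> 31) & 1);
  int power = (int) ((msw >> 24) & 0x7F) - 64 - 14;
  uint64_t fraction = ((uint64_t) (msw & 0x00FFFFFF) << 32) | lsw;
  /* A C double has only 53 significand bits, so 56-bit fractions with
     nonzero low-order bits are rounded here. */
  double value = (double) fraction;
  for (int e = power; e > 0; e--)
    value *= 16.0;  /* Scaling by powers of 2 is exact. */
  for (int e = power; e < 0; e++)
    value /= 16.0;
  return sign ? -value : value;
}
```

Note in particular that a word pair whose exponent field is 0x4E has a scale factor of exactly 1 in this sketch, so that its fraction field reads directly as an integer.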

Supplementing the instructions for normalized arithmetic, the IBM 360 includes instructions for both unnormalized addition and subtraction as well, which means they perform the indicated operations, but the results are taken as-is, without any normalization. Whereas the choice to use C double for all arithmetic in the XCOM-I framework means that all arithmetical results have maximum precision all the time.

Doesn't sound like a problem to you? Well, in practice CALL INLINE code uses a single unnormalized instruction, namely AW, which adds two floating-point numbers (normalized or unnormalized), producing an unnormalized result. It always does so in precisely the same manner, in order to perform the trick of converting a positive floating-point number (in IBM hexadecimal format) to a binary integer. In assembly language, that trick works as follows:

Somewhere (in HAL/S-FC, it's in the floating-point working area DW(6) and DW(7), pointed to by the variable ADDR_FIXER) a representation of the number 0.0 is stored as the pair of words 0x4E000000,0x00000000. (Note that there's no unique representation of 0.0 in IBM hexadecimal floating point, because the exponent is indeterminate when the significand is 0, but this particular choice of exponent is important.)

AW is used to perform an unnormalized addition of the ADDR_FIXER representation of 0.0 with a non-negative floating-point number that represents an integer.

The unnormalized result of the addition has an exponent field that's 0x4E and a significand that's a non-negative 56-bit binary integer. Specifically, the least-significant word is the least-significant 32 bits of the integer.

The least-significant word, being the desired FIXED data, is immediately stored someplace via the IBM 360 instruction STD for whatever further use the program has for that integer.

You probably don't care, but the reason this trick works is that IBM floating-point addition functions by unnormalizing the operand with the smaller exponent to match the exponent of the other operand, and then performing a binary addition of the significands. But no 56-bit integer can have an exponent in IBM hexadecimal format that's larger than 0x4E, so choosing the representation of 0.0 to have an exponent of 0x4E guarantees that it's the other operand that becomes unnormalized, and the way in which it's unnormalized is to be shifted rightward by exactly the right number of bits to turn into a binary representation of the integer.
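The alignment step can be simulated directly on the raw words. The following sketch, with the made-up name awToInteger, handles only the case the trick is used for: a non-negative operand whose exponent field does not exceed 0x4E. It illustrates the arithmetic and is not how the XCOM-I runtime library handles AW.

```c
#include <stdint.h>

/*
 * Simulate an unnormalized add of the operand (msw, lsw) to the "zero"
 * 0x4E000000,0x00000000, returning the least-significant word of the
 * result.  Assumes a non-negative operand with exponent field <= 0x4E.
 */
uint32_t
awToInteger(uint32_t msw, uint32_t lsw)
{
  int exponent = (int) ((msw >> 24) & 0x7F);
  uint64_t fraction = ((uint64_t) (msw & 0x00FFFFFF) << 32) | lsw;
  /* Number of hexadecimal digits the operand must be shifted rightward so
     that its exponent matches 0x4E. */
  int shiftDigits = 0x4E - exponent;
  if (shiftDigits < 0)
    shiftDigits = 0;  /* Outside the trick's intended range. */
  if (shiftDigits >= 14)
    fraction = 0;     /* All 14 hexadecimal digits are shifted out. */
  else
    fraction >>= 4 * shiftDigits;
  /* Adding the all-zero significand of the "zero" changes nothing, and no
     normalization follows, so the low word is now the integer. */
  return (uint32_t) fraction;
}
```

For example, 42.0 in normalized form is the word pair 0x422A0000,0x00000000; shifting its fraction rightward by 0x4E - 0x42 = 12 hexadecimal digits leaves 0x2A, i.e., 42, in the low word. Any fractional part is simply truncated in this sketch.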

Regardless of how it works, the reason this causes us a problem is that the native-C arithmetic we perform doesn't track any of this, so the AW is wasted, and doesn't produce an integer usable by the STD instructions that succeed the AW instructions. In other words, after the AW occurs, the CALL INLINE for the STD instruction that's supposed to store the 2's-complement integer has no 2's-complement integer to store!

The workaround for this in the XCOM-I runtime library is that the floating-point registers are supplemented by an additional array that tracks whether each register should be treated as normalized or unnormalized; the value stored in that floating-point register is then interpreted accordingly when an STD instruction is performed that stores the value in the register to memory:
double FR[16];
uint8_t unnormFR[16]; // 0 to treat FRn as normalized, 1 as unnormalized.
Thus an AW instruction on FR[n] sets unnormFR[n], while an STD instruction storing FR[n] resets unnormFR[n].
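As a rough illustration of that flag mechanism (not the actual XCOM-I runtime code), the sketch below shows one way an AW could merely set the flag and a later STD could consult and clear it. The helper names normalized_image, aw_addr_fixer, and std_image are hypothetical, and the sketch assumes the register holds a non-negative, integer-valued number smaller than 2^56.

```c
#include <stdint.h>

double FR[16];
uint8_t unnormFR[16]; /* 0 to treat FRn as normalized, 1 as unnormalized. */

/* Hypothetical helper: normalized IBM long hexadecimal image of a
   non-negative, integer-valued double smaller than 2^56. */
static uint64_t normalized_image(double value)
{
    uint64_t fraction = (uint64_t)value;
    int exponent = 0x4E;
    if (fraction == 0)
        return 0;
    while ((fraction & 0x00F0000000000000ULL) == 0) {
        fraction <<= 4;   /* shift left one hexadecimal digit */
        exponent--;
    }
    return ((uint64_t)exponent << 56) | fraction;
}

/* Sketch of AW with ADDR_FIXER: the numeric value in FR[n] is unchanged,
   so only the flag records that the register is now "unnormalized". */
static void aw_addr_fixer(int n)
{
    unnormFR[n] = 1;
}

/* Sketch of STD: build the 64-bit image to be stored, honoring the flag,
   and then reset the flag. */
static uint64_t std_image(int n)
{
    uint64_t image;
    if (unnormFR[n])
        image = 0x4E00000000000000ULL
              | ((uint64_t)FR[n] & 0x00FFFFFFFFFFFFFFULL);
    else
        image = normalized_image(FR[n]);
    unnormFR[n] = 0;
    return image;
}
```

With FR[2] holding 1000.0, calling aw_addr_fixer(2) and then std_image(2) yields 0x4E000000000003E8, so the least-significant word is the integer 1000. Without the preceding AW, the same STD would yield the normalized image 0x433E800000000000.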

Yes, it's a bit of a hack, and if you were writing new CALL INLINE statements you could easily defeat it, but for right now it seems more economical than reworking the entire floating-point arithmetic system for a handful (fewer than 10) of uses of AW. My advice to you is this: If you're writing new XPL code, don't try to subvert my hack with perverted usage of the AW instruction in CALL INLINE statements; AW (and CALL INLINE, for that matter) is perverted enough already, and doesn't need your help.

Let's get the question of IBM 360 calling conventions out of the way. The conventional usage of the IBM 360 CPU's general registers in calling subroutines, as I understand it, is summarized in the bullet list below. But I've added a couple of apparently non-traditional extensions or differences (the GR3 and GR12 entries) that I've inferred from some of the CALL INLINE statements I've encountered in the legacy code; take them with a grain of salt!

GR0 — Upon exit holds the return value.
GR1 — Upon entry, holds the parameter (if there's only one) or else a pointer to a list of parameters.
GR3 — Upon exit holds the return value (rather than GR0).
GR12 — Upon entry holds the address at the end of the procedure's explicit code (prior to whatever linkage executes the actual return).
GR13 — Upon entry and exit, holds the address of an 18-word save area which holds the state (i.e., the general registers and other information) of the calling code. A procedure doesn't modify the contents of this save area, and must provide a different save area address to any procedures it calls itself.
GR14 — Upon entry, holds the return address.
GR15 — Upon entry, holds the beginning address of the procedure itself.

As for how a translation from IBM 360 machine code to C uses the various concepts described above, consider the very common statement CALL INLINE("58", n, v), where n is an integer 0-15 and v is the name of an XPL variable. This inline has IBM 360 opcode 0x58, which has the assembly-language mnemonic L, and means to load register GR[n] with the contents of variable v. Such an inline can be translated into C as:

GR[n] = COREWORD(ADDR(v));

Or consider CALL INLINE("41", n, v), which has the mnemonic LA, to load the address of the variable into the register:

GR[n] = ADDR(v);

Or CALL INLINE("50", n, v), mnemonic ST, to store the value of a register into a variable:

COREWORD2(ADDR(v), GR[n]);

In the 19-instruction patch used in the preceding section as an example, 6 of the instructions used are already of one or the other of these three types. Other instructions may have much-more-complex implementations in C than these, of course.
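The three translations above can be tried out in isolation with a small mock-up. This is only a sketch under stated assumptions: the memory array, the big-endian layout, the fixed address ADDR_V standing in for ADDR(v), and the function demo_l_la_st are all hypothetical, and the COREWORD and COREWORD2 defined here are simplified stand-ins rather than the XCOM-I runtime versions.

```c
#include <stdint.h>

static uint8_t memory[1 << 16];  /* stand-in for the emulated IBM 360 memory */
static int32_t GR[16];           /* general registers */

enum { ADDR_V = 0x100 };         /* hypothetical address of XPL variable V */

/* Simplified stand-in: fetch a big-endian 32-bit word from memory. */
static int32_t COREWORD(uint32_t address)
{
    return (int32_t)(((uint32_t)memory[address] << 24)
                   | ((uint32_t)memory[address + 1] << 16)
                   | ((uint32_t)memory[address + 2] << 8)
                   |  (uint32_t)memory[address + 3]);
}

/* Simplified stand-in: store a 32-bit word into memory, big-endian. */
static void COREWORD2(uint32_t address, int32_t value)
{
    uint32_t bits = (uint32_t)value;
    memory[address]     = (uint8_t)(bits >> 24);
    memory[address + 1] = (uint8_t)(bits >> 16);
    memory[address + 2] = (uint8_t)(bits >> 8);
    memory[address + 3] = (uint8_t)bits;
}

/* Exercise the three patterns: L, LA, and ST. */
static void demo_l_la_st(void)
{
    COREWORD2(ADDR_V, 1234);      /* give V an initial value */
    GR[1] = COREWORD(ADDR_V);     /* CALL INLINE("58", 1, 0, V);  L  1,V */
    GR[2] = ADDR_V;               /* CALL INLINE("41", 2, 0, V);  LA 2,V */
    GR[1] = GR[1] + 1;            /* some arithmetic on the register */
    COREWORD2(ADDR_V, GR[1]);     /* CALL INLINE("50", 1, 0, V);  ST 1,V */
}
```

After demo_l_la_st() runs, V holds 1235 and GR[2] holds the address 0x100.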

Recognizing that the majority of all CALL INLINE statements — though unfortunately not all of them — can be handled using these simple maneuvers, XCOM-I has a built-in tool that can create guessed versions of patch-files for you, in which most of the INLINEs have been translated, but with the ones that were too difficult for XCOM-I to handle left in the draft files for you to manually perfect.

To use this tool, simply run XCOM-I as normal, using either or neither of the --cond=P or --cond=B options, plus the --guess=... switch as I summarized it earlier. When doing so, in addition to the normal compilation outputs, files named guessN.c, guessNp.c, or guessNb.c are output as well. You can then make manual changes to the guess-files and rename them as patch-files when you're satisfied.

Regarding the effectiveness of the procedure, in the combined source code for the XPL programs XCOM3, XCOM45, and HAL/S-FC, there's a total of 81 blocks of CALL INLINEs, and thus in principle a total of 81 patch files. In generating guess-files for these programs, XCOM-I flags 11 of those guess-files with the comment ***FIXME***, which you can easily search for in order to fix the associated issue. Problems not detectable by --guess, though rare, unfortunately are neither marked nor easy to find or fix.

Aside: Regarding the process of manually fixing up the EX instructions not supported by XCOM-I --guess, it's actually not that hard once you understand the trick involved. Admittedly, right now it will seem simpler to me than to you, given our relative positions on this particular learning curve.

An EX instruction is syntactically of the form
EX	R1,D2(X2,B2)
where R1 is the number of a CPU general register, and D2(X2,B2) points to an IBM 360 machine code instruction. While it looks confusing, it's always easy to figure out which instruction is targeted, because in the XPL source code it will be set off by itself and there will be no way to reach it in C to execute it directly. Sometimes, the XPL source code will contain a binary representation of that target instruction, which makes the task a lot harder for you, but in the XPL I've seen it will almost always be represented as a CALL INLINE, which makes it much easier. The reason it's easier is that XCOM-I --guess will have made a guessed patch file for it, and will have printed the assembly-language syntax and the proposed C translation of it for you in the patch file.

To make what I'm saying concrete, consider an example taken from PASS1 of HAL/S-FC. Here's a small excerpt of the XPL source-code file HALINCL/VMEM.xpl:
         ELSE DO;                                                               00102100
            ADDRTEMP = ADDR(MOVECHAR);                                          00102200
            CALL INLINE("18",0,4);           /* LR 0,4                        */00102300
            CALL INLINE("58",2,0,INTO);      /* L  2,INTO                     */00102400
            CALL INLINE("58",3,0,FROM);      /* L   3,FROM                    */00102500
            CALL INLINE("48",1,0,LEGNTH);    /* LH 1,LEGNTH                   */00102600
            CALL INLINE("06",1,0);           /* BCTR 1,0                      */00102700
            CALL INLINE("58",4,0,ADDRTEMP);  /* L 4,ADDRTEMP                  */00102800
            CALL INLINE("44",1,0,4,0);       /* EX 1,0(0,4)                   */00102900
            CALL INLINE("18",4,0);           /* LR 4,0                        */00103000
            RETURN;                                                             00103100
         END;                                                                   00103200
      END;                                                                      00103300
MOVECHAR:                                                                       00103400
      CALL INLINE("D2",0,0,2,0,3,0);  /* MVC 0(0,2),0(3)                      */00103500
Those blocks of CALL INLINE statements correspond to patches 25 and 32 respectively. Thus XCOM-I --guess produces guessed patch files guess25p.c for lines 00102300-00103000 and guess32p.c for line 00103500. The latter is the target instruction for the EX instruction at line 00102900. It's visually obvious (since it's separated!), but you can tell it from reading the code too: Line 00102200 puts the address of the label MOVECHAR into the variable ADDRTEMP, line 00102800 moves it into the CPU general register GR4, and then at line 00102900 the EX instruction says that the base register GR4 holds the address of the target instruction. Here's how XCOM-I has done its best in guess25p.c to translate the EX instruction,
p25_24: ;
  // (31)       CALL INLINE("44", 2, 0, 3, 0);        /* EX  2,0(0,3)          */         
  address360B = GR[3] + 0;
  // Type RX, p. 7-74:		EX	2,0(0,3)
    // ***FIXME*** Unsupported opcode EX
and in guess32p.c the target instruction,
p32_0: ;
  // (32)       CALL INLINE("D2", 0, 0, 1, 1, 1, 0);  /* MVC 1(0,1),0(1)       */         
  address360A = GR[1] + 1;
  address360B = GR[1] + 0;
  // Type SS, p. 7-83:		MVC	1(0,1),0(1)
  mvc(address360A, address360B, 0);
In fact, in my experience, most EX instructions have targeted MVC instructions, as this example does, so this is great for covering the typical case!

Now, the action of an EX instruction is to modify bits 8-15 of the target machine-code instruction using the low-order byte of the general register the EX instruction specifies, and then to execute that modified target instruction (without actually jumping to it). Strictly speaking, the register byte is ORed into bits 8-15, which amounts to replacing them whenever the target instruction holds zeroes there, as it does in this example. All you have to figure out is what in the C translation of the target instruction corresponds to bits 8-15 of the machine code. It will usually be something simple. For a target MVC instruction, I can tell you that it's just the final parameter of the mvc(...) function in the C translation. Thus, the change needed to the guessed file guess25p.c, in which the "***FIXME***" is replaced by actually-executable code, is this:
p25_24: ;
  // (31)       CALL INLINE("44", 2, 0, 3, 0);        /* EX  2,0(0,3)          */         
  address360B = GR[3] + 0;
  // Type RX, p. 7-74:		EX	2,0(0,3)
  address360A = GR[1] + 1;
  address360B = GR[1] + 0;
  // Type SS, p. 7-83:		MVC	1(0,1),0(1)
  mvc(address360A, address360B, GR[2]);
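The EX-plus-MVC combination can also be sketched in isolation. On the IBM 360, EX ORs the low-order byte of register R1 into bits 8-15 of the target instruction (or leaves the target unmodified when R1 is 0), and for MVC those bits are the length code, meaning one less than the number of bytes moved. The code below is only an illustrative mock-up: the memory array and the functions mvc_sketch and ex_mvc_sketch are hypothetical and are not the mvc(...) function of the XCOM-I runtime library, whose exact behavior I'm not assuming here.

```c
#include <stdint.h>
#include <string.h>

static uint8_t memory[1 << 16];  /* stand-in for the emulated IBM 360 memory */
static int32_t GR[16];           /* general registers */

/* Sketch of MVC: move (length_code + 1) bytes, one at a time, left to right. */
static void mvc_sketch(uint32_t dest, uint32_t src, uint8_t length_code)
{
    for (uint32_t i = 0; i <= length_code; i++)
        memory[dest + i] = memory[src + i];
}

/* Sketch of EX r1 aimed at an MVC whose own length code is
   target_length_code: the low byte of GR[r1] is ORed into the length
   code before the MVC executes.  When r1 is 0, nothing is modified. */
static void ex_mvc_sketch(int r1, uint32_t dest, uint32_t src,
                          uint8_t target_length_code)
{
    uint8_t modifier = (r1 == 0) ? 0 : (uint8_t)(GR[r1] & 0xFF);
    mvc_sketch(dest, src, (uint8_t)(target_length_code | modifier));
}
```

For example, with the five bytes "HELLO" at address 0x200 and GR[2] set to 4, ex_mvc_sketch(2, 0x300, 0x200, 0) copies exactly those five bytes to address 0x300.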

Not perfect, of course, but not bad either. By my reckoning, close to 90% of the guessed files seem to be perfect as-is. And of the files which are not perfect, probably around 90% of the CALL INLINEs translated in those files are still perfect.

Aside: As an example where XCOM-I fails to produce a perfect guessed result not due to an EX instruction, I can point to the single block of CALL INLINEs in the legacy program XCOM45. A number of its instructions involve use of the seemingly-uninitialized register GR12, as well as usage of the register GR3, that make sense (to me!) only if registers GR3 and GR12 are viewed as having a non-traditional conventional role in the subroutine linkage for calling their parent procedure (namely, SCAN_FINDS_END_OF): i.e., GR3 contains the return value of the procedure (rather than GR0 as documented by IBM), and GR12 contains the size of the procedure's code (other than the linkage code), in bytes.

As another example, I'd point to the one case I know of (in the INTEGER_VALUED procedure of program HAL/S-FC) in which a RETURN INLINE statement is used rather than a CALL INLINE statement. The INLINE XPL built-in is not specified in such a way as to return a value, but the calling procedure (INTEGER_VALUE) does check the returned value. Moreover, the INLINEs in INTEGER_VALUED do in fact manipulate the GR3 register in such a way as to contain either a 0 or a 1, without there being any other evident reason to do so, so it's easy to suppose once again that GR3 is indeed the return value.

This page is available under the Creative Commons No Rights Reserved License
Last modified by Ronald Burkey on 2024-08-13

Table of Contents

Introduction to XPL/I

Processing a Program Written in XPL, Using XCOM-I

Installation of XCOM-I

Linux or Mac

Windows

Testing the Installation

Compiling and Running XPL Programs

Emulating the IBM 360 CPU

Insane Bootstrapping!

Step 1: Compiling XCOMx source code with XCOM-I to produce XCOMx-native

Step 2: Compiling XCOMx source code with XCOMx-native to produce XCOMx-360

Step 3: Compiling XCOMx source code with XCOMx-360 under sim360 to produce XCOMx-360A

Step 4: Bonus Cross-Comparison of Files Built by XCOMx-360.obj Versus Files Built by Legacy XCOMx Executables

HAL/S-FC Through the Lens of XCOM-I

Compiling HAL/S-FC Itself

Using PASS1 to Process Programs Written in HAL/S

Correctness of PASS1

Processing with FLOWGEN

A Primer for Standard XPL and Intermetrics XPL/I

The Basics

Character Set and Modern Character Substitutions

Case Sensitivity

Source-Code Formatting

Identifiers

Datatypes, Declarations, and Literals

The Basic XPL Datatypes

XPL/I ARRAY and BASED Data ... and an Exception

LITERALLY and Macros

Perplexing Multiple Assignments

Logical Expressions

COMMON Memory

Memory Model for a Compiled XPL Program

Understanding COMPACTIFY

Structure of an XPL Program vs XPL/I

Compiler Directive Type: /?c ... XPL/I source code ... ?/

Compiler Directive Type: /%INCLUDE module %/

Compiler Directive Type: /* ...comment... $%module */

Compiler Directive Type: /**MERGE module procedure */

PROCEDUREs, RETURNs, and Their Peculiarities

Blocks and Loops

ESCAPE and REPEAT

Counter Value After Normal Loop Termination

Program Options

Built-In Runtime-Library Functions

The MONITOR Built-In Runtime-Library Function

Debugging XPL Programs

Patches for Insertion of Arbitrary Inline Code

IBM 360 Inline Code
