Aside: And in case you're wondering, there is no known surviving copy of Intermetrics's own XPL/I compiler. In fact, there is reason to believe that it was intentionally destroyed. (Prove me wrong, please!) If it had survived, however, it would presumably have been written in standard XPL and itself compiled by McKeeman's XPL compiler (which was called XCOM) or some later variant thereof.
Even if you actually had a perfect compiler for standard XPL that you could use on (say) a Windows, Mac OS, or Linux computer, it would most likely neither compile most XPL/I programs, nor would most of those compiled programs run correctly afterwards if you were able to do so. But frankly, even having such a hypothetical perfect compiler for standard XPL is somewhat of a trick, due to the dependence on the underlying IBM 360 that's non-existent in Windows, Mac OS, and Linux.
declare x fixed, y fixed, z fixed;
x(0) = 1;
x(1) = 2;
x(2) = 3;
output = x(0) || ' ' || x(1) || ' ' || x(2);
output = x || ' ' || y || ' ' || z;
What is this little example supposed to do? First, it declares 3 variables of the fixed datatype; fixed means that x, y, and z are 32-bit signed integers. It then proceeds to use x as if it were a three-element array, though x wasn't declared that way, assigning values to each of its elements. The program simply assumes that x(0) corresponds to x, x(1) corresponds to y, and x(2) corresponds to z. Finally, the program "prints out" both the elements of x (as an array) and the values of x (as a non-array), y, and z. Naturally, those two lines of printout are identical. In case you're wondering, we could have used z(-1) as an equivalent to y, or z(-2) as x. For that matter, we could have used this feature to read or write areas of memory in which no variables at all had been declared. Or to a certain extent, if we had declared variables of some other datatype than FIXED, we could have used this same trick to access them as if they were FIXED.
Aside: I say that subscripting sloppiness is "undocumented", but you could find passing hints of it in McKeeman if you already knew about this feature by other means. True, McKeeman does say that array bounds are respected, which is false, but it also says on p. 137: "Assignments to subscripted variables are not checked against the array bounds; thus every memory location is accessible through subscripting." And if you pore through every line of XPL source code in the book, you'll certainly come across examples of subscripting sloppiness. Is that "documentation" of such a major feature? You be the judge.
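The aliasing described above can be modeled in a few lines of Python. This is only a toy sketch, not how XCOM-I or any XPL compiler is actually implemented: the flat memory list, the address table, and the helper names store and load are all hypothetical, and the sketch assumes the three FIXED variables occupy consecutive 32-bit words.

```python
# Toy model of XPL's unchecked subscripting (an illustration only).
# Assumption: variables are laid out contiguously, one word per FIXED.

memory = [0] * 8                      # flat "memory" of 32-bit words
address = {"x": 0, "y": 1, "z": 2}    # hypothetical symbol table: name -> word address


def store(name, index, value):
    """Emulate `name(index) = value` with no bounds check."""
    memory[address[name] + index] = value


def load(name, index=0):
    """Emulate reading `name(index)`; a bare `name` is treated as `name(0)`."""
    return memory[address[name] + index]


store("x", 0, 1)
store("x", 1, 2)   # lands in y
store("x", 2, 3)   # lands in z

line1 = " ".join(str(load("x", i)) for i in range(3))
line2 = " ".join(str(load(n)) for n in ("x", "y", "z"))
print(line1)
print(line2)
```

In the same model, load("z", -1) returns the value of y, mirroring the z(-1) remark in the text.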
Aside: Daniel Weaver has written an XPL-to-C translator. The first thing anyone suggests to me when this topic is discussed is "Why don't you just use it?" The subscripting sloppiness inherent in actual usage of XPL is one good reason. As you might expect, since there's no reasonable way to deduce from the documentation that subscripting sloppiness exists, there's reason to doubt that an XPL compiler relying on the accuracy of McKeeman would support it. Here's a fun printout of the very-sensible error messages you get by running the sample code above through Dan's translator:
XPL to C language translator -- version 1.1
2 |x(0) = 1; |
|
*** Error, Identifier is not an array (detected at line 5741 in xcom). ***
3 |x(1) = 2; |
|
*** Error, Identifier is not an array (detected at line 5741 in xcom). ***
*** Last previous error was detected on line 2. ***
4 |x(2) = 3; |
|
*** Error, Identifier is not an array (detected at line 5741 in xcom). ***
*** Last previous error was detected on line 3. ***
5 |output = x(0) || ' ' || x(1) || ' ' || x(2); |
|
*** Error, Identifier is not an array (detected at line 5741 in xcom). ***
*** Last previous error was detected on line 4. ***
5 |output = x(0) || ' ' || x(1) || ' ' || x(2); |
|
*** Error, Identifier is not an array (detected at line 5741 in xcom). ***
*** Last previous error was detected on line 5. ***
5 |output = x(0) || ' ' || x(1) || ' ' || x(2); |
|
*** Error, Identifier is not an array (detected at line 5741 in xcom). ***
*** Last previous error was detected on line 5. ***
6 cards containing 7 statements were compiled.
6 errors (0 severe) were detected.
The last detected error was on line 5.
Aside: Daniel has also pointed out that aside from his own compiler, Dave Bodenstab wrote an XPL compiler for FreeBSD, which Daniel himself has ported to Linux. I have not tried it as of this writing.
From my whining above about the quality (from a user's point of view) of the documentation of the XPL language, you'll probably recognize that it's hard to make a full list of all the difficulties of XPL or all of the differences between XPL and XPL/I. And just as McKeeman is not a full and accurate description of XPL, neither is the surviving Intermetrics documentation a full description of XPL/I. And confusingly for the present discussion, Intermetrics did not use an explicit term (like "XPL/I") to refer to their language. Instead they continued to refer to it simply as "XPL", making it hard to be entirely sure what's supposed to be different between the two languages and what's not.
Aside: The other suggestion I receive with a fair degree of regularity is, "Why not just run all of this IBM 360 software in a simulator like Hercules, and be done with it? Problem solved!" Well, for one thing, we don't have Intermetrics's XPL/I compiler, so we can't run it on an IBM 360 simulation. Lacking that, we can't build Intermetrics's HAL/S compiler in such a way that it could be run on an IBM 360 simulation. But if we had the HAL/S compiler and it could compile the PASS flight software to IBM 360 object code, why don't we just run that on an IBM 360 simulator? Well ... tell me how to integrate Hercules into a spacecraft simulator like Orbiter? And tell me how to motivate any of you to learn how to use Hercules in addition to the already very steep learning curve you face with the flight software? If you can answer those questions for me, I'll admit you've got a pretty good idea! With that said, there's definitely merit in emulating an IBM 360, if not necessarily emulating a full IBM 360 system. I'll come back to this point later.
Aside: In case you're wondering, XCOM-I does translate the sample XPL/I program given above to C without error; the C program it creates also compiles without errors; and if you run the compiled C program you get what you might expect:
PAGE 1
1 2 3
1 2 3
There's one final complication: In legacy XPL or XPL/I source code, it was typical to intersperse the XPL or XPL/I (thankfully, sparingly!) with code written in IBM System/360 Basic Assembly Language (BAL). Several sections below discuss this topic.
Despite the title of this section, there's actually little difference between how to build and run XPL/I programs versus XPL programs. Once you know how to build and run an XPL program, you'll automatically know how to build and run an XPL/I program ... mostly. There's an extra command-line switch you can use to specify that you really, truly want XPL rather than XPL/I, namely --xpl, but it usually doesn't matter much if you leave it out.
With that said, the XPL/I source code available to us dwarfs the source code we have for standard XPL programs. Moreover, those XPL/I programs are much larger and more complex than the XPL programs. This means that there are differences in how the source code for the two is organized and maintained, and in that sense there are differences in how to deal with the two. That's why I concentrate at first on compiling standard XPL programs, and defer some discussion of XPL/I compilation until later.
Or more precisely, "installation of XCOM-I and other stuff needed to
follow the instructions given below".
pip3 install ebcdic, or possibly pip install ebcdic.
pacman -S git diffutils gcc make python3 python3-pip vim
pip3 install ebcdic
git clone --depth=1 https://github.com/virtualagc/virtualagc.git
vim ~/.bashrc; if not, you can use another text editor that you're more familiar with, but outside of the Msys2 environment you'll have to know where to find .bashrc to edit it. With the default options in the Msys2 installation, I found it at c:\msys64\home\username\.bashrc and edited it with notepad. What you do is insert the line beginning with PATH= shown below:
.
.
.
# User dependent .bashrc file
# If not running interactively, don't do anything
[[ "$-" != *i* ]] && return
PATH=$PATH:~/virtualagc/XCOM-I:~/virtualagc/yaShuttle/ported/PASS1.PROCS
# Shell Options
.
.
.
The option --pp for "pretty-printing" the C code output by XCOM-I no longer works.
Aside: Actually, you can make --pp work, but I didn't suggest it above because it's such massive overkill. The --pp option relies on the program clang-format, which you can get by installing the entire clang/llvm compiler infrastructure. Even I wouldn't bother. On the other hand, if you do, you'd also have the clang compiler to choose from as an alternative to gcc. Note that you'll also have to add it to the PATH, as in step #6 above. Instead of the line shown in step #6, insert this line into .bashrc:
PATH=$PATH:~/virtualagc/XCOM-I:~/virtualagc/yaShuttle/ported/PASS1.PROCS:/c/"Program Files"/LLVM/bin
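After editing PATH, it can be handy to confirm that the tools are actually reachable before going further. The following is a minimal sketch of such a check, not something shipped with XCOM-I: the function name missing_tools and the tool list are my own assumptions, drawn from the package names in the installation steps, and your system may name some of them differently (python rather than python3, for instance).

```python
import shutil

# Tool names taken from the installation steps; treat the list as an assumption.
REQUIRED_TOOLS = ["git", "gcc", "make", "python3", "XCOM-I.py"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found on PATH.")
```

The which parameter exists only so the lookup can be swapped out when experimenting; in normal use the default shutil.which is what you want.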
Following the instructions on the remainder of this page essentially
tests the installation. However, in brief, what I do to verify
operation is:
cd virtualagc/XCOM-I
make -f Makefile-regression && make
cd ../yaShuttle/"Source Code"/PASS.REL32V0
make -s
make -s regression
Remember to use the path to where you installed the Virtual AGC source tree, in place of virtualagc.
Messages will be printed that give you a pretty clear idea as to success or failure. What's tested is:
Aside: You'd also be very mistaken to imagine that XCOM-I itself is user-friendly. I fear that there will be a lot of work involved — work which I may not even have the energy to perform — before the error messages XCOM-I spits out upon occasion can be mistaken for anything other than mere spit. Until then, try not to make any errors in your XPL programming.
Aside: Throughout this discussion, I use the Linux/Mac convention that the symbol '/' is used to separate the components of a filename and the path to the folder containing it. Windows uses the separator '\' instead, so in some places you may find that you need to replace '/' by '\'.
For the sake of discussion, suppose we wish to compile and run the sample program called Example-6.18.6.xpl. The first step is to use XCOM-I to translate Example-6.18.6.xpl into C source code:
cd XCOM-I
XCOM-I.py Tests/Example-6.18.6.xpl
This will create a sub-folder called Example-6.18.6/ of the current working folder, and put all of the files of the C translation into that folder.
Aside: XCOM-I.py has various allowable command-line options, though none were needed in the invocation shown above. You can see a list of XCOM-I.py's command-line options with the command "XCOM-I.py --help".
I won't bore you with a lengthy description of the C files output by XCOM-I, but it might be mildly instructive to glance briefly at a couple of them. For this example, the principal outputs are the files RANDOM.c and main.c, and here you can see a comparison of the original XPL file (left) to its C translation (right):
/* This is example XPL program 6.18.6 from McKeeman p. 157. |
/* |
make -C Example-6.18.6
This produces an executable program within the Example-6.18.6/ folder that's also called Example-6.18.6.
Aside: On my system, which is a pretty-normal Linux system with gcc as the C compiler, the instructions I give you succeed basically without any compiler warnings. Whereas with clang as the C compiler there can be quite a few compiler warnings, and on a Mac OS system there may be an annoyingly-large number of compiler warnings ... all of which are nuisances without value, as far as I can tell, not to mention concerning things you can do nothing about. If that's what happens for you, I'd suggest turning off all compiler warnings by adding "EXTRA=-w" to all of the make commands, as in:
make EXTRA=-w -C Example-6.18.6
Important note: make will be fooled (badly, compiling multiple times with the wrong target names!) if the path to your current working directory contains any spaces. You can help make out in that case by explicitly specifying the target name:
make -C Example-6.18.6 Example-6.18.6
But the world being what it is, you most likely don't have a setup like mine, so some alterations in the instructions above may be needed in your situation. For one thing, your C compiler may not be gcc. Actually, the default assumption isn't that the compiler is gcc, but rather that the shell's environment variable CC holds the name of the compiler. But if that's not true, or if you're not using gcc, then the command for make will differ a little. For example, suppose you use clang instead of gcc. Just change the build commands to read:
cd XCOM-I
make -C Example-6.18.6 CC=clang
Or if you're on Windows (and other than myself who isn't, really?) then you'd want to add the filename extension ".exe" to your executable. That could be done like so:
cd XCOM-I
make -C Example-6.18.6 TARGET=Example-6.18.6.exe
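If you script your builds, the make variations above can be collected in one place. Here is a small sketch that assembles the argument list; the function name make_command and its parameters are hypothetical, and it uses only the variables already mentioned (CC, TARGET, EXTRA) plus the explicit target name recommended for paths containing spaces.

```python
def make_command(folder, cc=None, target=None, extra=None, explicit_goal=False):
    """Build the argument list for `make -C folder ...`.

    Only the variables mentioned in the text are used: CC, TARGET, and EXTRA.
    """
    cmd = ["make", "-C", folder]
    if cc:
        cmd.append(f"CC={cc}")
    if target:
        cmd.append(f"TARGET={target}")
    if extra:
        cmd.append(f"EXTRA={extra}")
    if explicit_goal:
        # Name the target explicitly, as suggested for working directories
        # whose path contains spaces. Assumption: the goal is the folder name,
        # as in the example `make -C Example-6.18.6 Example-6.18.6`.
        cmd.append(folder)
    return cmd


print(" ".join(make_command("Example-6.18.6", cc="clang", extra="-w")))
```

The resulting list can be handed to subprocess.run(..., check=True) if you'd rather launch the build from Python than type it by hand.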
Aside: The Makefile itself lists other possible alterations, any or all of which can be used in combination with the others.
Finally, you can now run the now-fully-compiled program:
Example-6.18.6/Example-6.18.6
What this particular sample program does is just to print 100 random numbers, so that's what should happen. (I say they're "random", but the seed for the random-number generator is hard-coded and always the same as 1, so you'll get the same 100 numbers every time you run the program.)
Aside: Just like XCOM-I, the compiled application also has a variety of command-line options that may affect how it runs. None of them are really applicable to this particular sample program, but you could see them with the command "Example-6.18.6/Example-6.18.6 --help".
If you want to see a meatier standard XPL example, you can look instead at McKeeman's ANALYZER program. If you happen to have a copy of A Compiler Generator (see Section 7.1, Chapter 10, and Appendix 6), it provides reasonably-detailed operating instructions and theory of operation for ANALYZER. Basically, ANALYZER is used for analyzing a given BNF grammar and printing a report about it. Thus McKeeman recommends ANALYZER for developing BNF grammars and debugging them. There's an associated data file as well, SKELETON.bnf, which contains the BNF description for a very simple language (called "SKELETON", of course) that's the basis for some of the discussion in the book. The steps for compiling and running ANALYZER with SKELETON.bnf as input are just what you'd expect from the discussion earlier:
cd XCOM-I
XCOM-I.py --xpl Tests/ANALYZER.xpl
make -C ANALYZER
ANALYZER/ANALYZER <Tests/SKELETON.bnf
If you're interested, you can compare the report on the SKELETON grammar produced by the steps above, versus the report printed in Figure 7.1.6 of McKeeman in 1969. You will notice slight differences between the two, but not (I think) anything substantive. Remember that the report printed in the book would have been manually typeset, always with the possibility of human error, and with editorial decisions having been made to affect the aesthetics and publishing cost, possibly to the detriment of literal 100% accuracy.
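The translate, compile, and run sequence follows the same pattern for every program shown so far, so it can be sketched generically. The helper below is hypothetical (it's not part of XCOM-I), and it assumes that the output folder and the executable are both named after the source file's base name, which is what the Example-6.18.6 and ANALYZER examples suggest.

```python
from pathlib import Path


def build_steps(xpl_file, standard_xpl=False):
    """Return the three commands (translate, compile, run) as argument lists."""
    name = Path(xpl_file).stem          # e.g. "ANALYZER" from "Tests/ANALYZER.xpl"
    translate = ["XCOM-I.py"]
    if standard_xpl:
        translate.append("--xpl")       # request standard XPL rather than XPL/I
    translate.append(xpl_file)
    return [translate, ["make", "-C", name], [f"{name}/{name}"]]


for step in build_steps("Tests/ANALYZER.xpl", standard_xpl=True):
    print(" ".join(step))
```

Input redirection (such as <Tests/SKELETON.bnf) is deliberately left out; that belongs to whatever runs the last command.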
Aside: Reducing discussion of ANALYZER to just whether or not it works the same today as it did back in 1970 is doing it an injustice, because ANALYZER has interesting capabilities in its own right. If you look at the reports I mentioned above, you'll notice that one thing ANALYZER includes in these reports is a large section consisting of XPL DECLARE statements. These DECLARE statements, if plugged into the template XPL code provided elsewhere in McKeeman, are what's needed to create a compiler (written in XPL, of course) for the grammar being analyzed. In other words, the title of the book (A Compiler Generator) isn't a misnomer. This XPL code in the report isn't incredibly useful as-is, because it's formatted in a manner that's not immediately compilable. However, ANALYZER also allows you to "punch" separate punch-cards that do contain immediately-compilable XPL. More on that in a moment.
One of the legacy XPL programs provided for demonstration purposes is actually a compiler for the SKELETON language, though calling it a "compiler" is a bit of a stretch, since all it does is analyze statements in the SKELETON language for correctness, and doesn't produce any object code. The idea behind the SKELETON program was that since it's such a small program (~300 lines of XPL), it would be easy to experiment with. The SKELETON language itself is very bare-bones, in that it only has assignment statements in it, and those assignment statements have the form IDENTIFIER = EXPRESSION;.
Identifiers follow the same rules as they do in XPL, except that lower-case letters aren't supported. There are no declarations, no arrays, no strings, and "expressions" are basically whatever normal arithmetical expressions you can form from just identifiers, literal decimal numbers, and the tokens + - * / ( and ). Anyway, you can compile and run it like so:
cd XCOM-I
XCOM-I.py --xpl Tests/SKELETON.xpl
make -C SKELETON
SKELETON/SKELETON
SKELETON being relatively uninteresting, the Tests/ folder also contains a similar but more-complex BNF grammar for XPL itself. Analyzing the XPL grammar with ANALYZER is very slightly more complex than analyzing the SKELETON grammar was. That's because the XPL grammar file used as input for ANALYZER contains a directive which the SKELETON grammar did not. The culprit is $PUNCH, which directs ANALYZER to output the XPL code it generates to a "punch device". Technically, that means a file has to be "attached" to the punch device, or else the data has nowhere to go. But by default ANALYZER doesn't attach any file to the punch device. On an IBM System/360, this would have been handled by means of separately-provided Job Control Language (JCL), and in particular by DD statements in the JCL. Naturally, we have no JCL in our XCOM-I based system, but the equivalent to JCL would be ANALYZER's command-line switches. The command-line switches for simulating DD statements are --ddi and --ddo.
Putting that all together, we can analyze the XPL grammar, with the punch device attached to a file called PUNCH.txt, using the following command:
ANALYZER/ANALYZER --ddo=2,PUNCH.txt <Tests/XPL.bnf
Or since ANALYZER accepts the input grammar it analyzes on "device 0" (attached by default to stdin, which is why we usually pipe in the input via <), we could even run it instead as:
ANALYZER/ANALYZER --ddi=0,Tests/XPL.bnf --ddo=2,PUNCH.txt <Tests/XPL.bnf
We'll come to a still meatier example in a moment, but before doing that, the next section will be a necessary digression.
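To make the device-attachment idea concrete, here is a rough sketch of how switches of the form --ddi=N,FILE and --ddo=N,FILE map device numbers to files. It is emphatically not the parser used by programs compiled with XCOM-I; the function name parse_dd is hypothetical, and the only behavior it assumes is what's described above, namely that input device 0 is attached to stdin unless overridden.

```python
def parse_dd(args):
    """Map device numbers to attached files from --ddi/--ddo switches.

    Returns (inputs, outputs). Input device 0 defaults to "stdin".
    This is a hypothetical parser written for illustration only.
    """
    inputs, outputs = {0: "stdin"}, {}
    for arg in args:
        for prefix, table in (("--ddi=", inputs), ("--ddo=", outputs)):
            if arg.startswith(prefix):
                # Split "2,PUNCH.txt" into device number and filename.
                device, filename = arg[len(prefix):].split(",", 1)
                table[int(device)] = filename
    return inputs, outputs


ins, outs = parse_dd(["--ddo=2,PUNCH.txt"])
print(ins, outs)
```

In this model, adding --ddi=0,Tests/XPL.bnf simply replaces the default stdin attachment for device 0.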
XPL or XPL/I programs compiled by XCOM-I are native to your own local computer system, and if XCOM-I has done its job properly, there should be no lingering dependencies on the IBM 360 in the compiled program.
Nevertheless ... XPL was designed to be a program for
generating compilers, so a lot of legacy XPL or XPL/I code is actually
the source code of a compiler for one or another computer language.
Suppose for example that we had an XPL program called X that was a compiler for a programming language called "X". It's true that if we compile the source code for X with XCOM-I that we'll get executable code for X
that runs on our native Linux/Mac/Windows computer (rather than, say,
an IBM 360). But it's also true that if we use that native version of X to compile source-code in the X language, say for a program called Y, then the output of the compiler will still be an object file for the IBM 360. So we still wouldn't be able to run Y on our native computer, and would still need an IBM 360 to run Y.
Which means that even with XCOM-I in hand, it's still very useful to have an IBM 360 emulator in which to run not the programs produced by XCOM-I, but the programs produced by the compilers produced by XCOM-I.
If you're already an IBM 360 expert or a wannabe IBM 360 user, then perhaps a full-scale IBM System/360 emulator such as Hercules
might work well for you. As for myself, having tried to figure out without success how to use
Hercules, I have to concede that I'm neither a 360 expert nor
wannabe, and have no desire to confront that learning curve. I'd much
prefer a lightweight IBM 360 emulator, with essentially no
learning curve at all. (And a lightweight emulator is far more in line
with Virtual AGC's goal of integrating Space Shuttle computing support
into spaceflight simulation programs than a full-blown mainframe
emulator would be.)
As it happens, there is an available candidate for such a lightweight IBM 360 emulator. This emulator, called sim360, was written by the same Daniel Weaver who I've also mentioned earlier as the author of an XPL-to-C translator for standard XPL. You can find the source code for sim360 in the Virtual AGC software repository, but the official place to get the most up-to-date version is Dan's own site.
That hyperlink is disguised as a Pascal compiler, which is also there,
but don't be confused: There's an IBM 360 emulator in there!
Aside: There is a limit, of course, to how "lightweight" an IBM 360 emulator can be while still remaining useful. To run programs originally written in XPL or XPL/I, for example, just emulating the CPU's instruction set is not enough. Such XPL or XPL/I programs depend on there being an operating system in place to allow operations like inputting data (from files or keyboards) or outputting data (to displays, printers, or files) to occur. But an XPL program doesn't actually interact directly with the operating system. Instead, an XPL program expects that there's a separate program called the submonitor running alongside it, and the XPL program makes its low-level requests for input, output, allocation of memory, and so on, to this submonitor program. In fact, there will be a whole section later on that describes XPL's built-in functions, some of which rely on the submonitor, as well as another section covering the MONITOR procedure, all of whose functionality comes from the submonitor. But the point is that to be useful to us, a lightweight emulator like sim360 must provide enough submonitor capability to respond correctly when the XPL program makes these requests of the submonitor even though the IBM 360 operating system is entirely absent. Fortunately, although at the present writing it's a work in progress, sim360 does so. Thanks, Dan!
By the way, if in place of sim360 you were to use a full-scale IBM 360 simulator such as Hercules, an emulation of the XPL or XPL/I submonitor would not be built into it. Rather, you would have to assemble the IBM 360 Basic Assembly Language (BAL) source code for the submonitor, and presumably contrive the JCL needed to run the submonitor program along-side your compiled XPL program on the emulator. Fortunately, although we don't need it if we are using sim360, we do have source code for the submonitor:
- For McKeeman's XCOM, it's XMON.bal.
- For SUNY's XCOM 4.5, which we've not discussed yet but which we'll get to in the next section, it's the slightly-different XPLSM.bal.
- For Intermetrics's unavailable XCOM, it's the collection of BAL source-code files known as MONITOR.ASM. You may wonder why that matters, given that the source code for the Intermetrics version of XCOM has not survived, and thus we couldn't run it anyway? (After all, the whole point of XCOM-I in the first place is to fill in for the missing Intermetrics XCOM, and XCOM-I needs no submonitor!) Well presumably, if we compile HAL/S source code (like PFS or BFS) using HAL/S-FC, we'd expect PFS or BFS to need the same submonitor as the Intermetrics XCOM would have. Probably. Or at least, maybe!
What we don't have in this scenario is a BAL assembler, nor an IBM 360 linker program, so in spite of this wealth of source code for submonitors, in the end we still have no way to build the submonitor programs that I'm aware of.
Assuming you're working from Virtual AGC's archived copy, the first thing you have to do is to build sim360 itself as follows:
cd XCOM-I
make sim360
If you've added XCOM-I/ into your PATH as I recommended earlier, then sim360 will now be in your PATH as well.
If you want to try out sim360, you'll need an IBM 360 load file for some actual program. In the next couple of sections, we'll produce a lot of just such load files. For now, I'll just point out that the IBM 360 load file for a simple Hello World program is available in our
Tests folder, and you can run it like so:
sim360 -o0ET stdout -i0AT stdin Tests/HELLO.obj
The command-line switches for sim360 probably look mysterious, but they're not so bad. You can see a full list of command-line options with the command "sim360 --help", but what the mysterious switches in the example above mean is this:
- -o0ET stdout: Causes OUTPUT(0) to send data to stdout and automatically translate it from EBCDIC to ASCII.
- -i0AT stdin: Causes INPUT(0) to receive data from stdin and automatically translate it from ASCII to EBCDIC.
Designation | Notes | Where I Got it From | Legacy XPL Source Code for XCOM Proper | Legacy XPL Source Code for Library | Legacy IBM 360 Executable | "Modern" Patch(es) for use with XCOM-I |
---|---|---|---|---|---|---|
XCOM3 | This is the version of XCOM listed in the book A Compiler Generator. Printouts from it identify it as "XCOM III". In reality, there's no way to know if it's identical to the book without comparing it line by line, which I have not done. The notes in the download file state that the files were recovered from a legacy computer tape from the Stanford Computation Center, dated 1969-01-23. The contemporary notes on the tape itself hint that it is the source code from the book, but stop short of saying explicitly that it's the precise version that appears in the book. The contemporary notes further state that "THE BINARY IMAGE OF THE XCOM COMPILER ... [is] THE RESULT OF USING XCOM TO COMPILE ITSELF FROM THE SOURCE CARDS". However, if you run the binary executable, its printout says that it was built on May 2, 2001, at 13:40:40.19. I'd suggest it's rare to find files created in 2001 on tapes made in 1969. I'll give pretty-convincing reasons later to believe that while this version of the source code was compiled to get this executable, a different version of the compiler itself was used to do so. Regardless, how this 2001 executable made its way onto the 1969 tape, or at least into the download file, I cannot say. | University of Toronto XPL page | XCOM.xpl | XPL.LIBRARY.xpl | XCOM3-20010502.obj | n/a |
XCOM45 | The accompanying notes indicate that this is from a computer tape from the State University of New York, Stonybrook. The notes from the tape itself date it 1977-05-22. No information is given about how the binary executable of the compiler was created. Its printout identifies itself as "XCOM 4.5", compiled on July 19, 1976, at 21:05:40.16. Fortunately, unlike XCOM3, that's consistent with the date of the tape itself. But for reasons we'll come to later, I believe that while the provided executable is indeed compatible with the version of the XPL source code provided, it was nevertheless not used to compile itself. | Dan Weaver's XPL project | XCOM4.5.xpl | XPLIB4.5.xpl | XCOM45-19760719.obj | patch0.c |
Aside: To be clear, XCOM-I is not self-compiling. It's written in Python, not XPL. Rather, it's XCOM3 and XCOM45 that I expect to be self-compiling once they're bootstrapped. In the bootstrapping scenario, XCOM-I is that other compiler used to compile the initial version of XCOMx before an XCOMx executable exists. Except that unlike the normal bootstrapping scenario, XCOM-I, XCOM3, and XCOM45 are full-blown compilers rather than being the initial weak-and-feeble versions thereof.
Also to be clear, the use case for XCOM-I I'm personally interested in is building HAL/S-FC and then forgetting that XPL and XCOM-I even exist. And we don't need a single one of these XCOMx programs to do that. But still, it's going to go much more smoothly if XCOM-I is believed to work properly in the first place. And if we can use XCOM-I for bootstrapping XCOMx, that will give us a certain degree of confidence that XCOM-I does in fact work as intended. And if in that process XCOM-I can be made useful for somebody who (unlike myself) actually wants to play around with XPL, then why not?
Compiler Variant | Compiler's Own Source-Code Files | Compiler Compiled With | Compiler Runs on IBM 360? | Compiler Runs on Linux/Mac/Windows | Source-Code Language and Encoding Accepted By Compiler | Object Files Directly Output By Compiler | Encoding of Textual I/O When Running the Output Object Files |
---|---|---|---|---|---|---|---|
XCOM-I.py | XCOM-I.py et al. | n/a | No | Yes | XPL or XPL/I, ASCII coding | C-language files | ASCII |
XCOM3-native | XCOM.xpl, XPL.LIBRARY.xpl | XCOM-I.py | No | Yes | XPL, ASCII coding | IBM 360 load file | EBCDIC |
XCOM45-native | XCOM4.5.xpl, XPLIB4.5.xpl | XCOM-I.py | No | Yes | XPL, ASCII coding | IBM 360 load file | EBCDIC |
XCOM3-360.obj | XCOM.xpl, XPL.LIBRARY.xpl | XCOM3-native | Yes | Only under sim360 or similar | XPL, EBCDIC coding | IBM 360 load file | EBCDIC |
XCOM45-360.obj | XCOM4.5.xpl, XPLIB4.5.xpl | XCOM45-native | Yes | Only under sim360 or similar | XPL, EBCDIC coding | IBM 360 load file | EBCDIC |
XCOM3-360A.obj | XCOM.xpl, XPL.LIBRARY.xpl | XCOM3-360.obj | Yes | Only under sim360 or similar | XPL, EBCDIC coding | IBM 360 load file | EBCDIC |
XCOM45-360A.obj | XCOM4.5.xpl, XPLIB4.5.xpl | XCOM45-360.obj | Yes | Only under sim360 or similar | XPL, EBCDIC coding | IBM 360 load file | EBCDIC |
XCOM3-20010502.obj | XCOM.xpl, XPL.LIBRARY.xpl | n/a | Yes | Only under sim360 or similar | XPL, EBCDIC coding | IBM 360 load file | EBCDIC |
XCOM45-19760719.obj | XCOM4.5.xpl, XPLIB4.5.xpl | n/a | Yes | Only under sim360 or similar | XPL, EBCDIC coding | IBM 360 load file | EBCDIC |
XCOM3-360C.obj | XCOM.xpl, XPL.LIBRARY.xpl | XCOM3-20010502.obj | Yes | Only under sim360 or similar | XPL, EBCDIC coding | IBM 360 load file | EBCDIC |
XCOM45-360C.obj | XCOM4.5.xpl, XPLIB4.5.xpl | XCOM45-19760719.obj | Yes | Only under sim360 or similar | XPL, EBCDIC coding | IBM 360 load file | EBCDIC |
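The table distinguishes ASCII-coded from EBCDIC-coded text, which is also what the "T" in sim360's -o0ET and -i0AT switches is about. As a rough illustration of what such a translation involves, here is a sketch using Python's built-in cp037 codec. Treat the choice of code page as an assumption on my part: I'm not claiming that sim360, XCOM-I, or the ebcdic module use exactly this variant of EBCDIC.

```python
# Assumption: code page 037 stands in for whatever EBCDIC variant the tools use.
EBCDIC = "cp037"


def to_ebcdic(text):
    """ASCII text -> EBCDIC bytes (the direction used for input)."""
    return text.encode(EBCDIC)


def to_ascii(data):
    """EBCDIC bytes -> ASCII text (the direction used for output)."""
    return data.decode(EBCDIC)


raw = to_ebcdic("HELLO")
print(raw.hex())          # the EBCDIC byte values differ from the ASCII ones
print(to_ascii(raw))      # round-trips back to the original text
```

The point is simply that the same characters have different byte values in the two encodings, so a load file's textual I/O has to be translated on its way in and out.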
Aside: If you're on Linux or Mac OS and you don't have the patience to read through all of the garbage just mentioned, the Makefile provided with XCOM-I can also perform all 4 steps of the bootstrapping roadmap for you automatically. You could then just skip past the next four subsections. Unfortunately, I'm sure that the Makefile won't work in Windows as-is, but that it could be made to work. I'm not going to personally research the Windows situation, but you can look at the Makefile yourself to see what has to be done. If you figure it out, drop me a line and tell me about it. At any rate, the commands to bootstrap XCOM3 and XCOM45 using the makefile are as follows. It'll simply print messages at the end that it succeeded or failed:
cd XCOM-I
make
For XCOM3:
cd XCOM-I
XCOM-I.py --xpl --output=XCOM3-native Tests/XCOM.xpl
make -C XCOM3-native
For XCOM45:
cd XCOM-I
XCOM-I.py --xpl --lib-file=sim360-source/port/XPLIB4.5.xpl --output=XCOM45-native sim360-source/port/XCOM4.5.xpl
make -C XCOM45-native
This is very much as described earlier, so I won't duplicate that discussion here.
XPL Source-Code File | Compiled by XCOM3 for the book A Compiler Generator (1970) | Compiled by XCOM3-native for The Virtual AGC Project (2024) |
---|---|---|
Tests/Example-6.18.1.xpl | Report | Report |
Tests/Example-6.18.2.xpl | Report | Report |
For XCOM3:
cd XCOM-I
XCOM3-native/XCOM3-native --ddi=0,Tests/XCOM.xpl --ddi=2,XPL.LIBRARY.xpl --raf=B,3600,1,XCOM3-360.obj --raf=B,3600,2,XCOM3-360.dat --raf=B,3600,3,XCOM3-360.str >XCOM3-360.rpt
For XCOM45:
cd XCOM-I
XCOM45-native/XCOM45-native --ddi=0,sim360-source/port/XCOM4.5.xpl --ddi=2,sim360-source/port/XPLIB4.5.xpl --raf=B,7200,1,XCOM45-360.obj --raf=B,7200,2,XCOM45-360.dat --raf=B,7200,3,XCOM45-360.str >XCOM45-360.rpt
As you can see, this involves a bit of command-line complexity that step 1 did not. The command-line options are all there to compensate for the lack of Job Control Language (JCL), and in particular to provide replacements for missing DD statements. For one thing, there's an extra file which needs to be attached as an input, because XCOMx always needs to include the library file containing COMPACTIFY. Also, XCOMx wants to use several "random-access files" for storing object code and whatnot, and we have to attach those files. (Refer to the FILE entry in the section on XPL's built-in library functions.) Those random-access files are as follows:
- FILE(1): The compiled IBM 360 object code.
- FILE(2): A scratch file (i.e., temporary working space) for data.
- FILE(3): A scratch file for strings.
Note that the size given in the --raf switches differs between the two compilers: DISKBYTES=3600 for XCOM3, but DISKBYTES=7200
for XCOM45.
Aside: Incidentally, XCOM3 has plenty of quirks specific to it, quite aside from any quirks XPL more generally may have as a computer language. XCOM3-native and XCOM3-360 inherit these quirks. I mention this just in case you become excited about using XCOM3-native and start writing new XPL programs for it! Here are a few quirks I've noticed:
- It only accepts upper-case for identifiers and keywords ... in spite of the fact that every speck of XPL source code in A Compiler Generator is printed in lower case! But I've provided a trick to work around that in XCOM3-native: In XCOM3-native's --ddi command-line switches that attach the XPL source-code files, you can optionally suffix them with ",U". This tells XCOM3-native to automatically translate the associated files to upper case. It's not a perfect workaround, since it translates comments and quoted strings to upper case too, but that's life! We didn't actually need to do that for these source-code files, though, because they're already completely upper-case.
- It requires an EOF token at the end of the source code. Other than correcting some EBCDIC-to-ASCII translation bugs that were present before I got some of the files, addition of EOF at the ends was my sole code change.
- It won't allow you to have two division operations (including MOD as a division) in the same statement, instead emitting an error message saying that it requires a "busy register".
- ... and who knows what else?
The *.rpt files produced by these compilations are worth looking at,
if only to verify that somewhere in each of them is a line reading "NO
ERRORS WERE DETECTED".
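If you'd rather not eyeball the reports, the check is easy to script. Here's a minimal Python sketch, assuming only what's stated above (that a successful report contains the text "NO ERRORS WERE DETECTED" somewhere); the function names are made up for illustration and aren't part of XCOM-I.

```python
from pathlib import Path

# The success marker quoted in the text above.
SUCCESS_MARKER = "NO ERRORS WERE DETECTED"


def report_is_clean(report_text: str) -> bool:
    """Return True if the compiler report contains the success marker."""
    return SUCCESS_MARKER in report_text


def check_report_file(path: str) -> bool:
    """Read a *.rpt file and report whether it looks like a clean compile.

    Undecodable bytes are replaced rather than raising, since the reports
    may contain characters outside of plain ASCII.
    """
    text = Path(path).read_text(encoding="ascii", errors="replace")
    return report_is_clean(text)
```

For example, `check_report_file("XCOM3-360.rpt")` would be run from the XCOM-I folder after the compilations above.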
I had hoped at this point to find that XCOMx-360.obj is byte-for-byte identical to the corresponding legacy executable XCOMx-YYYYMMDD.obj.
Unfortunately, my hopes have been dashed: They are
interestingly similar but not identical. But as I mentioned
earlier, I expect to give convincing evidence that neither XCOMx-YYYYMMDD.obj was actually self-compiled, and that each was instead presumably compiled by a different version of XCOM. If so, it would fully explain the mismatch versus XCOMx-360.obj.
But until I show you that evidence, it would certainly be reasonable to
maintain the possibility that the mismatch is due to a bug in XCOM-I.
For XCOM3:
sim360 --size=3600 -o0ET stdout -i0AT Tests/XCOM.xpl -i2AT XPL.LIBRARY.xpl -f1wb+ XCOM3-360A.obj -f2wb+ XCOM3-360A.dat -f3wb+ XCOM3-360A.str XCOM3-360.obj
For XCOM45:
sim360 --size=7200 -o0ET stdout -i0AT sim360-source/port/XCOM4.5.xpl -i2AT sim360-source/port/XPLIB4.5.xpl -f1wb+ XCOM45-360A.obj -f2wb+ XCOM45-360A.dat -f3wb+ XCOM45-360A.str XCOM45-360.obj
These command-line switches differ in detail from the ones we had to use with XCOMx-native, but I think you can see that they're just the same old replacements for JCL DD commands, only employing a different syntax.
TIME_OF_GENERATION, which returns the number of centiseconds since midnight. For XCOM45-360.obj, that computes to 100*(19*3600+60*53+37.27)=7161727=0x6D477F. Similarly, for XCOM45-360A.obj, 100*(17*3600+5*60+6.24)=6150624=0x5DD9E0, which (surprise!) matches the screenshot above. (Recall that these IBM 360 programs all store their integer data in big-endian form, so that the most-significant bytes come first in memory.) In short, XCOM45-360.obj can indeed accurately build itself, and bootstrapping XCOM45 has been 100% successful!
In step 2 above, I noted that XCOMx-360.obj, disappointingly, is not byte-for-byte identical to the corresponding legacy IBM 360 executable load file XCOMx-YYYYMMDD.obj. I had further speculated, but could not prove, that this was because those legacy executables had themselves been built using different versions of XCOMx. For example, perhaps XCOM45-19760719.obj was built by XCOM 4.4, or perhaps the legacy XCOM3-20010502.obj was built by XCOM 3.2. Now comes the evidence!
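To make the earlier timestamp arithmetic concrete before turning to that evidence, here's a small Python sketch that converts a time of day to centiseconds since midnight and shows the bytes as they'd appear in big-endian order. The function names are hypothetical, and the 4-byte width is an assumption (a 32-bit FIXED value); nothing here is taken from XCOM or sim360.

```python
def centiseconds_since_midnight(hours: int, minutes: int,
                                seconds: int, centiseconds: int = 0) -> int:
    """Convert a time of day to centiseconds since midnight."""
    return 100 * (hours * 3600 + minutes * 60 + seconds) + centiseconds


def as_big_endian(value: int, width: int = 4) -> bytes:
    """Encode an integer most-significant byte first (assumed 4 bytes wide)."""
    return value.to_bytes(width, "big")


# The two times of day quoted in the text: 19:53:37.27 and 17:05:06.24.
for h, m, s, cs in [(19, 53, 37, 27), (17, 5, 6, 24)]:
    stamp = centiseconds_since_midnight(h, m, s, cs)
    print(stamp, hex(stamp), as_big_endian(stamp).hex())
```

The byte pattern printed in the last column is what you'd hunt for in a hex dump of the object file when checking that a mismatch is only the embedded timestamp.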
Suppose that my speculation is true. In that case, if we rebuild XCOMx source code using XCOMx-YYYYMMDD.obj we should find that:
So let's try that. As usual, here are the instructions for doing the compilations:
For XCOM3:
sim360 --size=3600 -o0ET stdout -i0AT Tests/XCOM.xpl -i2AT XPL.LIBRARY.xpl -f1wb+ XCOM3-360C.obj -f2wb+ XCOM3-360C.dat -f3wb+ XCOM3-360C.str XCOM3-20010502.obj
For XCOM45:
sim360 --size=7200 -o0ET stdout -i0AT sim360-source/port/XCOM4.5.xpl -i2AT sim360-source/port/XPLIB4.5.xpl -f1wb+ XCOM45-360C.obj -f2wb+ XCOM45-360C.dat -f3wb+ XCOM45-360C.str XCOM45-19760719.obj
We could use the same technique as in the preceding section to check that XCOMx-360.obj is identical to XCOMx-YYYYMMDD.obj, taking into account the embedded timestamp. I'll leave that as an exercise for the reader. But there's also an automated way to do the check, and obviously that's somewhat easier.
cmp command is used), in which it just counts the number of mismatched bytes, and concludes that if there were 3 or fewer mismatched bytes, then there's an overall match. It doesn't bother to check that the mismatched bytes are valid timestamps.
cd yaShuttle/"Source Code"/PASS.REL32V0
make
HAL/S-FC actually consists of 7 passes, each with its own standalone executable, and the instructions above (if they succeed) will thus create 7 new programs in the PASS.REL32V0/ folder: PASS1, FLO, OPT, AUX, PASS2, PASS3, and PASS4. You may or may not want to move these into your PATH. The reason you might not want to move them into your PATH is that a number of compilers written in XPL use this kind of naming pattern, and also have program names like PASS1, PASS2, and PASS3.
COMMON memory. It's thus necessary not merely to run the passes in the correct order, but to use all of the command-line settings properly to make sure that filenames match up from pass to pass.
HAL_S_FC.py SRN --hal=SOURCECODE >REPORT.txt
Pretty simple looking! Unfortunately, the simplicity of HAL_S_FC.py was aided somewhat by the fact that I never intended to port any of the other 6 compiler passes into Python, and hence didn't have to worry about the interconnections between the compiler passes. In contrast, here's a sample of how to run PASS1 as created by XCOM-I to process that same HAL/S program:
./PASS1 --parm="SRN" --pdsi=5,ERRORLIB --pdsi=6,ACCESS --ddi=0,SOURCECODE.hal --raf=B,7200,1,HALMAT.bin --raf=B,1560,2,LITFILE.bin --raf=B,3360,6,VMEM.bin >REPORT.txt
There's obviously a bit of added complexity here, or at least the appearance of added complexity, so let me go through these things one by one:
- ./PASS1 is, of course, the program PASS1 of HAL/S-FC itself. The "./" prefix is to help some versions of Linux that don't automatically look in your current working folder for program names. At the very least, it's harmless in Linux or Mac OS. Leave it off in Windows!
- --parm="SRN": The --parm='...' switch in general is used to pass options to the compiler, replacing the PARM='...' that originally appeared in Job Control Language (JCL) on the IBM 360. The numerous possibilities for inclusion in --parm='...' are discussed later. But specifically, SRN tells the compiler that there are serial numbers in columns 73-80 of the "cards" containing your source code, or more precisely, that there is no source code in columns 73-80. I typically put all my source code in columns 1-72, just because I think it's a good habit in HAL/S. But you can use the entirety of columns 1-80 if you leave off this command-line option. Of course, if you did not write the HAL source code yourself, you don't really have any choice, and have to stick with whatever's consistent with the actual source code. As for columns 81 and above, you cannot use them for HAL/S source code by any means whatever. Too bad!
- The --pdsi=..., --pdso=..., --ddi=..., --ddo=..., and --raf=... switches, in general, all replace the DD commands that originally appeared in JCL on the IBM 360: i.e., they associate specific filenames and (to a certain extent) file types with the device numbers used in the XPL source code's INPUT(n), OUTPUT(n), and FILE(n) functions.
Among themselves, the seven passes of HAL/S-FC use a somewhat-consistent scheme for associating filenames to file numbers, though not all of the passes necessarily use all of the files. I say "somewhat", because files output by one pass are sometimes input to another pass using a different file number. A useful though seemingly incomplete reference to the various files is the table on p. 7-4 of the "HAL/S Compiler System Specification".
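To make the anatomy of a --raf switch concrete, here's a short Python sketch that splits one into its four comma-separated fields. The field meanings are inferred from the examples on this page (a record size that tracks DISKBYTES, a device number matching FILE(n), then a filename); the first field is kept as an opaque string, since its meaning isn't spelled out here. The class and function names are hypothetical, and this is not how XCOM-I itself parses its command line.

```python
from typing import NamedTuple


class RandomAccessFile(NamedTuple):
    mode: str         # first field, e.g. "B"; treated as opaque in this sketch
    record_size: int  # assumed: bytes per record, e.g. 7200
    device: int       # assumed: the n in FILE(n)
    filename: str     # host file attached to that device


def parse_raf(switch: str) -> RandomAccessFile:
    """Split a switch such as --raf=B,7200,1,HALMAT.bin into its fields."""
    prefix = "--raf="
    if not switch.startswith(prefix):
        raise ValueError(f"not a --raf switch: {switch!r}")
    # Split at most 3 times so a filename containing commas stays intact.
    fields = switch[len(prefix):].split(",", 3)
    if len(fields) != 4:
        raise ValueError(f"expected 4 comma-separated fields: {switch!r}")
    mode, record_size, device, filename = fields
    return RandomAccessFile(mode, int(record_size), int(device), filename)


print(parse_raf("--raf=B,7200,1,HALMAT.bin"))
```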
Here's a handy table
that summarizes the file usage of each pass. When you see
something italicized, such as HALMAT
in --raf=B,7200,1,HALMAT.bin
,
I mean to imply that you may or may not want to choose a different name
than I have, or to use different names for different compiler passes;
however, the names must be consistent between the passes outputting them
and the passes using them as input. The cells that have been highlighted in color emphasize those places in which the same
file must be used on successive passes; in the case of the
virtual-memory file, I don't know if the same file is used on successive
passes or not, so I haven't highlighted it, but I've been using the
same file without any apparent difficulty.
HAL/S-FC Usage |
PASS1 |
FLO |
OPT |
AUX |
PASS2 |
PASS3 |
PASS4 |
Notes |
---|---|---|---|---|---|---|---|---|
Input HAL source code |
--ddi=0,SOURCECODE.hal or stdin |
n/a |
n/a | n/a | n/a | n/a | n/a | |
Primary output listing |
--ddo=0,REPORT.txt or stdout |
--ddo=0,REPORT.txt or stdout |
--ddo=0,REPORT.txt or stdout |
--ddo=0,REPORT.txt or stdout |
--ddo=0,REPORT.txt or stdout |
--ddo=0,REPORT.txt or stdout |
--ddo=0,REPORT.txt or stdout |
|
Output error messages |
stderr |
stderr |
stderr |
stderr |
stderr |
stderr |
stderr |
|
Extra listing file, plus SPACELIB error messages |
--ddo=2,LISTING2.txt |
--ddo=2,LISTING2.txt |
--ddo=2,LISTING2.txt
|
--ddo=2,LISTING2.txt
|
--ddo=2,LISTING2.txt
|
--ddo=2,LISTING2.txt
|
--ddo=2,LISTING2.txt
|
This file is produced by PASS1 if "LISTING2" is among the options in PARM='...', and I don't show it in the sample PASS1
invocation above, because LISTING2 isn't among the chosen PARM
options. However, SPACELIB directs certain error messages to it as
well, so in some ways it's better to have it even if LISTING2 isn't
specified. |
AP-101S object module output |
n/a |
n/a | n/a | n/a | (PASS) (BFS)
|
For the PASS compiler, I think that this is in the IBM 360 "card" format. For the BFS compiler, I'm unclear what this is. I think, perhaps, that there's a PDS member for each "object" (COMPOOL, PROCEDURE, PROGRAM) compiled, and if you don't clear the PDS, it will be every object for every compilation you've compiled since the last time it was cleared. Which makes sense. However, I don't understand the data in these objects. |
||
Object module output ("deck") Input from template library |
--pdsi=4,TEMPLIB,E |
n/a | n/a | n/a | --ddo=4,DECK.bin,E |
(PASS2) For the PASS compiler, an optional copy of CARDFILE. For the BFS compiler, the only copy. (PASS1) The template library is similar to the collection of header files in C. Used only if TEMPLATE is present in the parameter string. |
||
Simulation Data File (SDF) output Input error-message library |
--pdsi=5,ERRORLIB |
n/a | n/a | n/a | --pdsi=5,ERRORLIB |
The
compiler requires access to the library of HAL/S error messages, which
is the folder PASS.REL32V0/ERRORLIB/. Note that "ERRORLIB" is
literally its name, and it's all you need in the command-line option if
you're running the compiler from the PASS.REL32V0/ directory. But
if you're running the compiler from a different directory, you need an
absolute or relative path to ERRORLIB. |
||
Input access-rights library Output to template library. |
--pdsi=6,ACCESS |
n/a | n/a | n/a | n/a | n/a | n/a | The compiler also requires access to the library of access rights,
namely the folder PASS.REL32V0/ACCESS/. The folder is empty, yes, because you
have full access and there are no restrictions. But it's still
required! Note that "ACCESS" is literally its name. Like
ERRORLIB (see above), just using ACCESS in the command-line option is
fine only if you're running the compiler directly from the PASS.REL32V0/
folder. Used only if TEMPLATE is
present in the parameter string. Note that the output template
library should not necessarily be the same as the input library on
device 4. |
Pseudo-assembly listing output |
n/a | n/a | n/a | n/a | --ddo=7,EXTRA.txt |
|||
TBD |
--pdso=8,TEMPINC
|
|||||||
Source code comparison |
TBD |
|||||||
Output or input HALMAT |
--raf=B,7200,1,HALMAT.bin |
--raf=B,7200,1,HALMAT.bin |
--raf=B,7200,1,HALMAT.bin |
--raf=B,7200,1,AUXMAT.bin |
--raf=B,7200,1,AUXMAT.bin |
n/a | n/a | This is a "random-access file" storing "HALMAT". For some passes (such as PASS1) this is output; for others (such as FLO) it is input. When output, it doesn't need
to
previously exist. Recall that HALMAT is the name of the
"intermediate language" into which PASS1 converts your HAL/S source code. This file is the principal output of PASS1, and is what is sent along to the optimizing passes, some of which modify the HALMAT, and eventually to PASS2
to be converted to IBM 360 or AP-101S object code. Alas, due to
the apparent non-survival of adequate documentation related to HALMAT,
this file is presently incomprehensible even when translated from its
binary format into a somewhat-human-readable form. |
Literal file |
--raf=B,1560,2,LITFILE.bin |
--raf=B,1560,2,LITFILE.bin |
--raf=B,1560,2,LITFILE.bin |
n/a | --raf=B,1560,2,LITFILE.bin |
--raf=B,1560,2,LITFILE.bin |
n/a | This is a "random-access file" that the compiler uses to store the
values of literals. Note: The value of a literal character-string (or a long bit-string) is actually a 32-bit string descriptor, as opposed to the sequence of individual bytes comprising the string data. These string descriptors in the literal file refer to the areas in memory in which those string-data bytes are stored. So the literal file is useful only as long as the string data memory remains intact, and has not been moved by compaction. |
Phase 1 Init/Const work area, Phase 2 code generation work area |
--raf=B,1560,3,ICFILE.bin |
n/a |
n/a |
n/a | --raf=B,1600,3,OBJFILE.bin |
n/a | n/a | By ICFILE, I mean the INITIAL/CONSTANT queue. I think that the compiler can only hold
a certain number of initializers in memory at once, and this file is
used to swap any overflow of the initializers from memory. By OBJFILE, I mean a work-file that accepts data
related to object-code generation when it overflows the provided memory
buffer for it. It is not object code as such. |
Phase 2 HALMAT work file |
n/a |
n/a |
--raf=B,7200,4,OPTMAT.bin |
--raf=B,7200,4,OPTMAT.bin |
--raf=B,7200,4,OPTMAT.bin |
n/a |
n/a | Secondary HALMAT file. As you can see, we somehow end up with three separate HALMAT files:
Is |
Phase 3 paging area |
n/a | n/a | n/a | n/a | n/a | --raf=B,1680,5,SDF.bin |
n/a | |
Statement data communication area |
--raf=B,3360,6,VMEM.bin |
--raf=B,3360,6,VMEM.bin |
n/a |
n/a | --raf=B,3360,6,VMEM.bin |
--raf=B,3360,6,VMEM.bin |
n/a | This is a "random-access file" that the compiler uses to store blocks of virtual
memory which have been temporarily unloaded from memory. |
Input COMMON file |
n/a |
--commoni=COMMON1.out |
--commoni=COMMONf.out |
--commoni=COMMONo.out |
--commoni=COMMONa.out |
--commoni=COMMON2.out |
--commoni=COMMON3.out |
On the IBM 360, certain areas of memory, known as COMMON,
were expected to retain their contents when one pass of the compiler
was unloaded from memory, and the next pass loaded into it. Of
course, this will not happen on any computer you're likely to use,
and XCOM-I supports COMMON differently than that. Instead, each pass writes out COMMON as a file upon termination, and (optionally) inputs COMMON
from a file upon startup. By default, output is to a file
literally called "COMMON.out", while input is disabled. For PASS1, these defaults are reasonable. However, each subsequent pass wants to take COMMON from the preceding pass, so the default input is no longer good. |
Output COMMON file |
--commono=COMMON1.out |
--commono=COMMONf.out |
--commono=COMMONo.out |
--commono=COMMONa.out |
--commono=COMMON2.out |
--commono=COMMON3.out |
n/a |
See the comment above. It is, of course, possible to omit all of the --commono options entirely, and to use --commoni=COMMON.out everywhere. This is perfectly workable, and its only drawback is the loss of transparency about what happens between passes. |
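The COMMON hand-off in the last two rows of the table is easier to see as a chain. Here's a minimal Python sketch that generates the --commoni/--commono switches for each pass, using the pass order and the illustrative filenames from the table; the function name is hypothetical, and the sketch only builds the switch strings (it doesn't run anything, and each pass needs its other file switches too).

```python
# Pass order and per-pass COMMON output names, as listed in the table above.
PASSES = ["PASS1", "FLO", "OPT", "AUX", "PASS2", "PASS3", "PASS4"]
COMMON_OUT = {
    "PASS1": "COMMON1.out",
    "FLO": "COMMONf.out",
    "OPT": "COMMONo.out",
    "AUX": "COMMONa.out",
    "PASS2": "COMMON2.out",
    "PASS3": "COMMON3.out",
    # PASS4 is the last pass, so it has no COMMON output in the table.
}


def common_switches() -> dict:
    """Map each pass name to its list of COMMON-related switches."""
    switches = {}
    previous_output = None
    for name in PASSES:
        args = []
        if previous_output is not None:
            # Each pass after the first reads what its predecessor wrote.
            args.append(f"--commoni={previous_output}")
        output = COMMON_OUT.get(name)
        if output is not None:
            args.append(f"--commono={output}")
        switches[name] = args
        previous_output = output
    return switches


for name, args in common_switches().items():
    print(name, " ".join(args))
```

Changing a name in `COMMON_OUT` automatically changes the matching input of the following pass, which is the consistency requirement described in the notes.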
As discussed in the preceding section, our goal is to be able to compile HAL/S programs by means of HAL/S-FC (as compiled itself by XCOM-I), but we can be assisted somewhat in reaching that goal by taking advantage of work done prior to the effort of creating XCOM-I. Specifically, before writing XCOM-I, I had ported PASS1 of HAL/S-FC from XPL to Python. This Python version of PASS1 is called HAL_S_FC.py. In an ideal world, we'd expect PASS1 and HAL_S_FC.py
to produce results that were identical in all respects, except for
things like (say) timestamps. Given that this isn't an ideal
world, as far as we know, and in particular neither I nor my creations
are perfect, we should instead expect some discrepancies. In fact,
discrepancies are very useful, as they highlight bugs in one or another
of the two versions, and correcting those bugs helps me to push both
versions closer to validity. Nevertheless, some discrepancies are
rooted in the very fact that these two versions are implemented using
very different underlying principles and tools, and cannot be removed by
any reasonable means.
What kinds of discrepancies can we expect to find between PASS1 and HAL_S_FC.py?
I haven't done any systematic testing at this point, so I can only
comment on discrepancies I've observed. We can categorize them
somewhat by the output files produced by the programs:
File Type |
Output File in HAL_S_FC.py |
Output File in PASS1 |
Notes |
---|---|---|---|
Compiler Report |
Printed to stdout |
Printed to stdout |
|
HALMAT |
FILE1.bin |
Set by --raf=B,7200,1,HALMAT.bin . |
No discrepancies! |
Literals |
FILE2.bin |
Set by --raf=B,1560,2,LITFILE.bin . |
FIXED and BIT(<33) literals in the file are identical for PASS1 vs HAL_S_FC.py. However, string descriptors for CHARACTER and BIT(>32) literals are different, because HAL_S_FC.py manages string data as Python strings, and thus string data is not stored in the identical memory locations used by PASS1. |
Virtual Memory |
FILE6.bin |
Set by --raf=B,3360,6,VMEM.bin . |
They differ, because HAL_S_FC.py does not implement HAL/S-FC's virtual-memory module at all. The virtual-memory file produced by HAL_S_FC.py is in fact empty. |
It is a truth universally acknowledged that there is no satisfactory
introductory information available concerning programming in XPL.
Or at least, it would be universally acknowledged if anybody had ever heard of XPL and wanted to use it for anything.
The best you can do, generally speaking, is to purchase a used copy of McKeeman (i.e., A Compiler Generator).
If you do, you'll find a book that's densely packed with information,
but that information is the source code for an XPL compiler (written in
XPL), lots of BNF descriptions of the language, lots of theory on how to
write a compiler-generator program, and very little of direct interest
to a programmer who wants to come up to speed quickly on how to write or
understand a program written in XPL. Not to mention the fact — though I
am mentioning it! — that some of the most-critical
counter-intuitive information is buried in easy-to-miss,
easy-to-misunderstand comments made in passing, rather than as big,
bold-face warnings. And as a bonus, the book provides an index of
almost no use at all to a newbie XPL programmer. Besides which,
most online information about XPL, in my experience, is a simple abridgement or other rehashing of A Compiler Generator,
and adds little extra of value in a tutorial sense, since it's almost
never written by anybody actually working with XPL. With that
said, you may find some useful online information in a couple of places:
And as for Intermetrics XPL/I ... well, from a tutorial perspective, it's orders of magnitude worse. Perhaps later, non-surviving documentation did a better job. Enough said!
Taking all of that into account, it might be reasonable to provide a full tutorial here on how to write XPL or XPL/I programs. Perhaps I'll do that sometime. It turns out that that's easier said than done, since as you may have noticed, simplification for beginners is not really my personal strong suit. Which is ironic, considering my strong criticism of A Compiler Generator above! For now, I'll just cover some of the basics and quirks of the language(s). Send in suggestions for improvement, if you like; I'm sure I can use them somehow to make the discussion even worse.
The most basic characteristic of a language is the character set in which the language is expressed. Neither McKeeman nor Intermetrics specifies the character set. I've given it a lot of thought, and my conclusion is that the originally-supported character set was:
<space>
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9
_ % + - * . / | & = < > # @ $ , ; : ( ) ' " ! ? [ ] { }
¢ ¬ <eof>
Here, by <space> I mean a space character, and by <eof> I mean an end-of-file character.
You'll notice that several characters common today were not originally supported, including:
` ~ ^ \
Contrariwise, there are three characters (¢, ¬, and <eof>) that don't exist at
all in the 7-bit ASCII character set that's the common core for the
character sets typically used today when writing software source
code. Therefore, when working with XCOM-I, we use the fortunately otherwise-unused ASCII characters in place of the unsupported characters:
Aside: The <eof> character (not to be confused with the EOF token that appears at the end of the source code in an XPL or XPL/I program) is an unusual case. It is not a special character in XPL programs. However, it is used as a bookkeeping device by HAL/S compilers, or at least by HAL/S-FC, in padding that the HAL/S compiler automatically appends to a stream of input HAL/S source code. Thus we have to have a way of representing it in the XCOM-I framework, even though it's not a character anyone would ever have occasion to use in either an XPL or HAL/S source-code file, unless that file was itself part of a HAL/S compiler! To the best of my knowledge, <eof> appears only within a single string variable, INPUT_PAD, that occurs in the HAL/S-FC source-code file STREAM.xpl. As far as I can tell, it is not even a standard EBCDIC character, but rather is something chosen for some reason unknown to me by HAL/S developers. See the comments for INPUT_PAD in STREAM.xpl.
These substitutions allow
us to completely translate the original XPL or XPL/I source code
back-and-forth between the original EBCDIC and ASCII without any loss of
information, and without insisting that you adopt any specific "locale"
like ISO-8859-15 or UTF-8 for your computer that's not optimal for your
other (non-XPL'ing) activities. With that said, I hope that you could use ¢ and ¬ in your XPL/I source code, if you insist on doing so, but I do not guarantee it.
Aside: All previously-existing XPL or XPL/I source code I've found, or HAL/S source code for that matter, has already substituted ~ or ^ for ¬ anyway. And indeed, some early HAL/S documentation suggests this very substitution. I suspect that's because some IBM printers at the time printed ~ in place of ¬. But whatever the explanation, the substitutions I'm suggesting are not exactly daring in their originality. As far as the ¢ symbol is concerned, it's not actually used in active XPL or XPL/I code, as such, but can be used in program comments to toggle various compiler options on and off, or in principle could appear within quoted strings.
XPL programs are not case-sensitive, except in so far as the
contents of quoted strings are concerned. E.g., lower-case or
mixed-case symbols are treated as being identical to their upper-case
correspondents, but quoted strings are case-sensitive.
Input to an XPL or XPL/I program (via the INPUT
built-in function) is expected to conform to computer punch-card-like
conventions. I.e., input lines are always exactly 80 characters long,
and XCOM-I enforces this by truncating or right-padding input
lines as necessary. If the input lines are longer than 80 columns
physically — say, because they have punch-card sequence numbers in
columns 81-88 — the extra columns are stripped off.
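The card-image rule just described is simple to model. Here's a minimal Python sketch of it, assuming only the behavior stated above (truncate to 80 columns, otherwise pad on the right with spaces); the function name is made up, and this isn't XCOM-I's actual runtime code.

```python
CARD_WIDTH = 80  # columns on a punch card


def to_card_image(line: str) -> str:
    """Return the line as an exactly-80-character card image."""
    text = line.rstrip("\r\n")  # drop the line terminator first
    # Slicing strips anything beyond column 80; ljust pads short lines.
    return text[:CARD_WIDTH].ljust(CARD_WIDTH)


short = to_card_image(" DECLARE X FIXED;\n")
long = to_card_image("A" * 80 + "00001234\n")  # sequence number in 81-88
print(len(short), len(long))
```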
As for the source-code proper, other than being confined to columns
1-80, it is entirely free-form. I.e., line breaks are ignored; several
statements may exist on the same input card, or conversely, a single
statement may be split across multiple cards. Even though XPL CHARACTER
strings are limited to 256 characters, there is seemingly no limit to the length of an XPL statement.
There are hints in the error messages of McKeeman's XCOM (and I believe, in A Compiler Generator text) that some
contemporary computer systems may have treated column 1 specially,
perhaps accepting some kind of non-blank control characters there. If
so, it was an issue outside XCOM proper and is irrelevant to XCOM-I.
However, you do find that legacy XPL source code does often begin in
column 2 rather than column 1, and I suspect that this hypothetical
column-sensitivity is the reason for it.
Regarding this column-1 ambiguity, however, there is the practical
question of what to do when a long quoted string is split across
multiple cards. Does column 1 belong to the string or not? In XCOM-I, I take my cue from McKeeman's source code for XCOM in this matter: In spite of the fact that XCOM's source code generally avoids column 1 in all other cases, column 1 does belong to any multi-line quoted strings.
Identifiers cannot begin with a digit, but otherwise consist of any sequence of upper- or lower-case alphanumeric characters, or any of the characters @ _ # $. For example, @_26$8ab# is a legal name for a variable.
Identifiers cannot exceed 256 characters in length.
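Those identifier rules fit in a single regular expression. Here's a Python sketch that checks a candidate name against the rules as stated above, and folds case as described earlier for non-string text; the function names are hypothetical, and the sketch makes no attempt to reject reserved words.

```python
import re

MAX_IDENTIFIER_LENGTH = 256

# First character: a letter or one of @ _ # $ (no digits).
# Remaining characters: letters, digits, or @ _ # $.
_IDENTIFIER = re.compile(r"[A-Za-z@_#$][A-Za-z0-9@_#$]*")


def is_valid_identifier(name: str) -> bool:
    """Check a name against the identifier rules described above."""
    if len(name) > MAX_IDENTIFIER_LENGTH:
        return False
    return _IDENTIFIER.fullmatch(name) is not None


def normalize_identifier(name: str) -> str:
    """Identifiers are case-insensitive, so compare them in upper case."""
    return name.upper()


print(is_valid_identifier("@_26$8ab#"), is_valid_identifier("2abc"))
```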
There are only three basic datatypes:
- FIXED is a 32-bit signed integer. (Stored as 2's-complement, in big-endian byte order, vs the little-endian byte order used in most personal computers today.)
- CHARACTER is a variable-width character string, with a string length limited to 256 or less. Strings are stored as a 32-bit unsigned integer known as a descriptor, paired with a separate area of 1 to 256 bytes containing the individual characters of the string, encoded in EBCDIC. The descriptor has 8 bits specifying the string length (minus 1) and 24 bits providing the starting memory address of the character data.
- BIT(n), where n is from 1 to 2048, is an n-bit object. The amount of storage varies by the precision:
  - BIT(1) through BIT(8) are stored in memory as single bytes.
  - BIT(9) through BIT(16) are stored as 2-byte "half-words".
  - BIT(17) through BIT(32) are stored as 32-bit words.
  - BIT(33) through BIT(2048) are stored similarly to CHARACTER variables: There's a 32-bit "descriptor", of which 8 bits are the number of bytes needed to store all of the bits, minus 1, and 24 bits are a pointer to elsewhere in memory, where the bytes themselves are stored. Thus, a long BIT(n) like this uses up 4 bytes for the descriptor, plus ⌊(n+7)/8⌋ bytes (5 for n=33 through 256 for n=2048) for the data.
Aside: The method for storing CHARACTER data described above leaves no room for 0-length "empty" strings. But the XPL and XPL/I languages do allow for empty strings: p. 207 of McKeeman tells us that an empty string is represented by a string descriptor with the value 0x00000000, with no extra memory allocation for the non-existent "data" of the string. This isn't ambiguous, by the way. While 0x00000000 technically appears to be a descriptor for a 1-byte string whose data is located at address 0x000000, in fact address 0x000000 would always have been outside of the block of memory dedicated for storage of EBCDIC string data, rendering a descriptor of 0x00000000 unusable under the normal interpretation.
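Here's a Python sketch of packing and unpacking a string descriptor as just described, including the special all-zero descriptor for the empty string. One assumption is labeled loudly: the text above says only that 8 bits hold the length minus 1 and 24 bits hold the address, and this sketch assumes the length field occupies the most-significant byte. The function names are hypothetical.

```python
ADDRESS_BITS = 24
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1


def pack_descriptor(length: int, address: int) -> int:
    """Build a 32-bit descriptor from a string length and a data address.

    ASSUMPTION: (length - 1) sits in the top 8 bits, address in the low 24.
    """
    if length == 0:
        return 0  # the empty string is the all-zero descriptor
    if not 1 <= length <= 256:
        raise ValueError("string length must be 0 to 256")
    if not 0 <= address <= ADDRESS_MASK:
        raise ValueError("address must fit in 24 bits")
    return ((length - 1) << ADDRESS_BITS) | address


def unpack_descriptor(descriptor: int) -> tuple:
    """Return (length, address); the zero descriptor means an empty string."""
    if descriptor == 0:
        return (0, 0)
    return ((descriptor >> ADDRESS_BITS) + 1, descriptor & ADDRESS_MASK)


d = pack_descriptor(6, 0x012345)
print(hex(d), unpack_descriptor(d))
```

Note how the minus-1 encoding is what lets 8 bits cover lengths 1 through 256, and why the empty string needs its own special case.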
Regarding the BIT(n)
datatypes, you might suppose that since one has gone to all the trouble to define separate datatypes like BIT(1)
, BIT(2)
, ..., BIT(8)
, that there is some distinction between them. That they behave differently in some way. That there's some reason
for making such a distinction. You would be cruelly mistaken, as I
was. There is literally no difference between these 8
datatypes. Nor is there any distinction between the 8 datatypes BIT(9)
through BIT(16)
. Nor is there any distinction between the 16 datatypes BIT(17)
through BIT(32)
. The distinctions between the BIT(n)
datatypes for n>32
are left as a thought experiment for the reader. Thus we're left with seemingly ridiculous code, like the following from PASS1 of HAL/S-FC:
DECLARE COMPILING BIT(1); ...
IF (COMPILING&"80")^=0 THEN
...
Or in other words, first you declare COMPILING
to have a single bit, and then later you check it to see what the value
of its 8th bit is! This particular bit of hilarity caused me
months' worth of trouble.
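The COMPILING example stops looking ridiculous once you model BIT(n) by its storage width rather than by n. Here's a Python sketch under that reading of the rules above: the byte counts come straight from the storage list, while the idea that a short BIT(n) value is truncated to its storage width (not to n bits) is an interpretation of "literally no difference", not something confirmed from the compiler source. Function names are made up.

```python
def bit_storage_bytes(n: int) -> int:
    """Total bytes used by a BIT(n) variable, per the storage rules above."""
    if not 1 <= n <= 2048:
        raise ValueError("n must be from 1 to 2048")
    if n <= 8:
        return 1
    if n <= 16:
        return 2
    if n <= 32:
        return 4
    # Long bit strings: a 4-byte descriptor plus floor((n + 7) / 8) data bytes.
    return 4 + (n + 7) // 8


def store_short_bit(n: int, value: int) -> int:
    """Model assigning a value to a BIT(n) variable with n <= 32.

    ASSUMPTION: the value is kept to the storage width (8, 16, or 32 bits),
    so BIT(1) behaves exactly like BIT(8).
    """
    if n > 32:
        raise ValueError("only BIT(1) through BIT(32) are modeled here")
    width_bits = 8 * bit_storage_bytes(n)
    return value & ((1 << width_bits) - 1)


# COMPILING is declared BIT(1), yet bit "80" (hex) survives in its byte.
compiling = store_short_bit(1, 0x80)
print((compiling & 0x80) != 0)
```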
The storage formats in memory duplicate those that would have been
expected on an IBM System/360 computer, within the limits of my ability
to infer what those formats were. While the storage formats are
not significant in abstract terms, they'll be seen to be quite significant in dealing with certain aspects of HAL/S-FC's source code, such as its so-called "virtual memory" system, and indeed I think it would be impossible to run HAL/S-FC unless these underlying IBM 360 storage formats were used.
A(3) or B(N).
Aside: It's easy to become confused and to imagine (incorrectly!) that you can treat a CHARACTER variable (as opposed to an array of CHARACTER variables) as an array itself, in order to access its individual characters. You cannot! In fact, the XPL language does not provide any syntactical means to access individual characters of a string. For that, you must rely on built-in functions provided by the runtime library. The most-direct method is to use the BYTE function, which can either retrieve the EBCDIC numerical encoding of an individual character in a CHARACTER variable, or else store a new EBCDIC numerical value at a given position in a CHARACTER variable. Thus if we had a CHARACTER variable C which held the value 'HELLO!', then BYTE(C, 3) would return 211 (the EBCDIC encoding for the letter 'L'), while the assignment statement "BYTE(C, 3) = 198;" would change the contents of C to 'HELFO!', since 198 is the EBCDIC code for the letter 'F'. That sounds cumbersome, since very few of us have memorized the EBCDIC table, but it's really not. You generally don't have to look up the EBCDIC encoding for anything, because you would actually have programmed operations such as this as "BYTE(C, 3) = BYTE('F');". Another, less-generally-useful method would be to use the built-in SUBSTR function to retrieve a specific character position as a new CHARACTER object of length 1.
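The BYTE example above can be replayed in Python, which happens to ship an EBCDIC codec. This is only a sketch of the behavior described in the aside: `byte_get` and `byte_set` are hypothetical stand-ins for the two uses of BYTE, and Python's `cp037` code page is an assumption that agrees with the letter codes quoted above but may differ from XCOM-I's own table for some punctuation.

```python
EBCDIC = "cp037"  # ASSUMPTION: one common EBCDIC code page


def byte_get(data: bytearray, index: int = 0) -> int:
    """Like BYTE(C, index) on the right-hand side: read one EBCDIC code."""
    return data[index]


def byte_set(data: bytearray, index: int, code: int) -> None:
    """Like BYTE(C, index) = code: overwrite one character in place."""
    data[index] = code & 0xFF


c = bytearray("HELLO!".encode(EBCDIC))
print(byte_get(c, 3))  # EBCDIC code of the second 'L'

# The idiom from the text: BYTE(C, 3) = BYTE('F');
byte_set(c, 3, byte_get(bytearray("F".encode(EBCDIC))))
print(c.decode(EBCDIC))
```

Positions are counted from 0, which is why index 3 lands on the second 'L' rather than the first.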
BIT(N)
datatype is actually quite problematic. It would be reasonable to assume that since BIT(N)
seems to represent support for collections of bits, then XPL should
provide some syntactical sugar for reading the values of these bits or
modifying them. For BIT(1)
through BIT(32)
, you'd be somewhat correct, in that XPL largely treats these as being interchangeable with integers (i.e., FIXED
), and automatically converts back and forth between them and FIXED
. The runtime library's built-in logical-shift-left and logical-shift-right functions (SHL
and SHR
) work just as easily with them as with FIXED
, as do the logical operators &
, |
, and ¬
(~
), and the relational operators <
, >
, =
, !=
, >=
, and <=
. Swell!
For BIT(32)
through BIT(2048)
, you are cruelly disappointed. XPL provides no built-in methods of working with this data, beyond the ability to initialize BIT(N)
variables with data when they're declared. (More on that
later.) As far as actually using this data for anything, there are
no built-in means to do anything at all with that data. While you
can manipulate the data by cobbling together various
runtime-library memory-access functions, McKeeman gives none of the
technical information about the storage format that would allow you to
do so, in so far as I was able to discover. What do I mean about cobbling together memory-access functions? Well, suppose that you want to access bit 43 of a BIT(86)
variable called B
. First, you must know where the data for B
is located in memory. There's a function that can tell you
that. Then you must know which byte in that block contains bit
43. There's no function that tells you that, but if you do
know it somehow then there's a function that gives you the value of a
byte at that address. Then you must know which bit in the byte
corresponds to bit 43. Again, there's no function for that, but if
you know it, then you can use library functions like SHL
or SHR
, probably in conjunction with a logical operator like &
,
to isolate the value of that bit. If you wanted to change the
value of that bit, different but similar awfulness is involved.
Consider how BIT
data is packed into memory. It's hard to write unambiguously
about these matters, but I'll try to do so using the following
concepts: An n-bit value, when written out in human-readable form has a leftmost bit and a rightmost bit; meanwhile, a block of m bytes in memory has a byte that's at the lowest address, a byte at the highest address, and within each byte has a most-significant bit and a least-significant bit. With those ideas in mind, here's how XPL packs BIT
data into memory:
- BIT(1) through BIT(32): The rightmost bit corresponds to the least-significant bit in the byte at the highest address.
- BIT(32) through BIT(2048): The leftmost bit corresponds to the most-significant bit in the byte at the lowest address.
Aside: Since I say that the bit-packing is undocumented, what's my justification for claiming that what I said just above is true? The short answer is trial-and-error! The longer answer is that one of the intermediate milestones in trying to get XCOM-I to the point of being able to compile the original source code for HAL/S-FC was first to be able to compile McKeeman's original XCOM and run it with a verifiably correct result. But I couldn't get it to work! After messing with it for days on end, I eventually got the answer in a dream, and then experimented with a couple of different bit-packing schemes before finally getting XCOM to run properly. The packing scheme I describe above is the one that worked.
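The two packing rules can be sketched as follows, in Python rather than XPL. The function names are hypothetical, and I'm assuming that "bit i" of a long BIT value is counted from the left starting at 0, which is only one possible numbering convention.

```python
# Hypothetical helpers modeling the two packing rules; the names are mine.

def short_bits_to_int(raw):
    """Short BIT data: the rightmost bit is the LSB of the highest-address
    byte, i.e. the bytes read as one big-endian unsigned integer."""
    return int.from_bytes(raw, "big")

def long_bit_get(raw, i):
    """Long BIT data: bit i, counted from the left starting at 0, where
    bit 0 is the most-significant bit of the lowest-address byte."""
    return (raw[i // 8] >> (7 - i % 8)) & 1

def long_bit_set(raw, i, value):
    """Set or clear bit i (same numbering) in a mutable bytearray."""
    mask = 0x80 >> (i % 8)
    if value:
        raw[i // 8] |= mask
    else:
        raw[i // 8] &= ~mask & 0xFF

b = bytearray(11)          # a BIT(86) needs 11 bytes
long_bit_set(b, 43, 1)     # lands in byte 5, under mask 0x10
```

Under these assumptions, finding "which byte" and "which bit in the byte" reduces to one integer division and one remainder.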
Aside: If all that wasn't bad enough, there's also a trap waiting for you if you're already used to doing bit manipulations with logical operators and shifts in other computer languages. This trap is in the behavior of conditional tests in XPL's IF and DO WHILE statements. Conditional tests in these statements depend only on the least-significant bit; i.e., it is as if any conditional test involves an extra "& 1" operation that you can't see. Thus if you wanted to detect (say) that bit 3 of the BIT(5) variable A was set, a statement like "IF A & 8 THEN ...;" wouldn't help you at all, since the implicit "& 1" in the conditional would cause the test always to fail! You would instead need to use a shift-right operation, such as "IF SHR(A, 3) THEN ...;".
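To make the trap concrete, here is a minimal Python model of the conditional test. The names `xpl_condition` and `shr` are hypothetical stand-ins for the implicit "& 1" and for XPL's SHR, assuming 32-bit values.

```python
def xpl_condition(value):
    """Model of how IF / DO WHILE judge a value: only the LSB counts."""
    return (value & 1) == 1

def shr(value, count):
    """Model of SHR on a 32-bit quantity (logical shift right)."""
    return (value & 0xFFFFFFFF) >> count

a = 0b01000                           # a BIT(5) value with only bit 3 set
naive = xpl_condition(a & 8)          # models "IF A & 8 THEN ...;"
shifted = xpl_condition(shr(a, 3))    # models "IF SHR(A, 3) THEN ...;"
```

Here `naive` comes out false even though bit 3 is set, while `shifted` comes out true.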
But enough of these measly BIT-based frustrations!
Consider some sample declarations:
DECLARE F FIXED, C CHARACTER, B BIT(5);
DECLARE FS(10) FIXED, CS(10) CHARACTER, BS(10) BIT(5);
These are pretty self-explanatory in most ways, so I won't dissect them for you in detail. What's perhaps most confusing is that the three arrays declared here (FS, CS, and BS) each have eleven elements in them, because the number 10 in their declarations is not the number of elements, but rather the highest legal index. As mentioned above, indices start at 0, so the total number of elements in each is 11.
Aside: Standard XPL, à la McKeeman, doesn't allow expressions when expressing array sizes, whereas XPL/I does allow them. For example, the following is fine in XPL/I but is a no-no in XPL:
DECLARE BUFFER(3600-1) BIT(8);
This example will be continued in the next section, where it will make a little more sense, but this is actually the kind of thing you might want to do.
Declarations may also carry the attribute INITIAL. This attribute allows you to supply an initial value for the variable, such as:
DECLARE F FIXED INITIAL(22), F2 FIXED INITIAL("22"), F3 FIXED INITIAL("(8) 22");
DECLARE C CHARACTER INITIAL('Hello!');
DECLARE B BIT(5) INITIAL("(1) 10100");
DECLARE FS(10) FIXED INITIAL(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11);
It's important to understand that the initializer has its effect only at compile-time, and is not applied at run-time. That means that if you declare variables within PROCEDUREs, they're not reinitialized each time the PROCEDURE is executed.
You may be puzzled by the initializers of F2, F3, and B, since they naively appear to be strings instead of numbers; but the naive interpretation is wrong. Which brings up the nature of literals in XPL:
CHARACTER
literals —
i.e., text strings — are enclosed in single-quote (') characters. If a
single-quote itself must appear within the string, you use two
single-quotes in succession. For example: 'I am the ''king'' of the world!'
.
Literals for FIXED
or BIT(N)
have one of several forms:
-10
is not a literal for the number -10, but is instead the minus operator
followed by the literal for the number 10. In most cases this is a
distinction without significance, because XCOM-I (or the original XCOM)
automatically tries to perform all computations that are possible at
compile time. Nevertheless, this distinction does cause some
arithmetically-satisfactory expressions to be syntactically illegal in
XPL. For example, the expression 5 + -5
isn't legal in XPL.
One more attribute is LITERALLY. It's not strictly related to declaration of variables, even though it appears in DECLARE statements, and so it's discussed in the next section instead.
XPL/I additionally provides ARRAY statements. These resemble DECLARE statements, except that the keyword ARRAY is used in place of the keyword DECLARE, and that they can only be used for arrays of FIXED or BIT, and not for CHARACTER.
Aside: For the original XPL/I compiler, I believe there was a distinction in the way ARRAY variables were stored in memory vs DECLARE variables. At the present time, I don't see this distinction as being operationally significant, so XCOM-I treats the two keywords identically. This is subject to change, if I discover my thinking was in error. As, unfortunately, I often do.
More significantly, XPL/I adds an additional kind of datatype that it calls a BASED
variable. These are basically pointers to arrays of FIXED
, BIT
, or CHARACTER
. Note that I said "pointers to arrays" rather than "arrays of pointers". By changing the address stored in the BASED
variable's pointer, you can instantly interpret an entirely different
chunk of memory as the array. Moreover, besides the basic types
just mentioned, the BASED
variable can point to an array of "records", where each "record" is a collection of the basic
datatypes. I.e., a record can hold any combination of FIXED
, BIT
, or CHARACTER
fields, or arrays thereof. Using the BASED
mechanism, XPL/I can thus mimic both pointers and primitive types of
structures, neither of which is available in XPL proper. I say
that the structures are "primitive", because BASED
variables cannot themselves be fields of BASED
variables, hence only structures that are a single level deep are available.
Here are some sample declarations of BASED variables:
BASED FB FIXED;
BASED RB RECORD:
   F FIXED,
   C CHARACTER,
   A(10) BIT(5),
END;
Although BASED
variables are always (or almost always) arrays, you'll note that the declarations of FB
and RB
don't indicate any dimensional information. That's because no
memory for them, other than for the pointer, is allocated at
compile-time. Space is instead explicitly allocated at runtime by
user code. Thus XCOM-I has knowledge of the size of each array entry, but not of the number of elements in the array.
When I said that a BASED is a "pointer to an array", I was glossing over the fact that to be useful a BASED must track a lot more information about the BASED than just its data's location in memory. In fact, a BASED is stored as a 28-byte structure plus the separately-positioned data for the array. The 28-byte structure is referred to as a "dope vector". In other words, if you had a BASED called (say) MYBASED and you executed the built-in function ADDR(MYBASED), it would return the address of the dope vector for MYBASED. As usual, the HAL/S-FC documentation and source code do not actually provide any useful facts about this setup, but various factoids can be inferred from HAL/S-FC source code, to a greater or lesser degree of confidence, and here are my feeble inferences about the fields of dope vectors. The fields with green highlighting are those of conceivable interest to an application programmer, though all of them are managed entirely transparently in most cases.
- Offset 0: FIXED pointer giving the address of the actual data.
- Offset 4: BIT(16) giving the size in bytes of each record.
- Offset 6: BIT(16) giving the number of CHARACTER or BIT(≥32) fields in each BASED RECORD, or 1 if it's a BASED CHARACTER or BASED BIT(≥32), or 0 otherwise. The point is that it's the number of "string descriptors" associated with each element of the BASED array. This information is used by COMPACTIFY. (See below.) While I don't fully understand the calculations being performed, I'd venture the opinion that for COMPACTIFY to use this information efficiently, it's necessary for the XPL/I compiler to rearrange the fields of a BASED RECORD from their declared order in such a way that all of the string-descriptor fields come first in the record.
- Offset 8: FIXED giving the total number of array entries for which space has been allocated.
- Offset 12: FIXED giving the total number of array entries actually used so far.
- Offset 16: FIXED. The dope vectors for all BASED variables for which memory has been allocated form a linked list. The global variable FIRSTRECORD gives the address of the first dope vector on the list, and this field at offset 16 in the dope vector points to the next dope vector, or 0 if there is no next one (or if space hasn't been allocated). The ordering is TBD, since I haven't traced through the code in all of its gory detail, but I believe they are ordered according to decreasing address fields (i.e., the field at offset 0), which at least initially is also the order in which the RECORD_CONSTANT or ALLOCATE_SPACE macros (see below) were executed to allocate memory for them.
- Offset 20: FIXED. It appears to me that this field supplies some properties of the BASED in the form of bit fields. It is laid out as follows:
  - Bit 24 indicates the BASED is "constant", which appears to mean that you cannot incrementally grow it. (The macro NEXT_ELEMENT(based) is used to increase the size of the specified based by 1 record, an operation which fails if the based is "constant".)
  - Bit 25 indicates the BASED is "unmoveable". If a based is "unmoveable", it means that an operation like NEXT_ELEMENT(based) (see above) will succeed only if there is enough free space immediately following the allocated memory that can be "stolen". Whereas if it's not unmoveable, then the based may migrate in its entirety to a newly-allocated block elsewhere and the space it originally occupied may thus be freed.
- Offset 24: BIT(16) of purpose TBD. It is referred to as "global factor".
- Offset 26: BIT(16) referred to as "group factor". As far as I can see, all uses of this are commented out in HAL/S-FC, so perhaps it ended up being unused.
The XCOM-I implementation mimics this dope-vector structure, though only the fields I've highlighted in green are significant in XCOM-I ... which is fortunate, since they're the only ones I imagine I understand somewhat.
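For readers who like to see a layout as code, the offsets inferred above can be sketched with Python's `struct` module. The field names are my own labels, the big-endian byte order is an assumption based on the IBM 360 heritage, and the bit fields at offset 20 are left undecoded because their bit numbering isn't pinned down here.

```python
import struct

# Assumed layout of the 28-byte dope vector, big-endian as on the IBM 360:
# offsets 0, 4, 6, 8, 12, 16, 20, 24, 26.
DOPE_FORMAT = ">IHHIIIIHH"
DOPE_FIELDS = ("address", "width", "descriptors", "allocated", "used",
               "next", "flags", "global_factor", "group_factor")

def parse_dope_vector(raw):
    """Split 28 raw bytes into named fields (the field names are my own)."""
    return dict(zip(DOPE_FIELDS, struct.unpack(DOPE_FORMAT, raw)))

# A made-up dope vector: 31 records of 4 bytes allocated, 25 used so far.
sample = struct.pack(DOPE_FORMAT, 0x012340, 4, 0, 31, 25, 0, 0, 0, 0)
dope = parse_dope_vector(sample)
```

The format string adds up to exactly 28 bytes, which is at least consistent with the inferred offsets.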
With that discussion in mind, in understanding some of the things that need to happen with BASED variables in actual XPL/I source code (and in particular, in HAL/S-FC), let's consider various space-management macros and/or procedures used:
- RECORD_ALLOC(based), used in expressions, returns the number of records allocated in BASED variable based.
- RECORD_USED(based), normally used on the left-hand side in assignments, sets the number of records used so far in based. Its most-common usage is RECORD_USED(based)=RECORD_ALLOC(based), but it can also be used with something other than that on the right-hand side to truncate the array or to skip past the lowest indexes. And it can be used in expressions or conditionals, though that happens relatively seldom.
- RECORD_TOP(based), as you might expect, simply returns RECORD_USED(based)-1.
- ALLOCATE_SPACE(based, top) allocates enough space for based to ensure that it contains at least top+1 records in total. It will fail if we already have RECORD_ALLOC(based)>0.
- NEXT_ELEMENT(based) increments RECORD_USED(based) by 1, stealing the space from adjacent free memory or else reallocating and moving the entire array if necessary to do so. This can only be used if space for based had been allocated by ALLOCATE_SPACE, and will instead cause an abend if based had been allocated by RECORD_CONSTANT (see below).
- RECORD_FREE(based) frees the data for based, returning the allocated space to the free pool.
- RECORD_SEAL(based), RECORD_UNSEAL(based): Enables or disables the "constant" attribute of the based.
- RECORD_CONSTANT(based, top, moveable): Like ALLOCATE_SPACE(based, top), but additionally enables the "constant" property, and optionally enables the "unmoveable" property.
- RECORD_WIDTH(based) returns the record width of the based.
- RECORD_LINK() prepares the data for transferring COMMON memory to the next program loaded.
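The bookkeeping performed by a few of these macros can be modeled with a toy Python class. This is only a sketch of the behavior as described above, with hypothetical names throughout; it is not SPACELIB, and it ignores the "unmoveable" property because a Python list hides any moving of storage.

```python
class Based:
    """Toy stand-in for a BASED variable; not the real SPACELIB."""
    def __init__(self):
        self.records = []      # allocated storage
        self.used = 0          # what RECORD_USED reads and sets
        self.constant = False  # the "constant" property

def record_alloc(b):
    return len(b.records)

def record_top(b):
    return b.used - 1

def allocate_space(b, top):
    if record_alloc(b) > 0:
        raise RuntimeError("already allocated")
    b.records = [0] * (top + 1)

def record_constant(b, top):
    allocate_space(b, top)
    b.constant = True

def next_element(b):
    if b.constant:
        raise RuntimeError("abend: BASED is constant")
    if b.used == record_alloc(b):
        b.records.append(0)    # grow by one record
    b.used += 1

def record_free(b):
    b.records, b.used = [], 0

mybased = Based()
allocate_space(mybased, 30)    # top index 30, so 31 records
mybased.used = 25              # RECORD_USED(MYBASED) = 25;
next_element(mybased)          # one more record in use
```

In this model, calling `next_element` on something set up by `record_constant` raises an error, mirroring the abend mentioned in the list.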
User code that initially allocates free memory (let's say 25 records to start with) for a BASED variable is typically a two-step process that looks something like this:
BASED MYVARIABLE FIXED;
...
RECORD_CONSTANT(MYVARIABLE, 25, MOVEABLE); /* OR UNMOVEABLE */
RECORD_USED(MYVARIABLE) = RECORD_ALLOC(MYVARIABLE);
Or, if you knew that you were going to need more elements later, you might allocate a bit extra, for example:
BASED MYBASED FIXED;
...
ALLOCATE_SPACE(MYBASED, 30);
RECORD_USED(MYBASED) = 25;
Having allocated the space for it, you can now use MYBASED just like any other array of FIXED, such as in assignments like "MYBASED(27) = 6;" or "X = MYBASED(N) + 12;". To add one more element later on:
NEXT_ELEMENT(MYBASED);
This will increment RECORD_USED(MYBASED) by 1 (if possible while still keeping it below RECORD_ALLOC(MYBASED)) or else will reallocate and possibly move MYBASED
into a larger space, if possible.
Here is a corresponding example for a BASED RECORD variable:
BASED MYNEWBASED RECORD:
   F FIXED,
   C CHARACTER,
   A(9) FIXED,
END;
...
RECORD_CONSTANT(MYNEWBASED, 30, MOVEABLE);
RECORD_USED(MYNEWBASED) = 25;
Accessing MYNEWBASED requires the dotted style often used these days for accessing fields of structures or classes. Some examples include:
MYNEWBASED(6).F = 12;
MYNEWBASED(10).C = 'XPL is where it is at!';
MYNEWBASED(20).A(6) = 15;
X = MYNEWBASED(6).F;
and so on.
There is one (count 'em, one!) exception I've found to the documented behavior I've described above. This undocumented use is seen in the IND_STACK variable found in PASS2 of HAL/S-FC. IND_STACK is declared using the following bastardized mash-up of a DECLARE statement and a BASED statement:
DECLARE IND_STACK(STACK_SIZE) RECORD:
   I_CONST FIXED,
   I_INX_CON FIXED,
   I_STRUCT_CON FIXED,
   ...
   I_LIVREMT BIT(8),
   I_NAMEVAR BIT(8),
   I_STRUCT_WALK BIT(8),
   I_AIADONE BIT(8),
END;
Unlike a BASED declaration, in which specifying an array size at compile time is illegal, or as IR-182-1 states, "ignored if present", the array size is indeed found in this declaration. Nor is there any runtime adjustment to the size, as would be expected with a BASED. Nonetheless, IND_STACK is subsequently accessed by the dotted, structure-type notation used only by BASED variables.
Rather than implement an entirely new but undocumented class of structure objects to handle this single instance, XCOM-I implements this case as if it were a BASED declaration followed by an appropriate RECORD_CONSTANT operation. IND_STACK is one of the very few objects stored in region 7 of the XPL memory model, and by far the largest of them.
Another feature of DECLARE statements is the LITERALLY attribute. Here's an example:
DECLARE ARRAYTOP LITERALLY '255';
DECLARE MYARRAY(ARRAYTOP) FIXED;
Notice that ARRAYTOP has no datatype assigned to it. That's because its declaration is not actually the declaration of a variable called "ARRAYTOP", but rather of a macro of that name. Wherever the identifier ARRAYTOP is encountered subsequently, it's simply replaced literally by the string 255, now unquoted:
DECLARE MYARRAY(255) FIXED;
This clarifies an example of a commonly-desirable declaration I gave in the preceding section, which in this section would be expressed as:
DECLARE RECSIZE LITERALLY '3600';
DECLARE BUFFER(RECSIZE-1) BIT(8);
As noted in the preceding section, standard XPL's grammar wouldn't allow an expression (like RECSIZE-1) in that context, so this particular convenience in making declarations is only available in XPL/I.
A macro's replacement text can itself contain other macros:
DECLARE DEVICE LITERALLY '6', OUT LITERALLY 'OUTPUT(DEVICE)';
OUT = 'My message';
This expands to
OUTPUT(6) = 'My message';
Macros can expand to portions of statements, as the ones above have, or to multiple statements, such as
DECLARE MYBLOCK LITERALLY 'DO; X=1; Y=X+3; END';
...
IF X=7 THEN
   MYBLOCK;
which expands to:
IF X=7 THEN
   DO; X=1; Y=X+3; END;
Macros can also have arguments. Consider the following:
DECLARE MYMAC(2) LITERALLY '%1% = %2%';
This declaration means that MYMAC has 2 arguments, and that when the macro is expanded, the first argument will replace %1% and the second argument will replace %2%. Thus "MYMAC(X, 3 * Y)" expands to "X = 3 * Y".
Warning: As with macros in any other computer language, this can quickly get out of hand. XCOM-I, for example, won't detect recursive, endlessly-expanding macros. There's also no guarantee when multiple macros are in play that XCOM-I will necessarily expand macros in the same order that XCOM would have. Neither McKeeman nor Intermetrics documentation makes any mention of what that ordering should be.
Warning: The scope of macro definitions is also different in XPL vs XPL/I. In XPL, macro definitions don't respect any nested scopes they appear in; i.e., any macro definition will simply remain in effect until the end of the source code. In XPL/I, macro definitions remain in effect only until the end of the procedure in which they're defined, including embedded procedures. In neither case does a macro definition have any effect on source code prior to it.
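The expansion rules for LITERALLY macros, including %1%-style arguments and macros that mention other macros, can be sketched in a few lines of Python. This is a simplified model of my own and not XCOM-I's algorithm: the `expand` function and its `limit` safeguard are hypothetical, argument splitting is naive (no nested parentheses), and it would wrongly expand macro names inside quoted strings.

```python
import re

def expand(source, macros, limit=100):
    """Repeatedly replace macro names by their replacement text.
    macros maps name -> (number_of_arguments, body).
    'limit' is my own guard against endlessly recursive macros."""
    for _ in range(limit):
        changed = False
        for name, (nargs, body) in macros.items():
            if nargs == 0:
                pattern = rf"\b{re.escape(name)}\b"
                new = re.sub(pattern, lambda m, body=body: body, source)
            else:
                pattern = rf"\b{re.escape(name)}\(([^()]*)\)"
                def repl(m, body=body, nargs=nargs):
                    args = [a.strip() for a in m.group(1).split(",")]
                    text = body
                    for i, arg in enumerate(args[:nargs], start=1):
                        text = text.replace(f"%{i}%", arg)
                    return text
                new = re.sub(pattern, repl, source)
            if new != source:
                source, changed = new, True
        if not changed:
            return source
    raise RuntimeError("macro expansion did not terminate")

macros = {
    "DEVICE": (0, "6"),
    "OUT": (0, "OUTPUT(DEVICE)"),
    "MYMAC": (2, "%1% = %2%"),
}
message = expand("OUT = 'My message';", macros)
assignment = expand("MYMAC(X, 3 * Y);", macros)
```

Unlike what the warning above says of XCOM-I, this toy version at least gives up after a fixed number of passes when two macros expand into each other.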
XPL allows multiple variables to be assigned the same value in a single assignment statement, with a syntax like:
X1, X2, ..., XN = Y;
According to McKeeman (p. 137), these assignments are performed in right-to-left order. In other words, it should be equivalent to:
XN = Y;
.
.
.
X1 = Y;
In most cases, the ordering of these assignments is of no significance, and in fact I've found no legacy standard XPL programs in which the ordering matters. However, there are instances in XPL/I code, specifically in PASS1 of HAL/S-FC in which the order matters a lot. That's when you have assignments of the form:
Y(I), I = J;
because obviously
I = J;
Y(I) = J;
is a lot different than
Y(I) = J;
I = J;
Unfortunately, in the cases I've encountered, the XPL/I code
seems to rely on the latter interpretation. I.e., it seems to
believe that the assignments are performed in left-to-right order, in
direct contradiction to McKeeman. What's going on here?
Of course, we have no way of knowing what Intermetrics XCOM did with this, but as far as A Compiler Generator's
XCOM is concerned, it appears (thanks to Dan Weaver for this explanation) that while
the assignments are indeed performed in a right-to-left manner as
documented, the peculiarities of the IBM 360 object-code generation in
McKeeman XCOM shove the index I
into a CPU register and reuse it without change throughout both
assignments. So seemingly by accident, the net result is that the
assignments appear to have been done in a left-to-right order.
I can only regard this as a bug in McKeeman XCOM. Plus, it's not
reliable: what if there are more than 2 assignments on the
left-hand side, and if they don't all use the same index I
? What would the order appear to be then?
As a result of all that:
Important Note: XCOM-I ignores the statement in A Compiler Generator that assignments are performed in a right-to-left manner. Rather, they are performed in a left-to-right manner, and any indices of arrays are computed at the moment the assignment is performed.
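The difference between the two orderings for "Y(I), I = J;" can be checked with a tiny Python model. The helper names and the dictionary-based "environment" are purely illustrative assumptions; the point is only that the index I is read at the moment each individual assignment is performed.

```python
def assign_left_to_right(env, targets, value):
    """targets: list of (name, index_name or None), in source order."""
    for name, index_name in targets:
        if index_name is None:
            env[name] = value
        else:
            env[name][env[index_name]] = value   # index read at this moment

def assign_right_to_left(env, targets, value):
    assign_left_to_right(env, list(reversed(targets)), value)

# Y(I), I = J;  with I = 0 and J = 2 beforehand
targets = [("Y", "I"), ("I", None)]

ltr = {"Y": [0, 0, 0], "I": 0}
assign_left_to_right(ltr, targets, 2)    # Y(0) = 2, then I = 2

rtl = {"Y": [0, 0, 0], "I": 0}
assign_right_to_left(rtl, targets, 2)    # I = 2, then Y(2) = 2
```

Both orderings leave I equal to 2, but they store into different elements of Y, which is why the choice matters for code like that in PASS1.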
XPL's logical operators are &
, |, and ~
(¬
, ^
), for "and", "or", and "not", respectively. The documentation in A Compiler Generator
is maddeningly unclear as to what these operators do. True, table
6.8.1 calls them "logical and", "logical or", and "logical complement",
but the word "logical" isn't defined ... just as my sloppy usage of the
word "logical" at the beginning of this paragraph makes no distinction.
Which leaves open a few loopholes that have to be closed up. The issues
which we must understand are:
- Do the operators work bitwise, or only on "bipolar" (true/false) values?
- If we have expression1&expression2, and expression1 evaluates to 0, is expression2 even evaluated?
In case you're not in the mood for a technical discussion of the
matter, I'll give you the short answer up front, and having read that,
you can proceed to the long discussion if you want:
As for how I came up with this "information", there are several
places we can look for guidance in guessing the answers. For one thing,
according to McKeeman's account, the XPL language was derived from the
PL/I language, so we can look at PL/I documentation and hope that it
applies to XPL. Of course, we can look at the source code for
McKeeman's XPL compiler (XCOM) as listed in the book, and see if there are any hints there. Or we can even examine the IBM 360 object code that XCOM
generates for these operators. (All the while wondering how things
came to this, that we have to resort to lame measures like consulting
object code to figure out the basic features of the language?)
As far as PL/I is concerned, IBM's PL/I Language Reference (2017) tells us on p. 66 that for the &
, |
, and ¬
operators, "bit operations are performed on a bit-by-bit basis". As far as object code produced by XCOM is concerned, McKeeman (p. 150) shows an example in which object code for the expression "SHL(K,1) & SHR(I,J)
" is produced, and we do find that it simply uses the IBM 360 NR
("And Logical") instruction:
Although I had to consult more than one IBM assembly-language manual
to find the answer to the seemingly-simple question of what NR
does, IBM's z/Architecture Principles of Operation (p. A-8) does tell us that the NR
(and its cousins N
, NC
, and NI
) are indeed bitwise operations.
Aside: Figure B-2, "Instructions Arranged by Mnemonic" of the latter document is very helpful in trying to decipher such listings of IBM 360 object code.
Short-circuiting is a natural consideration for strictly bipolar
operands and operators, but is a bit trickier to consider once we've
concluded that the logical operators operate bitwise rather than in a
bipolar fashion. Certainly the object-code example from A Compiler Generator that was mentioned in the preceding paragraph shows no signs at all of short circuiting: Both of the operands of the &
operator in that example are evaluated, with no attempt at checking the
value of the first operation before proceeding to the second one. On
the other hand, that example of object-code generation by XCOM happens to be for an assignment statement rather than for the conditional expression of an IF
, DO WHILE
, or DO UNTIL
.
Perhaps the evaluation of a conditional expression might be very
different in those contexts. One reason to believe that it might be
different is that the final result of a conditional expression is masked
to just the least-significant bit, and thus (eventually) is indeed a
bipolar value; i.e., even if all of the bits were involved in the
computation, all but one of them is discarded in the end, so perhaps the
extra bits are discarded at the beginning rather than at the end of the
computation, even though it's more work to do so. Moreover, the PL/I
Language Reference document mentioned earlier does cover short-circuit
evaluation (see p. 245), and it says that short-circuiting occurs only in the context of the conditional of an IF
statement (versus assignment statements). Plus, even then the
short-circuiting occurs only in certain special circumstances, such as
the leading operand being a BIT(1)
literal or constant variable, which leads one to believe that the value
of the leading operand has to be determined to be 0 or 1 at
compile-time rather than at run-time for the short-circuiting to occur.
Unfortunately, the example of object-code generation in McKeeman doesn't show us how an IF
statement would compile. But as we saw earlier, we have been able to use XCOM-I to create a working copy of McKeeman's XCOM, so we can make our own example of IF
, compile it with XCOM and see! Imagine we have the following ridiculous little XPL program:
DECLARE I FIXED;
DO I = 1 TO 10;
IF (I * I) & (100 - I * I) THEN OUTPUT = 'hello';
END;
DO I = 1 TO 10;
IF 0 & (100 - I * I) THEN OUTPUT = 'hello';
END;
EOF
Aside: By the way, compiling an XPL program with McKeeman's XCOM is a bit more constrained than compiling a similar program with XCOM-I. For one thing, to avoid an irritating if harmless warning message, the EOF token must be present at the end of the source code, whereas XCOM-I doesn't care. For another, even though XPL is case-insensitive other than inside of quoted strings, and even though all of the XPL source code in A Compiler Generator is printed in lower case, XCOM will in fact choke on any XPL source code that isn't fully upper case. Go figure!
Compiling this silly program with XCOM, and pulling just the relevant portion of XCOM's report, gives us the following IBM 360 object code for the conditional expressions of the IF statements:
20 | IF (I * I) & (100 - I * I) THEN OUTPUT = 'hello'; | 1314 C7 = 10.
1314: CODE = L 1,1340(0,11)
1318: CODE = M 0,1340(0,11)
1322: CODE = L 2,1340(0,11)
1326: CODE = L 3,1340(0,11)
1330: CODE = MR 2,2
1332: CODE = LA 2,100(0,0)
1336: CODE = SR 2,3
1338: CODE = NR 1,2
1340: CODE = N 1,164(0,11)
...
24 | IF 0 & (100 - I * I) THEN OUTPUT = 'hello'; | 1400
1400: CODE = L 1,1340(0,11)
1404: CODE = M 0,1340(0,11)
1408: CODE = LA 2,100(0,0)
1412: CODE = SR 2,1
1414: CODE = N 2,1300(0,1048571)
1418: CODE = N 2,164(0,11)
...
Not shown above is that the symbol table tells us variable I
is stored at address 1340(11), which is why all of the 1340(0,11)'s
appear above. I don't understand IBM 360 assembly language, but what I think the code probably does is:
- 1st IF:
  - Computes an operand of the & operator; i.e., I*I.
  - Computes an operand of the & operator; i.e., 100-I*I.
- 2nd IF:
  - Computes an operand of the & operator; i.e., I*I.
But whether or not my interpretation is 100% correct, at least in
this example there's no evidence of short-circuiting. The 2nd IF
in particular is pretty shocking. Perhaps there's supposed to be some
subsequent optimization I'm not aware of that would have cleaned it up.
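To see what the bitwise, non-short-circuiting interpretation predicts for the ridiculous little program above, here is a Python model. It assumes that a conditional looks only at the least-significant bit of a bitwise result; `xpl_if` and `operand` are hypothetical helpers of mine, not anything from XCOM or XCOM-I.

```python
def xpl_if(value):
    """Model of the conditional: bitwise result masked to its LSB."""
    return (value & 1) == 1

# Values of I for which each loop would print 'hello' under this model.
first_loop = [i for i in range(1, 11) if xpl_if((i * i) & (100 - i * i))]
second_loop = [i for i in range(1, 11) if xpl_if(0 & (100 - i * i))]

# Python's & (like the NR instruction) evaluates both operands, even when
# the left one is 0; recording the evaluations makes that visible.
evaluated = []

def operand(label, value):
    evaluated.append(label)
    return value

result = operand("left", 0) & operand("right", 99)
```

Under these assumptions the first loop says hello only for odd values of I, and the second loop never does, although its right-hand operand is still computed every time.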
XPL/I provides COMMON memory, not present in XPL. The notion behind COMMON memory is that a very large application program like HAL/S-FC won't be loaded entirely in memory at once, but will instead be run as a sequence of "passes". COMMON memory is not supported in standard XPL, and compiling a program with XCOM-I's --xpl switch will disable the COMMON memory features described in this section.
The data shared between passes is what lives in COMMON memory. It's formalized when you explicitly declare variables as being in COMMON. Variables declared to be in COMMON are not initialized by an XPL/I program, except for the very first program in a chain of programs, but are simply assumed to already contain the data needed. On the other hand, variables not declared COMMON are up for grabs, and no assumption can be made about their initial contents other than whatever initialization their declarations explicitly provide.
COMMON data is declared in XPL/I by three methods:
- Using the keyword COMMON in place of the keyword DECLARE.
- Using the keyword COMMON ARRAY in place of the keyword ARRAY.
- Using the keyword COMMON BASED in place of the keyword BASED.
Another distinction is that CHARACTER variables cannot be declared in COMMON, though CHARACTER variables can appear as fields in COMMON BASED RECORD variables. XCOM-I actually relaxes this restriction.
It's not documented anywhere, as far as I know, but I would assume that there was originally an expectation that each cooperating application running in succession needed to declare COMMON
in exactly the same way, using exactly the same ordering of variables and the same datatypes. XCOM-I relaxes this restriction to a certain extent, but the data of BASED variables always resides at the same memory addresses in adjacent programs of a chain. When "dope vectors" of BASED variables were discussed earlier, it was mentioned that BASED
variables could be set as "constant" or "unmoveable", and this
necessity for remaining at the same location when successor programs are
executed is part of the reason for this feature.
Aside: If BASED variables were always manipulated by XPL/I programs as intended by the original compiler design (i.e., allowing SPACELIB to manipulate them via their dope vectors) there would be no reason for these restrictions on moveability. However, the "virtual memory" system employed by HAL/S-FC, as implemented by the files HALINCL/VMEMx.xpl, bypasses the dope-vector system. Specifically, "pages" of memory managed by the virtual-memory system are tracked only by an array of addresses (rather than dope vectors). But SPACELIB manages dope vectors, and has no cognizance of arrays of addresses whose interpretation exists only in the mind of a programmer. It is therefore necessary to make sure that the virtual-memory system's pages of memory never move. One might ask the programmer why they didn't use arrays of dope vectors rather than arrays of addresses. But that might be interpreted as being rude. On the other hand, I've known enough engineers to be pretty sure they'd think that since it worked for them, that was good enough.
Aside: Speaking of the relationship between BASED variables and COMMON memory, it's important to note the following points:
- For BASED variables appearing in COMMON, space is allocated for them via RECORD_CONSTANT or ALLOCATE_SPACE only in the first of the chained programs that needs to use them. Subsequent programs in the chain use them as-is, without allocation.
- Memory for BASED variables not appearing in COMMON must be explicitly freed via RECORD_FREE prior to exiting whatever program of the chain allocated their memory. If this is not done, then the memory-management system (Intermetrics SPACELIB) will abort the program with an error and fail to prepare the COMMON data for use by the next program in the chain.
Of course, XCOM-I makes no effort at all to pass COMMON
data from one application to another using actual memory. Rather, each XPL/I application program compiled by XCOM-I can optionally (depending on its command-line options) load a file containing data into its COMMON
areas of memory; and similarly, it automatically writes out its COMMON
areas of memory into a file upon termination. By using the --commoni
and --commono
command-line switches of the application, a close degree of control can be exercised over which previously-saved COMMON
blocks, if any, are passed to which application programs. The command-line options are needed because by default, an executable produced by XCOM-I does not read in a COMMON
file at startup, and outputs a file literally called "COMMON.out" upon
termination. The command-line options override either or both of
those defaults.
COMMON
files are in a human-readable format. See the comments for the writeEntryCommon
function in the runtime-library file runtimeC.c if you have any
interest in the file format. It's actually pretty useful for
debugging certain kinds of problems.
The theoretical memory space available at runtime for a compiled XPL program is 2^24 = 16,777,216
bytes in size, although the Wikipedia article on IBM System/360 tells
us that the actual physical maximum was only 8 MB. In the C object
code generated by XCOM-I, this is represented by the byte array called
For the original XPL and XPL/I compilers, the lowest portion and highest
portion of this (theoretical) 16 MB space was dedicated to the
executable code for the program and the data used by it. The
middle of the area was used for the program's data. But in the XCOM-I
framework, all of the program code is stored elsewhere, thus the entire
16MB space can be dedicated just for the variables actually The memory nevertheless still needs to be partitioned into blocks
dedicated to specific types of data, in order to facilitate management
of dynamic data like Regions 4 and 6 naturally did not exist in XPL, since Mostly, the boundaries of these memory regions are established at compile-time, by which I mean they're established by XCOM-I
and don't change thereafter. The exceptions are the boundaries between
regions 5, 6, and 7. As you may recall, at program startup,
Aside: It would seem reasonable, at least to me, that if
Region 7 is a bit trickier to explain. It is an area dedicated to data which XCOM-I wants to use for XPL objects it creates at compile time that need to be accessible to the XPL program as normal XPL variables, but that need to be exempt from the normal memory-management procedures applied to regions 5 and 6. These are items which the original XPL compilers would have stored in regions 1 through 6, but which the different implementation of XCOM-I would have made difficult to store there. XCOM-I allocates this area and the objects in it at compile-time. Though shown as beginning at 0xFFE000, that's misleading. The 0xFFE000 limit is the default, and is appropriate for compiling HAL/S-FC, but in fact is controlled by XCOM-I's --reserved=N (default 8192) command-line option. There will be compile-time messages if the size of the reserved memory needs to be increased.
It's important to note that all XPL programs sharing the same COMMON memory need to have the same size for region 7. Specifically, all passes of HAL/S-FC need to have the same size of reserved memory, and that's why this particular default value for it was chosen.
Aside: In fact, the use of reserved memory has proven to be extremely rare. The only instances of use, to date, have been in PASS2 and PASS4 of HAL/S-FC, each of which stores a single object declared in a previously-unencountered, idiosyncratic way.
At runtime, changes to sizes of BASED
arrays may cause those arrays to move around within memory region 6
(see the preceding section). Depending on the type of changes, this can
cause "holes" of unused memory to develop in memory region 6.
Similarly, operations on CHARACTER
variables such as INPUT
or string concatenation (||
) can cause holes of unused memory to develop in memory region 5. As long as FREEPOINT
is comfortably less than FREELIMIT
,
these holes don't cause any problem and can just be ignored. However,
it may eventually become necessary to repack these memory regions to
consolidate the free space and eliminate the holes.
As far as memory region 6 is concerned, that's handled transparently by the tools already discussed in the section on BASED
variables earlier, and won't be discussed here. The COMPACTIFY
procedure which handles this for memory region 5, unfortunately, does
require some clarification, even though you typically don't need to
explicitly call COMPACTIFY
yourself, and can assume it will just be called automatically whenever needed.
Most importantly, while COMPACTIFY
doesn't handle memory management of BASED
variables, the existence of based variables does affect how COMPACTIFY
operates, with the result that different versions of COMPACTIFY
must be used for XPL code than for XPL/I code. But there's a quirk in XPL's implementation: While COMPACTIFY
is considered a "built-in" function of the XPL runtime library, in point of fact it's not present in the XPL (or XPL/I) runtime library, and must instead be explicitly provided in the form of XPL source code.
Fortunately, that doesn't entail any effort on your part, since XCOM-I
can usually figure out what to do on its own. But still, it may be
helpful to understand what's going on behind the scenes, for those
cases in which you need to intervene. The XCOM-I approach to the inclusion of COMPACTIFY
is that prior to loading any of the XPL or XPL/I source-code files you
explicitly specify, it automatically preloads a "library file", which is
an XPL or XPL/I source-code file containing at least the source code for COMPACTIFY
. The library file it chooses is governed by the following rules:
- If --xpl is used, the default library changes to XPL.LIBRARY.xpl. This is a duplicate of the library of the same name provided with the source code of McKeeman's XCOM program. It should be good for compiling all standard XPL source code.
- If --lib-file=FILENAME is used, then FILENAME is used in place of the default library file. This gives you the option of using some other version of COMPACTIFY, perhaps experimenting with it yourself. If used along with an --xpl switch, then --lib-file must follow --xpl on the command line.
Consider first how the COMPACTIFY of XPL.LIBRARY.xpl works. All of the "string descriptors"
(providing the addresses and sizes of the character strings in region 5
which need to be repacked) are found, contiguously, in memory region 3,
which they completely fill. Because of this convenient
arrangement, all of the string descriptors can all be accessed as if
they were a single FIXED
array. Standard XPL's built-in DESCRIPTOR
and NDESCRIPT
respectively give the location and size of this array of string
descriptors. Repacking region 5 is a relatively simple matter of sorting the DESCRIPTOR array by address, using the sorted array to find the "holes", then moving the string data downward in region 5 to eliminate the holes, and finally updating the original descriptors in the DESCRIPTOR array to preserve the strings but to change the addresses.
The DESCRIPTOR array can't be sorted in place, however. Rather, a separate scratch array is needed to help out the process:
DECLARE DX_SIZE LITERALLY '500', DX(DX_SIZE) BIT(16);
This DX array is set up to contain the indices into the DESCRIPTOR array, and it's DX that's sorted rather than DESCRIPTOR. This is undoubtedly a good strategy, although it limits the total number of DECLARE'd CHARACTER variables to 500. On the other hand, the limit of 500 is completely arbitrary, and if you have enough memory (we do!) there's no reason it couldn't be increased as high as 64K. And by the simple expedient of declaring DX as FIXED rather than BIT(16), the limit could be made much higher.
XPL/I complicates this picture with its BASED
variables containing CHARACTER
data or BIT(≥32)
data — because no longer are all of the string descriptors nicely lined
up for us in memory region 3. Rather, many of them are smeared
all over memory region 6. Nor is a limit of 500 strings remotely
adequate for compiling programs like HAL/S-FC. Here is what I understand of the differences between how the XPL/I COMPACTIFY
has to operate, as compared to the description I just gave of the XPL COMPACTIFY:
- Although DESCRIPTOR and NDESCRIPT still exist, and still provide exactly the same info about memory region 3, they are no longer relevant to COMPACTIFY.
- DX is now a BASED FIXED, because it contains pointers to the string descriptors (i.e. it contains memory addresses of the string descriptors) rather than just indices into region 3.
- DX_SIZE gives the number of elements of DX, but as it's nowhere DECLARE'd in the library, I suppose it must be an undocumented XPL/I built-in.
Aside: I'd venture the guess that DX_SIZE is initially set and space for DX is initially allocated by the XPL/I compiler to handle the string descriptors which are DECLARE'd and therefore known at compile time. At runtime, SPACELIB would then take over the task of maintaining DX and DX_SIZE to handle additional string descriptors that come into existence (or leave it) when BASED variables containing them grow (or shrink).
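The repacking strategy described above for the standard-XPL COMPACTIFY can be sketched in a few lines of Python. This is only an illustration of the idea (sort an index array by address, slide the strings downward, update the descriptors), not a transcription of XPL.LIBRARY.xpl. The function name `compactify`, the representation of a descriptor as an `[address, length]` pair, and the assumption that no two strings overlap in memory are all mine.

```python
def compactify(memory, descriptors, region_start):
    """Slide all live strings down toward region_start, closing the holes.

    memory       -- bytearray standing in for the string region (region 5)
    descriptors  -- list of [address, length] pairs (a simplification of
                    real string descriptors); updated in place
    region_start -- lowest address of the string region
    Returns the new free pointer, i.e. the first unused address.
    Assumes (hypothetically) that strings never overlap one another.
    """
    # Like DX: sort a scratch array of indices, not the descriptors themselves.
    dx = sorted(range(len(descriptors)), key=lambda i: descriptors[i][0])

    free = region_start
    for i in dx:
        addr, length = descriptors[i]
        if length == 0:
            continue                      # empty strings occupy no storage
        if addr != free:
            # Move the string data downward, then fix up its descriptor.
            memory[free:free + length] = memory[addr:addr + length]
            descriptors[i][0] = free
        free += length
    return free


# Three strings ("CDE", "AB", "F") with holes between them.
memory = bytearray(b"..AB...CDE.F")
descriptors = [[7, 3], [2, 2], [11, 1]]
freepoint = compactify(memory, descriptors, 0)
```

After the call, the strings sit contiguously at the bottom of the region, each descriptor still names the same text, and `freepoint` marks where new string data could be appended. Because only the scratch index array is sorted, the descriptors keep their original positions, which matters because the program's variables refer to descriptors by position.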
An XPL program consists of any sequence of XPL statements, followed by the token EOF
. In particular:
PROCEDURE
s. Note: The original XPL compiler, which was called XCOM, performed a single pass. It required that the declaration of any particular identifier as an object (such as a variable) precede the use of that identifier, although there were provisions for making a forward declaration for a PROCEDURE, so that the PROCEDURE could be used before it was defined. XCOM-I relaxes this requirement.
Each of the sample programs I've encountered in standard XPL so far has been contained in a single relatively-small file. For example, ANALYZER has a little over 1500 lines of source code, while XCOM has a little over 4200 lines.
In contrast, the XPL/I source code for Intermetrics's HAL/S compiler HAL/S-FC
has over 120,000 lines of source code spread across over 600 files,
though any individual pass of the compiler has no more than around
35,000 lines. This huge size, along with the huge difference from
programs in standard XPL, necessitates different methods for managing
that source-code base, and some of those methods are reflected by
compiler directives embedded within the source code. Insofar as HAL/S-FC
and its related applications are concerned, the top-level source-code
file (##DRIVER.xpl) for each application always contains all of the
necessary directives for compiling the other source-code files needed,
in the correct order, so in using XCOM-I to compile these applications
you don't need to worry about any file other than ##DRIVER.xpl itself.
Compiler directives in XPL/I are comments or comment-like constructions which aren't documented in McKeeman and have no other obvious purpose. The specifics are covered by the subsections below.
Aside: Well, the comment about ##DRIVER.xpl isn't exactly right. Any XPL or XPL/I program will expect that there's a separate "library file" containing source code for the COMPACTIFY procedure, but the XPL/I source code for the program won't explicitly include the library file. That's the compiler's responsibility.
Aside: Due to the lack of relevant Intermetrics documentation, what I'm about to describe is not only speculative on my part, but also represents certain pragmatic compromises that I don't believe literally existed in Intermetrics' XPL compiler or development procedures. But if it will work for us using XCOM-I on HAL/S-FC and if there are no other lurking XPL/I programs that we need to worry about, why complain?
/?c ... XPL/I source code ... ?/
The enclosed XPL/I source code is compiled only if condition c is "true", much like C's #if c. Here, c is supposed to be an upper-case alphabetical letter that represents the particular condition that needs to be "true". All conditions, A through Z, are by default "false". You make condition c "true" by using XCOM-I's --cond=c command-line switch.
Here are the conditions c that I'm aware of:
XCOM-I Command-Line Switch | Interpretation |
---|---|
--cond=P | HAL/S-FC will be specialized for compiling the Space Shuttle's Primary Flight Software (PFS). |
--cond=B | HAL/S-FC will be specialized for compiling the Space Shuttle's Backup Flight Software (BFS). Note: Either --cond=P or --cond=B must be used, but not both at the same time. |
--cond=A | Produces debugging output related to memory management of BASED variables. |
--cond=C | Produces debugging output related to actions by the COMPACTIFY procedure. |
Aside: This implies that you don't just compile HAL/S-FC once to get a HAL/S compiler that works for all HAL/S programs. Rather, you compile HAL/S-FC twice, once to get a version of the HAL/S compiler that works for the primary flight software, and once to get a version of the compiler that works for the backup flight software.
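To make the effect of the /?c ... ?/ construct concrete, here is a minimal Python sketch of a filter that keeps or drops such blocks according to a set of enabled condition letters. It is an illustration of the idea only and is not taken from XCOM-I's source; the function name `apply_conditions` is hypothetical, and the sketch assumes that blocks are not nested and that the two-character sequences never occur inside string literals.

```python
import re

# One conditional block: "/?" + a single upper-case letter + body + "?/".
_CONDITIONAL = re.compile(r"/\?([A-Z])(.*?)\?/", re.DOTALL)


def apply_conditions(source, enabled=frozenset()):
    """Keep the body of each /?c ... ?/ block whose letter c is enabled,
    and delete the blocks whose letter is not enabled."""
    def keep_or_drop(match):
        letter, body = match.group(1), match.group(2)
        return body if letter in enabled else ""
    return _CONDITIONAL.sub(keep_or_drop, source)


source = "A = 1; /?P B = 2; ?/ /?B C = 3; ?/ D = 4;"
pfs_build = apply_conditions(source, {"P"})   # comparable to --cond=P
bfs_build = apply_conditions(source, {"B"})   # comparable to --cond=B
```

With only `P` enabled, the statement guarded by `/?P` survives and the one guarded by `/?B` disappears, which is why two separate builds are needed for the PFS and BFS variants.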
/%INCLUDE module %/
This directive includes the source-code file module.xpl, which typically contains COMMON-block declarations or macros which are used identically by all source-code files. By default, the included module is taken from the folder ../HALINCL/. As far as I know this covers every use in HAL/S-FC source code. However, if necessary, XCOM-I has a command-line option (--include=folder) which can be used to change the folder containing the modules.
/* ...comment... $%module */
This is an alternate form of the /%INCLUDE module ...comment... %/ directive, which acts the same way, and for which my comments are otherwise the same.
/**MERGE module procedure */
This directive includes a source-code file containing a PROCEDURE. As above, module.xpl is the source-code file to include, while procedure is the name of the PROCEDURE contained in that file. In point of fact, XCOM-I simply ignores the procedure name.
Aside: Procedure names don't match the filenames, usually, because the naming conventions for System/360 files were severely limited vs identifiers in XPL. Thus the filenames were normalized, truncated forms of the procedure names.
Also, module.xpl is expected to be within the same folder as the source-code file being compiled; no other folders are searched for it, and there are no command-line switches to alter this behavior.
Viewed from a height, an XPL program consists of PROCEDURE
definitions and of code that uses those definitions.
A procedure definition looks something like this:
label:
PROCEDURE(... parameter list ...) ReturnType;
DECLARE ... for the parameters ...;
DECLARE ... for local variables ...;
... code ...;
END label;
A lot of this is optional. Thus while the initial label:
is required (since it's the name of the procedure), the label
at the end of the definition is optional, and is really there only for
readability purposes. If the procedure needs no parameters, then the parameter list
, including its enclosing parentheses, is omitted. If the procedure returns no value, then ReturnType
is omitted; if present, it is one of the basic non-subscripted datatypes FIXED
, BIT(n)
, or CHARACTER
. Each parameter in the parameter list
must have a declaration within the body of the procedure, and while
those declarations don't technically have to precede the declarations of
the local variables as shown above, it was apparently customary to do
so.
PROCEDURE definitions in the source code can be nested, to any desired depth. PROCEDURE definitions and DO...END blocks provide the program with a hierarchical structure of "scopes". The hierarchical structure is provided by the parent/child relationships among the scopes. Variables are accessible within the scope in which they're declared, along with any descendant scopes. If a variable is declared in more than one scope of the hierarchy, the one in the innermost enclosing scope is the one that is applicable.
- In standard XPL, declarations are allowed only at the global level or directly within a PROCEDURE, and not within DO...END blocks.
- In XPL/I, variable (and PROCEDURE) definitions can be declared anywhere, even within DO...END blocks.
XCOM-I mimics the XPL/I usage, since it's compatible with standard XPL. But there's a trap! Or at least, it's a trap for me, and by projection, I assume it's a trap for you as well. The trap is that you'd suppose the scope of a variable or PROCEDURE was the DO...END block in which it was declared. Not so! The scope is the entire enclosing PROCEDURE and its descendants. So beware. Fortunately, such abominations are very rare, and only a handful appear in the entire HAL/S-FC source-code base. Here's an example from the DUMP_SDF PROCEDURE of PASS4 of the compiler, cleaned up a little for readability. Observe the declaration and usage of the variable SYM_DATA_CELL_ADDR within the two distinct DO...END blocks:
IF NODE_B(3) > 0 THEN DO;
/* IN CASE OF SYMBOL XREF EXTENSION CELL, SAVE ADDR OF SYM DATA CELL */
DECLARE SYM_DATA_CELL_ADDR FIXED;
SYM_DATA_CELL_ADDR= COREWORD(ADDR(NODE_F));
CALL PRINT_XREF_DATA(SHR(NODE_B(3),1));
END;
IF ASIP_FLAG & (SCLASS=1 | SCLASS=2 & NAME_FLAG) THEN DO;
/* RESTORE NODE_F TO SYM_DATA_CELL_ADDR BEFORE GETTING AUXILIARY INFO. */
COREWORD(ADDR(NODE_F)) = SYM_DATA_CELL_ADDR;
IF STYPE=16 & ^NAME_FLAG THEN
CALL FORMAT_NAME_TERM_CELLS(I,NODE_F(-2));
ELSE OUTPUT = X10||'INITIAL(NAME('||
FORMAT_VAR_REF_CELL(NODE_F(-2)) || '))';
END;
A PROCEDURE may be invoked in two different ways. If it returns a value via a RETURN statement, it can be used in an arithmetical expression or a string expression. If it doesn't return a value, or if it does return a value and you simply want to ignore the value, a CALL statement can be used to invoke the PROCEDURE but to discard any returned value.
Important: All variables local to a PROCEDURE definition retain their values after the PROCEDURE returns. If the PROCEDURE is re-executed, those local variables retain the values they previously had in the prior invocation of the PROCEDURE. The values of those local variables, though retained, are inaccessible to code outside of the PROCEDURE, because the compiler enforces scopes of variables. (In C code, this would be the same thing as saying that every local variable of every function is automatically declared as static.)
Very important: Any (or even all) parameters at the trailing end of the calling list of a PROCEDURE
can be omitted when calling the PROCEDURE
, and if omitted, they retain the same values as the last time the PROCEDURE
was invoked or the values previously assigned to those parameters from within the PROCEDURE
itself! (In essence, this is like saying that parameters of a PROCEDURE
are not passed to the PROCEDURE
, but rather that they are just aliases for some set of global variables dedicated to the PROCEDURE
.) That's so weird that we need to see an example. Consider the following XPL PROCEDURE
definition, and CALL
s to it:
weirdo:
procedure(x, y, z);
declare (x, y, z) fixed; /* Declare x,y,z as integers */
output = x || ' ' || y || ' ' || z; /* Print out x,y,z */
x = 29;
y = y + 1;
end weirdo;
call weirdo(1, 2, 3);
call weirdo(4, 5);
call weirdo(6);
call weirdo();
call weirdo;
The five calls successively print out the following:
1 2 3
4 5 3
6 6 3
29 7 3
29 8 3
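For readers more at home in a modern language, the behavior of weirdo can be imitated in Python by giving the procedure persistent storage slots for its parameters and overwriting only the slots for which an argument is actually supplied. This is a sketch of the semantics described above, not of how XCOM-I generates code; the names `make_weirdo` and `_MISSING` are mine, and the initial slot values of 0 are an assumption that the example never relies on.

```python
_MISSING = object()   # sentinel meaning "argument omitted by the caller"


def make_weirdo():
    # Persistent storage for the parameters, like statically allocated
    # variables. The initial zeros are an assumption; they are overwritten
    # by the first call below before they are ever read.
    slots = {"x": 0, "y": 0, "z": 0}

    def weirdo(x=_MISSING, y=_MISSING, z=_MISSING):
        # Only the parameters actually supplied overwrite their slots.
        for name, value in (("x", x), ("y", y), ("z", z)):
            if value is not _MISSING:
                slots[name] = value
        line = f"{slots['x']} {slots['y']} {slots['z']}"
        # Assignments inside the procedure persist until the next call.
        slots["x"] = 29
        slots["y"] += 1
        return line

    return weirdo


weirdo = make_weirdo()
lines = [weirdo(1, 2, 3), weirdo(4, 5), weirdo(6), weirdo(), weirdo()]
print("\n".join(lines))
```

Running this prints the same five lines as the XPL program: each omitted trailing argument simply leaves the corresponding slot holding whatever the previous call (or the procedure body itself) last stored there.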
PROCEDURE
s cannot be recursive, either directly or indirectly.
Taking these facts altogether, XCOM-I implements both parameters and local variables of PROCEDURE
s essentially as global variables in the way they are stored: i.e., each parameter and each local variable of each PROCEDURE
has its own static
address (in the global memory model), assigned at compile time and
unchanging thereafter. The compiler enforces the logical scoping
of these variables.
Regarding the RETURN
statement, McKeeman explains that it is used to exit from a PROCEDURE
and optionally to return a value. Furthermore, the calling code can
either use that return value or else ignore it. Which makes perfect
sense. But as usual, there are some documented and undocumented
peculiarities to the RETURN
statement as well:
- Calling code may use return values from PROCEDUREs even if the PROCEDURE has no RETURN statement, or the RETURN statement specifies no value. In this case, we are told, the return value is simply some unpredictable value from some unspecified System/360 register. Which is rotten, of course, but so what? This is never going to happen, right? Wrong! Actual XPL code does this from time to time. XCOM-I, on the other hand, always returns a well-defined value from a PROCEDURE, whether or not there are any RETURN statements specifying a return value; the returned value in this case is 0 if FIXED, a BIT value of the appropriate width evaluating to 0, or else the empty string for a CHARACTER.
- We are not told that RETURN statements may exist at the global level, outside of the scope of any procedure, and may return a value when they do. But they can. XCOM-I treats these as exits from the program back to the operating system, with the returned value being the program's exit code. It thus expects the return value to be a program status code.
Compound statements in XPL are groupings of simple statements (such as assignments or if-then-else statements) enclosed within a DO ... END block:
DO ...;
... simple statements ...
END;
There are five different kinds of DO ... END
blocks. First, there is a mere grouping:
DO;
... simple statements ...
END;
Then there are 3 different kinds of loops:
DO COUNTER = START TO END [BY STEP];
... simple statements ...
END;
DO WHILE CONDITION;
... simple statements ...
END;
DO UNTIL CONDITION;
... simple statements ...
END;
Note that DO UNTIL
is new in XPL/I and is not present in standard XPL.
In these loops, COUNTER
, START
, END
, and the optional STEP
are all integers. STEP
defaults to 1, but must be positive. START
, END
, and STEP
may be expressions, but if so they are evaluated only a single time, at
the start of the loop, and are not reevaluated thereafter. CONDITION
,
on the other hand, is an expression evaluated on each loop; it is
treated as "true" if its least-significant bit is 1, or "false" if its
least-significant bit is 0. When I say they are "integers", I don't mean that they are necessarily FIXED
; they could also be BIT(
≤32)
, which evaluate to integers.
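Two of the points above are easy to trip over if you come from C or Python: truth is decided by the least-significant bit alone (so 2 is "false"), and the bounds of a counted loop are fixed when the loop starts. The following Python sketch illustrates both. The helper names are hypothetical, and the loops are only imitations of the XPL constructs named in the comments.

```python
def xpl_true(value):
    """Return True when the least-significant bit of an integer is 1."""
    return (value & 1) == 1


def count_while_odd(n):
    """Imitate: DO WHILE N; N = SHR(N, 1); END; and count the passes."""
    passes = 0
    while xpl_true(n):          # CONDITION is re-evaluated on every pass
        n >>= 1
        passes += 1
    return passes


def counted_loop_passes(limit):
    """Imitate DO I = 1 TO LIMIT; where the loop body changes LIMIT."""
    passes = 0
    for _ in range(1, limit + 1):   # bounds are fixed once, at loop entry
        limit = 1000                # too late to affect the running loop
        passes += 1
    return passes
```

Here `xpl_true(2)` is `False` even though 2 is nonzero, `count_while_odd(7)` makes three passes (7, 3, 1) before reaching 0, and `counted_loop_passes(3)` makes exactly three passes despite the assignment inside the body.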
The fifth kind of DO ... END
block is:
DO CASE EXPRESSION;
STATEMENT0;
STATEMENT1;
STATEMENT2;
...
END;
The EXPRESSION
must also evaluate to an integer. If 0, then STATEMENT0
is executed; if 1, then STATEMENT1
is executed; and so on. At most, a single statement is executed, and
there is no "fall through" from one statement to the next. If the EXPRESSION
is negative or beyond the number of available statements, McKeeman
tells us that "a random jump is executed". In XCOM-I, no statement is
executed under those circumstances, and control passes to the next
statement after the END
.
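The XCOM-I behavior just described (select exactly one statement by index, and do nothing at all when the selector is out of range) can be sketched in Python as follows. The function `do_case` is a hypothetical illustration, with each XPL statement modeled as a zero-argument callable.

```python
def do_case(selector, statements):
    """Run statements[selector] if the selector is in range.

    Mirrors the XCOM-I behavior described above: a negative or too-large
    selector runs nothing, and control simply continues after the END.
    There is no fall-through from one statement to the next.
    """
    if 0 <= selector < len(statements):
        statements[selector]()


log = []
cases = [
    lambda: log.append("STATEMENT0"),
    lambda: log.append("STATEMENT1"),
    lambda: log.append("STATEMENT2"),
]
for selector in (1, -1, 3, 0):
    do_case(selector, cases)
```

Only the selectors 1 and 0 leave a trace in `log`; the out-of-range values -1 and 3 are silently skipped rather than causing the "random jump" McKeeman warns about.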
ESCAPE
and REPEAT
The ESCAPE
and REPEAT
keywords appear to be undocumented XPL/I features not present in
standard XPL. Unfortunately, from the available material I can't think
of any way to be sure what they do, so I can only speculate.
ESCAPE
appears in two different forms:
ESCAPE;
ESCAPE LABEL;
It should be noted that the HAL/S language has the keyword EXIT
, which also has these two forms. (See Ryer, p. 5-12.) EXIT
has the following behavior in HAL/S:
- EXIT; exits from the innermost enclosing DO ... END block. I.e., it's essentially a GO TO to just after the closest enclosing END.
- EXIT LABEL; exits from an enclosing DO ... END that isn't necessarily the innermost one, but rather the one which has the specified LABEL attached to it. By "attached to it", I mean that they're directly adjacent, as in "LABEL: DO ...".
Until a more-plausible explanation comes along, my assumption is that ESCAPE in XPL/I has the same behavior as HAL/S's EXIT.
For example, consider the following XPL/I code:
...
MYBLOCK:
DO ...
...
DO ...
...
ESCAPE; /* Escape #1 */
...
ESCAPE MYBLOCK; /* Escape #2 */
...
END;
/* Escape #1 comes here! */
...
END;
/* Escape #2 comes here! */
...
REPEAT also appears in those same two forms:
REPEAT;
REPEAT LABEL;
There is a REPEAT keyword in HAL/S as well. In this case, though, from examining the actual usage in HAL/S-FC source code, I believe that while the keyword has a similar behavior in HAL/S and XPL/I, there is nevertheless a distinction between them. Here's what I believe the behavior is in XPL/I:
- REPEAT; "repeats" the smallest enclosing DO ... END. In case the smallest enclosing DO ... END is a loop (DO WHILE or DO UNTIL or DO I = X TO Y), saying that it "repeats" has a pretty clear meaning: XPL/I REPEAT is like a Python or C continue statement. On the other hand, if the immediately-enclosing DO ... END is not a loop, the expected behavior is less clear; nor are there any instances of REPEAT in non-loops in legacy XPL/I code from which we might get a clue. XCOM-I implements REPEAT without a label simply as a jump to the beginning of the block, which means that you could form an infinite loop if there were no other code (like ESCAPE or GO TO) to exit the loop.
- REPEAT LABEL; breaks out of inner loops as needed, until reaching an enclosing DO ... END loop that has the attached LABEL. That's the block that it repeats.
Aside: The way my XPL/I implementation is different from HAL/S is that in HAL/S, REPEAT (without a label) goes to the beginning of the innermost enclosing loop (DO WHILE or DO UNTIL or DO I = X TO Y) rather than the innermost enclosing DO ... END. Which makes sense, since that's what you'd normally want.
Note: Neither ESCAPE nor REPEAT accepts a label attached to an arbitrary statement. It must be a label attached to a DO ... END block, and it must at some level enclose the ESCAPE or REPEAT statement itself.
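Python has no labeled break or continue, so the labeled forms are worth spelling out. The sketch below imitates ESCAPE LABEL and REPEAT LABEL for a pair of nested counted loops by raising small exceptions that the labeled loop catches. It reflects my reading of the behavior described above (in particular, that REPEAT LABEL on a counted loop proceeds to that loop's next iteration, like continue), and the class and function names are hypothetical.

```python
class Escape(Exception):
    """Stands in for ESCAPE LABEL; carries the target label."""
    def __init__(self, label):
        super().__init__(label)
        self.label = label


class Repeat(Exception):
    """Stands in for REPEAT LABEL; carries the target label."""
    def __init__(self, label):
        super().__init__(label)
        self.label = label


def trace():
    visited = []
    try:
        for i in range(1, 4):               # OUTER: DO I = 1 TO 3;
            try:
                for j in range(1, 4):       #    DO J = 1 TO 3;
                    if i == 3:
                        raise Escape("OUTER")    # ESCAPE OUTER;
                    if j == 2:
                        raise Repeat("OUTER")    # REPEAT OUTER;
                    visited.append((i, j))
            except Repeat as jump:          # abandon the inner loop...
                if jump.label != "OUTER":
                    raise
                continue                    # ...and repeat the labeled one
    except Escape as jump:                  # leave the labeled loop entirely
        if jump.label != "OUTER":
            raise
    return visited
```

Calling `trace()` records only (1, 1) and (2, 1): each REPEAT OUTER abandons the inner loop and moves the outer loop on, and ESCAPE OUTER ends everything when I reaches 3.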
For XPL code such as
DO I = 1 to 100;
...
END;
it appears to be undocumented what value I holds after the loop ends. Not all computer languages handle this in the same way. My considered opinion is that the loop counter holds the value at which the condition for continuation fails: in this example, 101. Or, if the loop is broken prematurely (as with ESCAPE from the preceding section), then the value the counter held when the ESCAPE occurred is retained.
Aside: This would match the behavior of C. Whereas in Python, for example, the counter for an equivalent loop would have the value 100 rather than 101 after normal termination of the loop.
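The difference is easy to demonstrate. The following Python sketch writes the counted loop out by hand in the C-like style assumed above, so that the final counter value can be inspected, and contrasts it with an ordinary Python for loop. The function `do_loop` and its `escape_at` parameter are hypothetical illustrations, not part of XCOM-I.

```python
def do_loop(start, end, step=1, escape_at=None):
    """Imitate DO I = START TO END BY STEP; and return I afterwards.

    If escape_at is given, an ESCAPE is simulated when I reaches it.
    """
    counter = start
    while counter <= end:           # continuation test, as in C's for loop
        if counter == escape_at:
            break                   # ESCAPE: the counter keeps its value
        counter += step
    return counter


after_normal_end = do_loop(1, 100)               # test first fails at 101
after_escape = do_loop(1, 100, escape_at=42)     # counter frozen at 42

# For comparison, Python's own for loop leaves the last value visited.
for i in range(1, 101):
    pass
python_counter = i
```

So under the assumed XPL semantics the counter ends at 101 after a normal exit, or at 42 after the simulated ESCAPE, while the Python loop variable ends at 100.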
The topic of program options is a confusing one, because there are
many different programs under discussion, and all or most of them have
their own sets of program options, or sometimes even their own unique
ways of specifying those options. There's XCOM-I, whose program options you can discover with the command "XCOM-I.py --help
". Then too, when you use XCOM-I to compile an XPL or XPL/I program, XCOM-I endows the executables it produces with a uniform set of options; if (say) you had compiled an XPL program such as MyProgram.xpl, via a command like "XCOM-I.py MyProgram.xpl
", you could find out what MyProgram's uniform options were with the command "MyProgram --help".
But that tells us only about options that are the same in all
programs of the XCOM-verse. What about options that influence the XPL
or XPL/I program itself in program-specific ways? What options specific to MyProgram.xpl are there, and how can those options be selected? What options did (say) HAL/S-FC have originally, and how are those options applied to it now?
Originally, programs run on the IBM 360 would have been associated with Job Control Language (JCL) cards that (among other things) would have set the program options via what's called the "PARM field". For example, in these JCL "EXEC" cards,
// XPL EXEC PGM=MONITOR,
// PARM='SYTSIZE=1800,REFSIZE=2000,LISTING2,$I,$V,$U,$W'
...
the PARM field is 'SYTSIZE=1800,REFSIZE=2000,LISTING2,$I,$V,$U,$W'. Of course, we have no JCL in our XCOM-I framework, but the equivalent would be the --parm command-line option, à la
MyProgram --parm='SYTSIZE=1800,REFSIZE=2000,LISTING2,$I,$V,$U,$W'
Such a PARM field consists of individual options (or parameters), separated by commas (and no spaces surrounding the commas). So in this example, the parameters being selected for MyProgram are
SYTSIZE=1800
REFSIZE=2000
LISTING2
$I
$V
$U
$W
As for how these parameters get from the PARM field to the XPL or XPL/I program itself (i.e., how do you write XPL code to access the parameters?), there is unfortunately no standardized method. In the vision of A Compiler Generator, a vision which Intermetrics shared, there's an assembly-language program called the submonitor, separate from MyProgram, that parses the PARM field to find the individual parameters and decides what to do with them. Some of the parameters may influence the submonitor itself, and never even be made available to the XPL program at all. Others may be used to set the values of variables used by the XPL program. For example, McKeeman (p. 251) suggests some "possible" parameters, and McKeeman's sample submonitor program (XMON.bal) does various things with parameters of those suggested names. Indeed, I think all of McKeeman's suggested parameters affect only the submonitor, which doesn't exist at all in XCOM-I, so none of these suggested options are implemented in XCOM-I. Or at least, not supported in the same way, as you'll see some or all of them reappear in a little bit.
Here is the beginning of the printout from a typical run of HAL/S-FC's first pass:
HAL/S REL32V0 T H E V I R T U A L A G C P R O J E C T MAY 17, 2024 16:43:5.39 PAGE 1
HAL/S COMPILER PHASE 1 -- VERSION OF MAY 17, 2024. CLOCK TIME = 9:15:17.00.
TODAY IS MAY 17, 2024. CLOCK TIME = 16:43:5.39.
PARM FIELD: TITLE=T H E V I R T U A L A G C P R O J E C T
COMPLETE LIST OF COMPILE-TIME OPTIONS IN EFFECT
*** TYPE 1 OPTIONS ***
NOADDRS
NODECK
NODUMP
NOHALMAT
NOHIGHOPT
NOLFXI
NOLIST
NOLISTING2
NOLSTALL
NOPARSE
NOREGOPT
SCAL
NOSDL
NOSREF
NOSRN
NOTABDMP
TABLES
NOTABLST
NOVARSYM
ZCON
*** TYPE 2 OPTIONS ***
BLOCKSUM = 400
CARDTYPE =
COMPUNIT = 0
DSR = 1
LABELSIZE = 1200
LINECT = 59
LITSTRING = 2500
MACROSIZE = 500
PAGES = 2500
SYMBOLS = 200
TITLE = T H E V I R T U A L A G C P R O J E C T
XREFSIZE = 2000
*** NO LANGUAGE SUBSET IN EFFECT ***
As you can see, the first thing such a typical run does is to print the PARM field as is, and then to print a list of the individual parameters in effect. Parameters are categorized as either "Type 1", which means that they are simple on/off settings, or else as "Type 2", which means that they accept numerical or string values. For example, the option allowing for production of HALMAT by HAL/S-FC's first pass is of Type 1: It is turned "on" with the parameter HALMAT or turned "off" with the parameter NOHALMAT; its default is "off" (NOHALMAT).
(Don't worry about the fact that you don't know what "HALMAT" is, or
why one might want to produce it.) Whereas the option that
overrides the company name in the page heading is of Type 2: Its
default value is an empty string, meaning "don't override" the default
company name, but in this case I've overridden it via a PARM field of --parm='TITLE=T H E V I R T U A L A G C P R O J E C T'
.
Notice that parameters whose values are strings are not enclosed in
quotes, which might mean that a string parameter such as this one cannot
contain a comma.

Aside: There are several exceptions too. The parameter LFXI/NOLFXI is categorized as "non-printable", and yet special code was apparently inserted into HAL/S-FC to make it print anyway; the comments in the code explicitly say so. The same section of code, on the other hand, explicitly prevents printing of the parameters QUASI and TRACE, even though they're categorized as printable.

I should note that not all of the parameters are documented ... though I suppose by now you probably shouldn't be very surprised by that. There's pretty good documentation of quite a few of them in the HAL/S-FC User's Manual (see Section 5.0), and some less-reliable documentation in IR-182 Sections 13.4 and 3.3, which I assume you'll be able to read without my predigesting it for you. (That's a lie, by the way; I just don't understand all of them well enough to regurgitate them, or else I'm sure I would!)

Aside: I will comment on one of the parameters, though, namely the TITLE parameter. As I mentioned above, it's used to change the company name in the page headings of printouts. The default company name is 'T I T A N S Y S T E M S C O R P .' The program comments indicate that it was changed in March of 2002 from "Intermetrics". Why? According to Wikipedia, Intermetrics went through a sequence of mergers, acquisitions, and name changes, one of which was a merger with the Titan Corporation in March of 2000. So I guess that a couple of years after that, somebody was finally instructed to enshrine this change in the HAL/S-FC source code. Naturally, that only lasted a couple of years before Titan was acquired by L-3 Communications. I wonder if the name was ever updated after that? Our copy of HAL/S-FC seems to be from mid-2005, so presumably the final name change wouldn't have had a chance to percolate into the source code yet by that point.

Parameters are delivered to the XPL/I program via the MONITOR(13) function. (See here to read about the precise mechanism.) In brief, there are several
"options processors" available to parse the PARM field and provide the
parsed contents to the XPL/I program. These options processors are
specific to the XPL/I program you're compiling. The existing
options processors built into the runtime library go by the namesMONITOR(13)
, but instead receive command-line parameters from PASS1 via COMMON
memory.]MONITOR(13)
, one
of those options processors is selected; there's no mixing-and-matching
allowed. We'll get to the topic of what to do if none of these
meets your needs in a moment.

Note: In the Type 1 sections of the table, what's given are the parameters to enable the settings. The parameters to disable the settings are the same, but prefixed by "NO"; for example, LISTING2 vs NOLISTING2. Many of the parameters (both Type 1 and Type 2) also allow you to use an abbreviated nickname; for example, if you're too lazy to type LISTING2, you can instead use the nickname L2. If you want to negate a parameter nickname, you can prefix it with "N" (not "NO"); for example L2 vs NL2. In other words, LISTING2=L2 and NOLISTING2=NL2. Moreover, each parameter has a default value that's used if you don't explicitly specify a new one. In the table below, the parentheses after the parameter names give both the default values and the nicknames (if any).

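The naming rule in the note above can be sketched in C roughly as follows. This is only an illustration under my own assumptions, not XCOM-I's actual options-processor code: the function matchType1 and the type match_t are hypothetical names invented for this sketch, and it checks a single token against a single Type 1 parameter.

```c
#include <string.h>

/* Hypothetical result type, invented for this sketch. */
typedef enum { NO_MATCH = 0, ENABLE = 1, DISABLE = -1 } match_t;

/* Hypothetical helper: decide whether a token enables or disables the
   Type 1 parameter with the given full name and (possibly empty) nickname. */
match_t matchType1(const char *token, const char *name, const char *nickname)
{
    /* The full name enables; "NO" + full name disables. */
    if (strcmp(token, name) == 0)
        return ENABLE;
    if (strncmp(token, "NO", 2) == 0 && strcmp(token + 2, name) == 0)
        return DISABLE;

    /* The nickname (if any) enables; "N" + nickname disables. */
    if (nickname != NULL && nickname[0] != '\0') {
        if (strcmp(token, nickname) == 0)
            return ENABLE;
        if (token[0] == 'N' && strcmp(token + 1, nickname) == 0)
            return DISABLE;
    }
    return NO_MATCH;
}
```

With name LISTING2 and nickname L2, the tokens LISTING2 and L2 enable the setting, NOLISTING2 and NL2 disable it, and NOL2 matches nothing.
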
COMPOPT PFS |
COMPOPT BFS |
COMPOPT 360 |
LISTOPT | MONOPT |
---|---|---|---|---|
Type 1, Printable |
||||
DUMP (off, DP) |
DUMP (off, DP) | DUMP (off, DP) | TABLST (off, TL) |
DUMP (off, DP) |
LISTING2 (off, L2) |
LISTING2 (off, L2) |
LISTING2 (off, L2) | TABDMP (off, TBD) |
LISTING2 (off, L2) |
LIST (off, L) |
LIST (off, L) |
LIST (off, L) | ALL (off) |
ALTER (off) |
TRACE (on, TR) |
TRACE (on, TR) |
TRACE (on, TR) | BRIEF (off) |
|
VARSYM (off, VS) |
VARSYM (off, VS) |
DECK (off, D) | ||
DECK (off, D) |
DECK (off, D) |
TABLES (on, TBL) | ||
TABLES (on, TBL) |
TABLES (on, TBL) |
TABLST (off, TL) | ||
TABLST (off, TL) |
TABLST (off, TL) |
ADDRS (off, A) | ||
ADDRS (off, A) |
ADDRS (off, A) |
SRN (off) | ||
SRN (off) |
SRN (off) |
SDL (off) | ||
SDL (off) |
SDL (off) |
TABDMP (off, TBD) | ||
TABDMP (off, TBD) |
TABDMP (off, TBD) |
ZCON (on, Z) | ||
ZCON (on, Z) |
ZCON (on, Z) |
FCDATA (off, FD) |
||
HALMAT (off, HM) |
HALMAT (off, HM) |
SCAL (on, SC) |
||
REGOPT (off, R) |
SCAL (on, SC) |
MICROCODE (on, MC) |
||
MICROCODE (on, MC) |
MICROCODE (on, MC) |
SREF (off, SR) |
||
SREF (off, SR) |
SREF (off, SR) |
QUASI (off, Q) |
||
QUASI (off, Q) |
QUASI (off, Q) |
|||
TEMPLATE (off, TP) |
REGOPT (off, R) |
|||
HIGHOPT (off, HO) |
TEMPLATE (off, TP) |
|||
HIGHOPT (off, HO) |
||||
Type 1, Unprintable |
||||
PARSE (off, P) |
PARSE (off, P) |
PARSE (off, P) |
X3 (off) | |
LSTALL (off, LA) |
LSTALL (off, LA) |
LSTALL (off, LA) |
X4 (off) | |
LFXI (on) |
LFXI (on) |
LFXI (on) |
X5 (off) | |
X1 (off) |
X1 (off) | X0 (off) | X6 (off) | |
X4 (off) | X4 (off) | X1 (off) | X7 (off) | |
X5 (off) | X5 (off) | X2 (off) | X8 (off) | |
XA (off) | XA (off) | X3 (off) | X9 (off) | |
X6 (off) | X6 (off) | X4 (off) | XA (off) | |
XB (off) | XB (off) | XA (off) | XD (off) | |
XC (off) | XC (off) | X6 (off) | XE (off) | |
XE (off) | XF (off) | XB (off) | XF (off) | |
XF (off) | XC (off) | XG (off) | ||
XE (off) | XH (off) | |||
XF (off) | XI (off) | |||
XJ (off) | ||||
XK (off) | ||||
XL (off) | ||||
XM (off) | ||||
XN (off) | ||||
XO (off) | ||||
XP (off) | ||||
Type 2, Printable |
||||
TITLE ("", T) |
TITLE ("", T) | TITLE ("", T) | TITLE ("", T) |
LINECT (59, LC) |
LINECT (59, LC) |
LINECT (59, LC) | LINECT (59, LC) | LINECT (59, LC) |
PAGES (250, P) |
PAGES (2500, P) |
PAGES (2500, P) | PAGES (250, P) |
PAGES (10000, P) |
MIN (50000) |
SYMBOLS (200, SYM) |
SYMBOLS (200, SYM) | SYMBOLS (200, SYM) |
LIST (1, L) |
MAX (5000000) |
MACROSIZE (500, MS) |
MACROSIZE (500, MS) | MACROSIZE (500, MS) |
FREE (14336) |
|
LITSTRINGS (2000, LITS) |
LITSTRINGS (2000, LITS) | LITSTRINGS (2000, LITS) |
||
COMPUNIT (0, CU) |
COMPUNIT (0, CU) | COMPUNIT (0, CU) |
||
XREFSIZE (2000, XS) |
XREFSIZE (2000, XS) | XREFSIZE (2000, XS) |
||
CARDTYPE ("", CT) |
CARDTYPE ("", CT) | CARDTYPE ("", CT) |
||
LABELSIZE (1200, LBLS) |
LABELSIZE (1200, LBLS) | LABELSIZE (1200, LBLS) |
||
DSR (1) |
DSR (1) | DSR (1) |
||
BLOCKSUM (400, BS) |
BLOCKSUM (400, BS) | BLOCKSUM (400, BS) |
||
MFID ("") |
OLDTPL ("", 0) |
COMMON
memory, in some pre-digested form.$|
"
depends on its particular punch-card column in which it appears. The list below has been merged
from McKeeman (p. 148), IR-182-1 (section 13.4), and SUNY XCOM 4.5
source code:

- $B: Interlist code bytes in hexadecimal.
- $D: Print compilation statistics and symbol table at the end of compilation (initially enabled).
- $E: Interlist emitted code (assembly format) and data.
- $I: Print Impact summary, indicating variables outside the scope of any procedure which were referenced, plus procedures called. (Default = Off.)
- $L: List the compiled program (listing is initially enabled).
- $M: List program without auxiliary information (speeds compilation by minimizing string storage usage).
- $N: Produce a warning message if a procedure is called with fewer actual than formal parameters.
- $Q: This toggle seems to have been available for a while and then discontinued. It may have caused compilation to terminate after processing the library file.
- $R: Collect cross-reference data for each symbol (based on statement numbers) and print with symbol table. (Default = On.)
- $S: Dump symbol table at the end of each procedure, if any local data is declared. (Default = Off.)
- $T: Begin tracing execution of XCOM at this point, during compilation.
- $U: Terminate tracing of XCOM.
- $V: Expand variable cross reference to include names of procedures referencing data and names of procedures calling other procedures. (Default = Off.)
- $X: Do not abort compilation when ceiling on count of severe errors has been exceeded.
- $Y: Use '|' as margin marker rather than 'I'.
- $Z: Allow the compiled program to execute in spite of severe errors.
- $|: Set margin. The portion of succeeding cards starting from the column containing the | will be ignored. Note that this can only be used to make the card
width shorter than 80 columns.$
'
are
accepted (in the sense that there's no error message or abend) in PARM
fields, but
ignored. However, in XCOM-I, there is a command-line option that can be made
to serve the same purpose as control toggles in the PARM field:
The --extra=N,'STRING' option causes the STRING to be returned to the XPL program upon the first invocation of INPUT(N); subsequent invocations of INPUT(N) get data from the file attached to input N. So for example, if you wanted to use the control toggle $E, but didn't want to modify your XPL source code to add it into a comment, you could use the command-line option --extra=0,'/* $E */'.

Aside: Again, to be clear, XCOM-I itself has no cognizance whatever of control toggles. Control toggles, if available, are provided to and used by legacy versions of XCOM which XCOM-I may have compiled for you.

If you're writing new XPL or XPL/I programs, of course, then these specific options processors aren't going to satisfy your needs. There are still ways you can process options for your own custom purposes. I don't want to go into great detail, since it's unclear who (if anybody) will actually be producing any new XPL programs, so here I'll just give it the once-over-lightly. There are three different approaches of greater or lesser difficulty and (in)convenience.
Aside: Several of the possible approaches involve providing your own C code to supplement the C code output by XCOM-I when it translates your XPL or XPL/I source code. That can be done via CALL INLINE statements in your XPL/I source code, but it's rather cumbersome to do so. A more-elegant way is to instead create an entire C source-code file, and then just place it into the folder of C code created by XCOM-I. Such a file should have an '#include "runtimeC.h"' in it, and proceed from there. If it involves any active code, versus compile-time initialization of variables, you'll still have to have a CALL INLINE in your XPL/I code to actually call the new C function you've defined.
Very important: Any custom file you put into XCOM-I's output folder for any given project will be deleted the next time XCOM-I builds that project. So make sure it's not your only copy! In this sense, using a filesystem link to your custom C file is safer than using the file itself.
PARM_FIELD returns a string containing the value of the --parm command-line switch as-is. You could therefore just write XPL code to parse PARM_FIELD. This will be your easiest course of action.

--parm command-line switch, using the same "PARM"/"NOPARM" syntax as the other options processors. If you look at the source-code file XCOM-I/runtimeC.c, you'll find several instances of a datatype called optionsProcessor_t, which is a structure whose definition you can examine in the file runtimeC.h. Basically, it's just a structure listing all of the Type 1 and Type 2 parameter names, their abbreviated nicknames, the negations of both types of names, the default value of the parameter, and a "mask". I'll explain what a mask is in a moment. I hope it will be reasonably clear from looking at those examples just how to make your own custom optionsProcessor_t. If you can do that, you'll be ready for XPL/I calls to MONITOR of the form "CALL MONITOR(13, 'USEROPT');". Here, 'USEROPT' is literal; you don't get a choice of what to call it. The C code you need to create to do all of that doesn't require any CALL INLINE statements, and looks something like the following:

#include "runtimeC.h"
optionsProcessor_t myOptionsProcessor = {
...
};
USEROPT = &myOptionsProcessor;
argv[] and argc. This may be the hardest approach of all, since as you'll recall, XPL variables are not modeled as C variables. To get any parsed parameters into XPL variables requires numerically knowing the addresses at which the variables reside in the simulated IBM 360 memory (C array memory), as well as knowing the names of the C functions to convert the values into IBM 360 formats. While there's only a handful of such functions (putFIXED, putCHARACTER, ...), I have not documented them as of yet, nor indeed any of the functions used privately by XCOM-I's runtime library.

As far as the "mask" is concerned, among the items returned by the MONITOR(13) function is a 32-bit (FIXED) value referred to as the "options code" or the "option bits". Each Type 1 parameter sets or resets (or can
be made to set or reset) one bit position in the options code. That's
what the "mask" is: A specification of the bit (or combination of bits)
the parameter sets. For example, the LISTING2 parameter is associated
with mask 0x00000002, and therefore will bitwise OR the options code
with 2; whereas NOLISTING2 would bitwise AND the options code with ~2
(0xFFFFFFFD). Or you can just use a mask of 0x00000000, and your new
parameter(s) will leave the options code untouched.
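
The effect of a mask on the options code can be sketched in a few lines of C. This is just an illustration of the bit arithmetic described above, not code from XCOM-I's runtime library; the function name applyType1 and its parameters are hypothetical, made up for this sketch.

```c
#include <stdint.h>

/* Hypothetical helper: apply one Type 1 parameter to the 32-bit options code.
   'enable' is nonzero for the plain parameter (e.g. LISTING2) and zero for
   its negation (e.g. NOLISTING2). */
uint32_t applyType1(uint32_t optionsCode, uint32_t mask, int enable)
{
    if (enable)
        return optionsCode | mask;   /* set the bit(s) named by the mask */
    return optionsCode & ~mask;      /* clear the bit(s) named by the mask */
}
```

For LISTING2, with mask 0x00000002, enabling turns on bit 1 of the options code and disabling turns it off again, while all other bits are preserved. A mask of 0x00000000 leaves the options code unchanged in both cases.
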
Standard XPL has a variety of so-called "built-ins", comprising
runtime-library functions callable from XPL code. Some of these
built-ins can appear on either the right-hand or left-hand side of
assignments, and some have to be CALL'd like user-defined PROCEDUREs.
XPL/I has roughly the same built-ins, plus-or-minus a few, mostly (but
not entirely) defined to have the same functionality. The compiler
recognizes these built-ins, and there is no need for them to be declared
in any way prior to use. Since these built-ins were mostly written
originally in IBM System/360 basic assembly language, the runtime
library supplied with XCOM-I has been entirely written in C, without any
reference to the original runtime-library source code.
The list below is from McKeeman (p. 140-142), with some alterations
due to XPL/I, and some hopefully-helpful notes from me. The parameter
descriptions in the list below identify the datatypes of parameters by
the following convention:
FIXED
— NE
x
CHARACTER
descriptor —
DEx
V
I suppose I should make it clear that the XCOM-I environment
is not precisely like that envisaged in the original XPL language as
confined to an IBM 360 runtime environment, and as such, built-in
functions don't work exactly the same way either. What's described here
is how the XCOM-I runtime library's functions corresponding to the original built-in functions work.
Library Function |
Description From Original Documentation |
Additional Notes |
||||
---|---|---|---|---|---|---|
ABS(NE)
|
This function returns the absolute value of NE .
(Note: "80000000", the maximum negative number, has no representable
absolute value and returns "7FFFFFFF", the maximum positive number.) |
(XPL/I only.) |
||||
ADDR(V) |
A function with numeric value which is the (at most) 24-bit absolute address of the variable (subscripted or not) V . Mostly used in conjunction with COREWORD or COREBYTE . |
If the variable has a subscript, then the
returned address is that of a specific array element, rather than of the
beginning of the array. For a CHARACTER or long BIT string (i.e., for BIT(33) through BIT(2048) datatypes), the returned address is that of the descriptor
for the variable, and if you want to find the actual data, you then
must first fetch the value of the descriptor and then massage it
further. For XPL/I's BASED variables (see IR-182-1 p. 13-3), using the unsubscripted name of the variable for V will return the address in memory where the pointer to the BASED variable's data is stored; whereas adding the subscript (0) to V returns the address of the variable's data itself. There are two exceptional cases, due to the fact that unlike the original XPL compilers, program code in XCOM-I exists in a separate memory space outside of the simulated XPL memory in which variables exist. Therefore, attempts to use ADDR(V)
to find addresses in code space, such as addresses of XPL built-ins or
of program labels, have to be treated very differently than addresses of
variables, if sensible results are to be obtained.Insofar as addresses of runtime-library functions are concerned, the only case I'm aware of is ADDR(DESCRIPTOR) . In this case, ADDR returns the address immediately following COMMON memory. The reason for this ad hoc behavior is that the XPL/I memory-management system (SPACELIB) uses the location of the built-in variable DESCRIPTOR (see below) to determine the cutoff between COMMON (in low memory) and non-COMMON
(in higher memory). But I realized this fact too late in the
development process, after I had made decisions about implementation and
memory layout that were incompatible with the interpretation of DESCRIPTOR as a normal variable with the desired properties.Aside: It isn't terribly difficult to undo the incompatible design decisions I mentioned, and sometime in the future I may do so. Technically, I think it's mainly a matter of swapping memory regions 2 and 3. Again technically, it is necessary for the dope vectors of all BASED variables to reside in the lowest 64K of memory, so it would have the effect of reducing the possible number of non-Insofar as program labels are concerned, as far as I know, there's no mechanism in XPL itself for using the memory address of a program label. Consequently, the only reason for finding the address of an XPL program label is that assembly language (in a CALL INLINE ) wants either to branch to that address, or else to somehow use the data in another CALL INLINE at that address. But this notion is based on various assumptions that are false in XCOM-I. Nevertheless, in spite of that fact, we still need some uniform method of handling such "addresses" that allows ADDR and CALL INLINE to interact smoothly. XCOM-I handles addresses of XPL labels as follows: The labels in any PROCEDURE are assigned a predictable sequence of unique negative
addresses. Thus the first label encountered has the address -1,
the second has address -2, and so on. The addresses are local to
the PROCEDURE , and restart at -1 in each PROCEDURE . ADDR returns these negative addresses. C implementations of the branch instructions in CALL INLINE statements can exploit that by having a jump table of a form similar to:Admittedly, this is not a terribly-efficient solution, but C has no practical computed-goto other than this, as far as I'm aware.switch (address) { | ||||
BYTE(DE,NE) |
A function with numeric value given by the NE th 8 bits of the string described DE . |
By "string", McKeeman means either CHARACTER data or BIT data. As far as BIT data is concerned, this function works for any data-width; i.e., for BIT(1) through BIT(2048) , and not just for "long" BIT string. However, there's some difference between how CHARACTER and BIT data is treated. BIT data is retrieved from memory as-is, whereas CHARACTER
data is transparently translated between EBCDIC encoding (in memory) vs
ASCII encoding (for manipulation) by the software. The latter
behavior is specific to XCOM-I, and wouldn't have been needed by the original XCOM, since back then, all CHARACTER
data was encoded in EBCDIC all the time, whether or not it resided in
"memory". Moreover, if you try to do so, it is possible to fool XCOM-I's BYTE
function into thinking an area of memory is a different datatype than
it really is, thus defeating the translation mechanism ... so please
don't try to do that. Note that BYTE can appear
either in an expression (such as on the right-hand side of an assignment), in which case it returns a value as just
described, or else on the left-hand side of an assignment, in which case
it modifies the value stored in memory. McKeeman recommends not using BYTE on the left-hand side of an assignment, due to the possibility of unintended consequences. The documentation does not explain what's supposed to happen if NE <0 or NE >=len( DE ) . Alas, that's not a theoretical question, because such uses of BYTE really appear in legacy XPL code. Given the sloppy subscripting accepted by XPL, my guess is that BYTE just grabs whatever happens to reside wherever NE
leads it. Unlike the usual sloppy subscripting in XPL code, this
would almost certainly be a mistake by the programmer, because the
location of string data in memory is dynamically assigned and liable to
change during the course of execution, so the programmer can't really
know reliably what lies beyond the bounds of a string. Thus my
guess is that when NE is out
of bounds for the string size, it's probably a mistake on the part of
the program that was never detected because it never produced any error
messages. Aside: For example, it happens in the Whether or not that's a true interpretation, in XCOM-I the BYTE function returns a value of 0 (corresponding to an EBCDIC NUL ) if NE is out of bounds, or else silently does nothing at all if on the left-hand side of an assignment. |
||||
BYTE(DE) |
Same as BYTE(DE,0) |
|||||
CLOCK_TRAP |
Not supported in XCOM. | And it's not supported in XCOM-I either. |
||||
COMPACTIFY |
A procedure called automatically to repack the free storage
area for string data when it is exhausted. Calls can be triggered by || , INPUT , number-to-string conversions, or an explicit call:
The variables FREEBASE , FREELIMIT , FREEPOINT , DESCRIPTOR , and NDESCRIPT are used by COMPACTIFY .
|
The "free-storage area" is where the data for CHARACTER and long BIT -string
variables is stored. As a program which involves string variables
executes, the strings tend to change size and move around in the
free-storage area, thus causing unused "holes" in memory to
develop. COMPACTIFY repacks the area so as to remove the holes.Aside: While longXPL/I's BASED variables are stored in a separate memory region and do not participate in garbage collection by COMPACTIFY . See the earlier discussion of this. |
||||
COREBYTE(NE) |
A byte array identical to the IBM System/360
memory. The subscript is the absolute byte address of the byte
selected. COREBYTE may be used on either side of the replacement operator (i.e., = ). |
The function returns a FIXED
value, if used in an expression, but only the least-significant 8 bits
contain the value. Similarly, if used on the left-hand side of an
assignment, it receives a FIXED value from the right-hand side, but only stores the least-significant 8 bits at the specified memory address. The XCOM-I version of this function performs no EBCDIC translation as BYTE (see above) does, so it does not expect the data in memory to be CHARACTER data. Whether this will turn out to be an issue remains to be seen. |
||||
COREWORD(NE) |
Like COREBYTE , except the subscript corresponds to the word address in memory. Thus an assignment to COREBYTE(4) can change COREWORD(1) . |
Since XPL has only a 24-bit address space, only the least-significant 3 bytes of the "4-byte word address" NE are used. According to IR-182-1 (p. 13-3), this function differs in XPL/I as follows:According to "A COMPILER GENERATOR",I think that what's being implied by this cryptic comment is that McKeeman's COREWORD in XPL worked like this:
whereas Intermetrics's
To my way of thinking, McKeeman's description doesn't say
what IR-182-1 says it says, and unfortunately, McKeeman's clarification
that "an assignment to XCOM-I
conforms to the latter (Intermetrics) usage, with the exception that
addresses like 1, 2, 3, 5, 6, 7, 9, ... are perfectly fine: Aside: In point of fact, |
||||
DATE |
A function with the numeric value of the date, coded as(day of year) + 1000 * (year - 1900) |
McKeeman doesn't inform us of anything so mundane as the time zone to which the DATE and TIME built-ins relate, at least not in the text of the book. I have been told that McKeeman would have used local time here. The submonitor function supplying this value would appear to be called GETIME , and identical versions of GETIME are used by the XCOM3, XCOM45, and HAL/S-FC
submonitors, so if McKeeman used local time, then all other legacy
compilers available at present did so as well. Exactly how GETIME works isn't clear to me. But if what I understand from looking at the IBM System/360 Operating System MFT Supervisor
manual is correct, the IBM 360 seems to have had no conception of UTC
at all, or for that matter, of local timezones either; it simply knew
whatever time and date the operator manually input into the system
at each power-up, and what the operator was supposed to input was the local time and date.Consequently, the DATE and TIME built-ins return the local date and time, according to whatever locale your computer is set for. |
||||
DATE_OF_GENERATION |
A word variable initialized with the value of DATE during compilation of the program being run. |
See DATE above.
|
||||
DESCRIPTOR(NE) |
The description of the NE th string as a numeric value. |
This cryptic description is
referring to the fact that in the way memory is allocated for the
variables declared by XCOM, the 32-bit string descriptors for all CHARACTER and BIT(n) (n >32) variables appear consecutively in memory. By symbolically labeling the very first of those descriptors as DESCRIPTOR , and treating DESCRIPTOR as of the FIXED datatype, DESCRIPTOR
becomes an easily-accessible array of all the string descriptors.
This is useful, for example, for memory-management procedures like COMPACTIFY . The number of elements of the array is given by the built-in NDESCRIPT (see below). | ||||
EXIT
|
A procedure which causes an abnormal exit from XPL execution. |
In fact, XCOM-I models this as a C-language exit(1) . |
||||
FILE(NE1,NE2)
|
An array-valued pseudovariable for manipulation of random-access bulk storage. Examples of its use areOne record is transferred into or out of the buffer array by the assignments shown above. FILE cannot appear on both sides of the same assignment. |
I've actually truncated McKeeman's
description of FILE , because it is so long and so much of it is specific to IBM
System/360, while being quite irrelevant to us. The key facts are
these:
DD
cards. But with programs compiled via XCOM-I, files are attached
at program startup by using the program's command-line switches.
The relevant switch is --raf , and multiple instances can be used on the same command line.The parameters of the --raf switch are:
In spite of what I said above, the XPL/I documentation (IR-182-1, p.
13-5) does describe one way to change the record size of a random-access
file once the program has begun operation, and that is via a call to |
||||
FREEBASE
|
A word variable containing the absolute address of the top of constant strings and thus the bottom of the repackable area. |
In XCOM-I, this is implemented as a function rather than a variable, and is not directly modifiable by user code. What the description from the original documentation is trying to say is that most kinds of variables — such as FIXED or BIT(1) types, or descriptors for CHARACTER or long BIT -string types, or arrays thereof, or pointers to BASED
variables — are stored in memory at low addresses determined at compile
time, and remain unchanged in address thereafter. In contrast,
data for CHARACTER variables can grow or shrink or move at runtime. FREEBASE is the boundary between those regions of memory. In XCOM-I, data allocated at runtime via the MONITOR(6,...) or MONITOR(19,...) function (see below), sometimes for storing data for BASED variables, also appears above the FREEBASE boundary.
|
||||
FREELIMIT
|
A word variable containing the absolute address of the last usable byte in the string-data area. |
I believe that the description lies, and that FREELIMIT is the first unusable byte beyond the string-data area, rather than the last usable byte in it. In XPL, all memory beyond FREELIMIT is unused by the program. In XPL/I, the data for BASED variables is stored at FREELIMIT and above. However, the memory-management technique is to push FREELIMIT downward in order to make room for BASED variables. So for either XPL or XPL/I, the initial value of FREELIMIT is the highest memory location used by the program. XCOM-I sets the initial value of FREELIMIT at "FFFC00" (i.e., 1024 bytes below the top of physical memory), in order to leave a small amount of space for COMPACTIFY to use as scratch memory. |
||||
FREEPOINT
|
A word variable containing the absolute address of the next free byte in the string-data area. When FREEPOINT passes FREELIMIT , COMPACTIFY must be called. |
In other words, at any given time, only the memory region between FREEPOINT and FREELIMIT is available for dynamically-allocating new free memory for BASED variables or CHARACTER variables. As the program continues to execute, dynamic memory allocation and subsequent memory-freeing may cause FREEPOINT to steadily creep upward, and the area from FREEBASE to FREEPOINT to become increasingly full of unusable holes. Eventually, COMPACTIFY will be automatically called, repacking the data to eliminate the holes, and moving FREEPOINT downward in memory once again. |
||||
INLINE(NE1,...) |
Inserts arbitrary IBM System/360 machine code directly into the instruction stream. |
The applicability of the original
functionality has changed drastically in the 55 years (as of this
writing) since A Compiler Generator was written. XCOM-I
provides three different mechanisms for supporting direct insertion of
inline non-XPL code, none of which are identical to those in
McKeeman. Thankfully, no available legacy standard XPL program I've encountered contains any CALL INLINE statements, so the difficulties presented by them are limited to XPL/I code such as HAL/S-FC.Support mechanism 1: insertion of C code You can use to insert an arbitrary string of C-language source code directly into the instruction stream. If the string contains newline characters ( '\n' ), then a single INLINE
can insert multiple lines of C code. But recall that an XPL
string can be no longer than 256 characters, so the inserted C code can
be no longer than 256 characters in any one CALL INLINE statement.Support mechanism 2: insertion of IBM 360 machine code XCOM-I will automatically translate IBM 360 machine code to C. Full-blown IBM 360 emulation is not provided, but support is available for those IBM 360 instructions used in XPL code I have personally encountered, which includes the entire source-code base of HAL/S-FC. Instructions that are not (yet) supported cause compilation to abort. This feature is not normally enabled. You must use the XCOM-I command-line switch --auto-inline to enable it.Support mechanism 3: other If --auto-inline is not
enabled (see support mechanism 2 above), any CALL INLINE statements not
caught by support mechanism 1 are commented out by default, but
"patched" with C code if a "patch file" is available. This important but far from automatic mechanism is the topic of a later section. |
||||
INPUT(NE)
|
A function with descriptor value specifying the next record on input file NE . |
I've truncated McKeeman's rather roundabout description, because I don't find it very informative as-is to more-modern readers. What McKeeman is trying to say is that any given program can have several files attached to it for so-called sequential input. You can think of these files as consisting of lines of text, and you can read them, one text-line at a time, by using the INPUT built-in function. For the original McKeeman XCOM (or the Intermetrics version of XCOM), the text in the sequential files would have been encoded in EBCDIC, but for use with XCOM-I they are encoded in ASCII. Because XPL comes from the era and computing environment it did, it expects input to be supplied on computer punch cards. As such, it expects lines of input to actually be 80 columns wide, even if they need to be padded by spaces to do so. Not all XPL programs have a problem with lines of a different length, but some do. Notably, the original XCOM is one of the programs that behaves incorrectly unless it gets its way. Because of that, XCOM-I's built-in INPUT function transparently truncates lines to 80 columns or pads them to 80 columns as necessary. There are up to 9 input files attached to the program, numbered 0 through 9 (yes, I know that's 10 files, but just keep reading), and you access them via those "device numbers". For example, reads a single line from sequential file number 5 into the variable C . Important: These "sequential files" and the By default, there is a single sequential file attached to the program, though it is attached to both device number 0 and device number 1. 
The default attached file is stdin , and thus any text piped into the program via a redirector (< ) on the program's command line will be available on both INPUT(0) and INPUT(1) .Aside: In case you're wondering, I don't think that the reason two separate devices are associated withThere's no explanation or obvious provision in the documentation for what happens when an end-of-file is reached, and XCOM-I makes no particular provision for it either. I suppose it must depend on the particular program. For example, some XPL programs assume that a blank line terminates a file. Other programs may look for special patterns, such as the string "EOF". XCOM-I returns an empty string for reads past the end of the file. Additional input files can be attached via the program's command-line switch --ddi :Here, F is the pathname to the file, and N
is the device number to which it should be attached. By the way,
if you want to debug your program once it has been translated by XCOM-I
to C and then the C has been compiled to an executable, I've sometimes
found piping input into the program via a command-line redirector (< )
to be somewhat problematic. It's handy in those situations to
override the default attachment of stdin to device 0 by using --ddi=0,F instead of piping. There is no way to change the attachments once the program is running, although there are MONITOR calls (see below) which can close attached files at runtime. In XPL/I, there's an additional alternative, in which Partitioned Data Set (PDS) files can be attached for use by INPUT
instead of sequential files. A PDS is partitioned into sections, each
with its own 8-character identifying name (right-padded with blanks if
necessary). An attempted read past the end of a partition returns
an empty string.Only one partition of the PDS is available via INPUT at any given time, but (see below) can be used at any time to switch device N to the partition whose name is contained in the string called NAME . The value returned, F , is 0 if the switch was successful, or 1 if there was no such partition.A PDS is thus similar to a folder of sequential files, with the partition names corresponding to filenames within the folder. By default, no there is no PDS attached to programs. However, a PDS can be attached for input via the program's command-line switch --pdsi :Because of PDS similarity to folders, that's how they're implemented in XCOM-I. So F should be the pathname of a folder, and the partitions should be ASCII text files within the folder. N ,
of course, is the device number on which the attachment is to be
made. However, no specific partition is selected for reading until
the necessary MONITOR(2, N, NAME) call has been made. |
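The padding, truncation, and end-of-file behavior described above for INPUT can be sketched in a few lines of Python. This is only an illustration of the described behavior, not XCOM-I's actual runtime code; the names normalize_card and xpl_input are made up for the example.

```python
import io

CARD_WIDTH = 80  # punch-card width that many XPL programs assume


def normalize_card(line: str, width: int = CARD_WIDTH) -> str:
    """Strip the line terminator, then truncate or space-pad to `width` columns."""
    text = line.rstrip("\r\n")
    return text[:width].ljust(width)


def xpl_input(stream) -> str:
    """Hypothetical stand-in for INPUT(n): one card per call, '' past end-of-file."""
    raw = stream.readline()
    if raw == "":  # readline() returns '' only at end-of-file
        return ""
    return normalize_card(raw)


# A two-line "deck": one short line and one line that is too long.
deck = io.StringIO("SHORT LINE\n" + "X" * 100 + "\n")
first = xpl_input(deck)   # padded with blanks to 80 columns
second = xpl_input(deck)  # truncated to 80 columns
third = xpl_input(deck)   # past end-of-file: empty string
```

Note that in this sketch a blank line in the file comes back as 80 spaces rather than as an empty string, so only a read past the end of the file has length 0.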
||||
INPUT
|
Same as INPUT(0) |
|||||
INTERRUPT_TRAP
|
Not supported in XCOM. |
Not supported in XCOM-I. |
||||
LENGTH(DE)
|
A function with the numeric value equal to the number of characters in the string denoted by the parameter. |
|||||
LINE_COUNT
|
This function returns the number of lines which have been printed on the SYSPRINT file since the last page eject. |
(XPL/I only.) "SYSPRINT" refers to OUTPUT(0) and OUTPUT(1) (see below). |
||||
LINK
|
This procedure performs the functions necessary to exit the current program phase and pass control to the next phase on the PROGRAM DD sequence, preserving COMMON data and any other dynamically allocated space which has not been deallocated. |
(XPL/I only.) This refers to the notion
that a sequence of XPL/I programs is being run via Job Control Language
(JCL), with each program passing data to the next program in
sequence. However, this LINK built-in is specific to the original computing environment, and performs no function in XCOM-I, which has a different mechanism for sharing data between program passes. Refer to the extensive discussion concerning COMMON memory. |
||||
MONITOR(NE1,...)
|
Calls the "submonitor". |
The "submonitor" is a separate program from whatever XPL or XPL/I program is being compiled and run, providing some kind of system-specific functionality. Insofar as McKeeman is concerned, the submonitor's functionality and even its calling sequence were unknowable, and thus the McKeeman XCOM system simply discarded all calls to the submonitor. Not so with XPL/I: There is an extensive set of functions provided by the submonitor, and the XPL/I program HAL/S-FC uses them extensively as well, so XCOM-I needs to support them. Most (I hope!) but not all submonitor functions are known from their descriptions in section 13.3 of IR-182-1. Besides which, HAL/S-FC source code contains the source code for MONITOR (written in IBM 360 Basic Assembly Language), from which additional information can be obtained.
Because of the complexity of MONITOR, an entire separate section is devoted to explaining it. |
||||
MONITOR_LINK
|
A fixed array which can be used for transmission of information between a program and the submonitor. |
(See Table 8.4.6 in McKeeman.) I'm not presently aware of any need to implement this in XCOM-I. However, there are MONITOR functions (see above) whose operations remain a mystery, so there's a possibility that that could change. |
||||
NDESCRIPT(NE)
|
A word variable containing the upper bound in the array DESCRIPTOR (see above) of the descriptions. |
See DESCRIPTOR above. |
||||
OUTPUT(NE)
|
This is the analog of INPUT (see above) but for outputting textual data rather than inputting it. |
Most of the comments concerning INPUT (see above) are directly applicable to OUTPUT, or else are analogous in a very obvious way, so I'll try to confine my remarks to the clear differences between the two.
The most common use of OUTPUT is to "print" to the "SYSPRINT" device. Unlike the original XCOM computing environment, in which SYSPRINT was an actual printer, in XCOM-I SYSPRINT by default refers to stdout. And by default, stdout is attached to both OUTPUT(0) and OUTPUT(1). But there are differences between the two! Use of OUTPUT(0) is simpler, and therefore more common, but OUTPUT(1) is more flexible. Let's start with OUTPUT(0).
Output statements via OUTPUT(0) might look something like this:
Thus we can print any character strings or numbers we like, and can concatenate them using the string-concatenation operator (||), with XCOM-I doing the work of automatically converting numbers to strings where necessary.
Each use of OUTPUT(0) results in another physical line being printed. When printing messages with OUTPUT,
since it is expected that the lines are being output to a physical
printer, the system keeps track of the number of lines being printed,
and automatically inserts page breaks in the output once a page has been
filled up. At the tops of pages, a page number and optional
headings and subheadings are also printed. The headings and
subheadings, though, are simply whatever has been previously set up, and
you can't change them or otherwise influence them using OUTPUT(0).
With OUTPUT(1), on the other
hand, there's quite a bit of additional functionality. For
example, it isn't necessarily true that each use of OUTPUT(1) prints another line. The first character of each OUTPUT(1) isn't directly printed, but instead is a "carriage-control character" (or "ANSI control character"), as follows:
... on a line printer back in the olden times, today with XCOM-I you'll instead see the uglier but still effective ... MYTEXT ...
Note: What this explanation of
If you experiment with OUTPUT(1), and even to a certain extent with OUTPUT(0), you may become very confused, because various things won't work as you expect. Or at least, I was very confused at first. To get past this confusion, you need to grasp the following
Very important point: For the computer systems most people are familiar with today, when we print a line of text, we expect that line of text to end with an implicit advance to the next line; i.e., with a "newline" character, often signified in programming languages by
Similarly to INPUT, you can attach additional sequential output files (F) on device numbers (N),
Or for Partitioned Data Sets,
For selecting partitions of a PDS, you don't use the same MONITOR call as for INPUT, but instead use F = MONITOR(1, N, NAME). OUTPUT commands targeting a PDS don't immediately write data to the physical PDS. Rather, the data being output is buffered in memory until the MONITOR(1, N, NAME) call occurs, and the data is then written out to the selected partition in its entirety. The return value (F) is 0 if the partition is new (i.e., if it didn't previously exist in the PDS), while it is 1 if the partition previously existed but has now been overwritten with entirely new contents. |
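The buffer-then-flush behavior of PDS output can be modeled as follows. This is a sketch under stated assumptions rather than XCOM-I's code: it assumes an output PDS is a folder of text files (as described above for input PDS files), that the buffer is emptied after each flush, and that blank padding is dropped from the member name to form the filename. The class PdsOutput and its methods are hypothetical names.

```python
import os
import tempfile


class PdsOutput:
    """Hypothetical model of one output device with a PDS (folder) attached."""

    def __init__(self, folder: str):
        self.folder = folder
        self.buffer = []  # lines written by OUTPUT(N) since the last flush

    def output(self, line: str) -> None:
        """Analog of OUTPUT(N) = line; nothing touches the disk yet."""
        self.buffer.append(line)

    def monitor1(self, name: str) -> int:
        """Analog of F = MONITOR(1, N, NAME): flush the buffer to member NAME."""
        path = os.path.join(self.folder, name.rstrip())  # assumption: drop padding
        existed = os.path.exists(path)
        with open(path, "w", encoding="ascii") as member:
            member.writelines(line + "\n" for line in self.buffer)
        self.buffer = []  # assumption: the buffer starts fresh after a flush
        return 1 if existed else 0


with tempfile.TemporaryDirectory() as folder:
    pds = PdsOutput(folder)
    pds.output("FIRST VERSION")
    rc_new = pds.monitor1("MEMBER1 ")       # member did not exist before
    pds.output("SECOND VERSION")
    rc_replaced = pds.monitor1("MEMBER1 ")  # member is overwritten in its entirety
    with open(os.path.join(folder, "MEMBER1"), encoding="ascii") as member:
        final_contents = member.read()
```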
||||
OUTPUT
|
The same as OUTPUT(0) . |
|||||
PARM_FIELD
|
This function returns a character string which contains the entire parameter specification coded on the PARM= option on the EXEC card. If no PARM is specified, a null string will be returned. |
(XPL/I only.) A program of any
complexity generally has a number of options selectable at
runtime. For XPL/I programs like HAL/S-FC, the mechanism for selecting such options was originally a Job Control Language (JCL) card such as:
Thus, PARM_FIELD would have had the value 'SYTSIZE=1800,REFSIZE=20000,LISTING2,$I,$V,$U,$W' in such a case.
In XCOM-I, PARM_FIELD
is implemented as a function rather than a variable, but nevertheless
returns data of the kind described. Of course, there is no JCL
supplying such parameters, but the compiled program nevertheless has a
command-line option that does the same job:
|
||||
RECORD_WIDTH(V) |
(XPL/I only.) IR-182-1 doesn't mention it, but it appears to me that RECORD_WIDTH is a new built-in in XPL/I which when given the symbolic name of a BASED variable returns the variable's record size (in bytes). In HAL/S-FC's XPL/I source code (the only available example of use of RECORD_WIDTH ), I find it used only for a couple of BASED RECORD
variables, which makes sense according to the naming, but I have no way
to tell just from those few examples what the actual applicability
was. In XCOM-I, it will work for any BASED variable, RECORD or not. |
|||||
SET_LINELIM(NE)
|
This procedure establishes the number of lines
which will be printed on the SYSPRINT file before an automatic page
eject and header line will be printed. |
(XPL/I only.) See OUTPUT(NE) . |
||||
SHL(NE1,NE2)
|
A function with the numeric value given by shifting the value of NE1 left (logical shift, zeroes appear in the least significant bit position) the number of positions indicated by the value of NE2 . |
One point I found confusing for quite a while is that you'd suppose that logical shifts would operate on BIT variables, and particularly on long-BIT variables, for which XPL provides no conveniences at all for accessing individual bits. Not so! SHL operates only on FIXED values, and can only shift by up to 32 positions. BIT variables are converted to FIXED if shift operations on them are needed. |
||||
SHR(
|
Logical shift right. SHL and SHR are used in conjunction with & , | , and ¬ for masking and data packing. |
But recall that it is preferable to use ~ rather than ¬ with XCOM-I. |
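As an illustration of the masking and data packing mentioned above, here is a small Python sketch of logical shifts on 32-bit patterns. It is not taken from the XCOM-I runtime library; the function names are made up, and results are shown as unsigned bit patterns even though FIXED itself is a signed datatype.

```python
MASK32 = 0xFFFFFFFF  # a FIXED value occupies 32 bits


def shl(value: int, count: int) -> int:
    """Logical shift left of a 32-bit pattern; zeroes enter at the right."""
    return (value << count) & MASK32


def shr(value: int, count: int) -> int:
    """Logical shift right of a 32-bit pattern; zeroes enter at the left."""
    return (value & MASK32) >> count


def pack_bytes(b3: int, b2: int, b1: int, b0: int) -> int:
    """Pack four bytes into one 32-bit word, most-significant byte first."""
    return shl(b3, 24) | shl(b2, 16) | shl(b1, 8) | b0


def unpack_byte(word: int, index: int) -> int:
    """Extract byte `index` (0 = least significant) by shifting and masking."""
    return shr(word, 8 * index) & 0xFF


word = pack_bytes(0x12, 0x34, 0x56, 0x78)
```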
||||
STRING(IDENTIFIER)
|
This function transforms the variable NE (which should be FIXED for proper usage) into a CHARACTER descriptor. NE should have the form:
|
(XPL/I only.) I admit that the description at
left is confusing. The whole concept is confusing. Here's my
take on it: I believe that STRING is used in one of two situations:
It is best to think of To make things a bit more confusing, |
||||
STRING_GT(DE1,DE2)
|
This function returns a TRUE value if the contents of string DE1 is greater than the contents of string DE2 , based on the collating sequence of the characters, irrespective of the lengths of DE1 and DE2 . Otherwise, the value is FALSE . This is functionally equivalent to padding the shorter of DE1 or DE2 with blanks and then comparing the strings. |
(XPL/I only.) If this seems confusing, it may be helpful to recall that when the XPL relational operators < , > , <= , and >= compare two CHARACTER
values, they look first at the lengths of the values, and only if the
lengths are equal do they compare the actual character data. For
example, 'ZZZ' < 'AAAA' is true, because 'ZZZ' has only 3 characters, while 'AAAA' has 4. Presumably STRING_GT was introduced because somebody at Intermetrics didn't think that kind of behavior was great, and thus STRING_GT('ZZZ', 'AAAA') will report instead that 'ZZZ' is greater than 'AAAA'.
The collating sequence in either case is EBCDIC rather than the more-usual ASCII. The primary visually-obvious consequence of this is that digits come after letters rather than before them. |
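The difference between the two kinds of comparison can be sketched in Python. This is an illustration only, not XCOM-I's implementation. It uses Python's cp037 codec as a stand-in for EBCDIC, which is an assumption about the exact code page, though letters, digits, and blanks sort the same way in the common EBCDIC variants. The function names are hypothetical.

```python
def ebcdic(text: str) -> bytes:
    """Encode with Python's cp037 codec, one common EBCDIC code page."""
    return text.encode("cp037")


def xpl_less_than(a: str, b: str) -> bool:
    """XPL '<' on CHARACTER values: lengths first, character data only on a tie."""
    if len(a) != len(b):
        return len(a) < len(b)
    return ebcdic(a) < ebcdic(b)


def string_gt(a: str, b: str) -> bool:
    """STRING_GT as described: blank-pad the shorter string, then compare."""
    width = max(len(a), len(b))
    return ebcdic(a.ljust(width)) > ebcdic(b.ljust(width))


shorter_is_less = xpl_less_than("ZZZ", "AAAA")  # decided by length alone
zzz_is_greater = string_gt("ZZZ", "AAAA")       # decided by collating sequence
digit_after_letter = string_gt("1", "A")        # EBCDIC: digits follow letters
```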
||||
SUBSTR(DE,NE1,NE2)
|
A function with descriptor value specifying the substring of the string specified by DE, starting at position NE1 with length NE2. |
I.e., it allocates and returns a new string created by taking a substring of length NE2 , starting at position NE1 in string DE .
|
||||
SUBSTR(DE,NE)
|
Like the above except that all characters from NE to the end of the string are taken. |
|||||
TIME
|
A function with numeric value given by the time-of-day coded as centiseconds since midnight. |
See the notes for the DATE function above. |
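For readers who want to turn a TIME value into something readable, the arithmetic is just repeated division. The helper names below are made up for illustration and are not part of XPL or XCOM-I.

```python
def centiseconds_to_hms(cs: int):
    """Split a TIME-style value into (hours, minutes, seconds, hundredths)."""
    seconds, hundredths = divmod(cs, 100)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return hours, minutes, seconds, hundredths


def hms_to_centiseconds(hours: int, minutes: int, seconds: int, hundredths: int = 0) -> int:
    """Inverse conversion: centiseconds since midnight."""
    return ((hours * 60 + minutes) * 60 + seconds) * 100 + hundredths


noon = hms_to_centiseconds(12, 0, 0)
last_tick = centiseconds_to_hms(24 * 60 * 60 * 100 - 1)
```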
||||
TIME_OF_GENERATION |
A word variable initialized with the value of TIME during compilation of the program. |
TIME (see above) isn't literally available during compilation. Moreover, XCOM-I
implements this as a function rather than as a variable, which should
be transparent to the programmer. |
||||
TRACE |
A procedure which causes activation of the instruction-by-instruction trace at runtime. |
This is described in quite a lot of detail in
McKeeman's Appendix 2. At present, it is accepted during
compilation but does nothing at runtime in code generated by XCOM-I. |
||||
UNTRACE |
A procedure which turns off run-time trace. |
See TRACE . |
||||
XPL_COMPILER_VERSION |
An array of length 2 — actually a function in the XCOM-I runtime library — in which the first element is the major version of the XPL compiler (XCOM-I in this case), and the second element is the minor version, both in the range 0-99. |
This is just speculation on my part, based on how
an undeclared variable with this name is used when it appears a couple
of times in PASS2 of HAL/S-FC. Nearby code also provides the interesting information that the XPL version name is "TITAN--XPL". You may recall my earlier observation
that Intermetrics merged with Titan Corporation in March 2000, which is
obviously the origin of the "TITAN" portion of the name. The
other interesting point is that no effort is made to distinguish
it in any way from standard XPL. Of course, I have no inkling whatever of how the versions for the Intermetrics/Titan XPL compiler were numbered. I was tempted to set the major version to 32 and the minor version to 0, but that would only serve to obscure the fact that XCOM-I is in fact not TITAN--XPL, nor even related to it other than that I've tried to imitate it. Instead, XCOM-I will always return major and minor versions appropriate to itself: major = 0 for preliminary versions, 1 for the first release, and so on. As an ad hoc measure, XCOM-I also transparently performs string substitutions, replacing 'TITAN--XPL', wherever it may be found, by 'RSB-XCOM-I'. |
As mentioned before, an XPL/I program obtained various services
outside what the XPL/I language proper or runtime library could provide,
by instead making requests to the separate "submonitor" program. The
mechanism was a call of the MONITOR
procedure. For XCOM-I, on the other hand, there is no separate submonitor program, and we may as well think of MONITOR
as being just another built-in runtime-library function. Well, not just any
runtime-library function. A big difference is that it provides a very
large number of functions, each one of which can require its own unique
syntax, thus necessitating a somewhat more-flexible discussion of how to
use it.
The only uniform feature among the many aspects of MONITOR
usage is that each separate function it provides is identified by a number, and such a function number is passed to MONITOR
as its first parameter. My explanations in the table below are mostly
pulled from Chapter 13 of IR-182-1, and then altered according to my
understanding (or lack thereof). Functions 24 through 32 are deduced,
poorly, from the HAL/S-FC BAL source-code file for the submonitor program (which happens to be called "MONITOR").
A number of the MONITOR(...)
functions work with what's called "IBM hexadecimal floating-point"
format, and specifically to the 64-bit (double-precision) version of
that format, as opposed to the 32-bit (single-precision) version of it.
To make the discussion more concise, I'll just refer to it as "DP
floating point".
To be perfectly clear, there is no floating-point datatype in
XPL/I, there are no floating-point literal constants, and there is no
provision whatever to make it convenient for you (the programmer!) to
hard-code such constants into your XPL source code, nor to interpret any
such hexadecimal constant you find within legacy source code. Rather,
you must somehow obtain the hexadecimal equivalents for whatever
floating-point constants you wish to use, and then hard-code those
hexadecimals into your code. For your convenience — or more accurately,
for mine — I've included a little utility called ibmHex.py that
you can use to convert back-and-forth between human-readable
floating-point numbers and DP floating point. Just run ibmHex.py --help
for instructions. This little utility can either be run in a stand-alone fashion, or else imported as a Python module. But I digress!
To understand DP floating point, imagine 8 groups of 8 bits each:
SEEEEEEE FFFFFFFF FFFFFFFF ... FFFFFFFF
where S is the sign, E is the exponent, and F is the fraction. (SP floating point is the same, but with 3 FFFFFFFF-groups rather than 7 of them.) The exponent is a power of 16, biased by 64, and thus represents 16^-64 through 16^63. The fraction is an unsigned number, of which the leftmost bit represents 1/2, the next bit represents 1/4, and so on. As a special case, 0 is encoded as all zeroes.
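For example, the 64-bit hexadecimal pair 0x42640000 0x00000000 parses as:
S = 0 (positive), EEEEEEE = 1000010 (66 decimal, so the exponent is 66 - 64 = 2), and a fraction that begins 01100100 and is otherwise all zeroes (binary 0.011001, or 0.390625 decimal),
or in total, 16^2 × 0.390625, which is 1100100 (binary), or 100 decimal.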
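The decoding rule just described is short enough to write out in Python. To be clear, this is a sketch written from the description above and is not the code of ibmHex.py; the function name decode_ibm_dp is hypothetical.

```python
def decode_ibm_dp(high: int, low: int) -> float:
    """Decode two 32-bit words holding an IBM hexadecimal DP floating-point value."""
    bits = ((high & 0xFFFFFFFF) << 32) | (low & 0xFFFFFFFF)
    sign = -1.0 if bits >> 63 else 1.0
    exponent = ((bits >> 56) & 0x7F) - 64         # power of 16, biased by 64
    fraction = (bits & ((1 << 56) - 1)) / (1 << 56)  # 56-bit fraction in [0, 1)
    return sign * fraction * 16.0 ** exponent


hundred = decode_ibm_dp(0x42640000, 0x00000000)
one = decode_ibm_dp(0x41100000, 0x00000000)
minus_one = decode_ibm_dp(0xC1100000, 0x00000000)
zero = decode_ibm_dp(0x00000000, 0x00000000)
```

Since a Python float carries only 53 bits of precision, a full 56-bit fraction may be rounded slightly by this sketch.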
As in the preceding section, I want to make it clear that the descriptions given here are how the XCOM-I runtime library's MONITOR
functions work, and not how the original MONITOR
functions as confined to an IBM 360 runtime environment worked!
MONITOR Call |
Description From Original Documentation |
Additional Notes |
||||||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
CALL MONITOR(0,n); |
Closes output file n . |
I believe this refers to the sequential file or PDS attached for use with OUTPUT(n) , and not the random-access file (if any) attached for use with FILE(n) . |
||||||||||||||||||||||||
F=MONITOR(1,n,name); |
Assumes that a PDS is attached to output device n . Physically writes any data previously buffered in memory by OUTPUT(n) operations into the PDS's member named name . The name
parameter is a string 8 characters long, padded with blanks as
necessary. Returns 0 if the member hadn't existed previously in
the PDS, or 1 if the contents of an existing member of that name was
replaced. |
|||||||||||||||||||||||||
F=MONITOR(2,n,name); |
Assumes that a PDS is attached to input device n . Sets future INPUT(n) operations to pull data from the PDS's member named name . The name
parameter is a string 8 characters long, padded with blanks as
necessary. Returns 0 if the member was found, or 1 if it was not
found. |
The end-of-data for the member is detected when an input string of 0 length is encountered. IR-182-1 asserts that devices 4 and 7 have the following abnormal ad hoc behavior very specific to HAL/S-FC. A PDS called "INCLUDE" is normally attached to input device 4, while either "INCLUDE" or "OUTPUT6" is normally attached to input device 7. But if member name is sought on device 4 or 7, then name is first sought in "INCLUDE" and, upon failure, is then sought in "OUTPUT6". |
||||||||||||||||||||||||
CALL MONITOR(3,n); |
Closes input file n . |
I believe this refers to the sequential file or PDS attached for use with INPUT(n) , and not the random-access file (if any) attached for use with FILE(n) . |
||||||||||||||||||||||||
CALL MONITOR(4,n,b); |
Changes the record size of random-access file n to b . Must precede the first use of FILE(n) . |
|||||||||||||||||||||||||
CALL MONITOR(5,ADDR(DW)); |
Sets the location of the double-word work area (DW ) for subsequent use by MONITOR(9,...) and MONITOR(10,...) . |
DW is meant to be literal here, at least for HAL/S-FC, and should not be replaced by anything else.
DW refers to an array of FIXED variables used in pairs, to hold 64-bit values. In particular, these pairs are often used to hold double-precision floating-point numbers in IBM System/360 format. MONITOR(9,op) (see below) is then capable of performing various arithmetical operations on those numbers. Since XPL/I itself has no facilities for floating-point variables or operations on them, this workaround is the only available way to employ floating-point numbers in XPL/I programs.
In legacy HAL/S code, I've found that at least 14 words of memory are needed for this work area. In the XCOM-I implementation, the MONITOR(5) call is rejected within 14 words of the end of memory.
|
||||||||||||||||||||||||
F=MONITOR(6,ADDR(basedVariable),n); |
Allocates n bytes of storage in free memory, clearing that memory to 0, and assigns the pointer of the basedVariable to point to that newly-allocated space. Returns 0 upon success, or 1 upon failure. |
It would be natural to suppose that MONITOR(6) is the mechanism used to allocate memory for BASED
variables declared in the XPL/I code. In general, this is not true. Memory for BASED
variables declared in user XPL code comes from a large pre-allocated
contiguous block of memory, and the sub-allocations of individual BASED variables within that large block are managed using means other than MONITOR(6) and MONITOR(7).
Aside: It appears to me that once upon a time, the intention may indeed have been for
My observation is that MONITOR(6) must update more of the basedVariable than just its pointer to its data. Recall that a BASED variable is associated with a 28-byte structure known as a "dope vector", of which the pointer to the data comprises only bytes 0-3. Beyond that, bytes 4-5 give the size (in bytes) of each record, while bytes 8-11 give the total number of records allocated so far. (Not to be confused with the total number of records in use; for growth, more records may have been allocated than were strictly needed at that moment.) While MONITOR(6) may not need an entire dope vector to be present at the address it has been given, it needs the first 12 bytes to be present, so that it can use/update the fields just mentioned. I don't see evidence of that in the submonitor source code, but then I don't fully understand that source code; at any rate, this is how XCOM-I treats it.
Note that MONITOR(6) increases the size of basedVariable by n bytes; it does not set the size to n bytes. | ||||||||||||||||||||||||
F=MONITOR(7,ADDR(basedVariable),n); |
Frees memory previously allocated via MONITOR(6,ADDR(basedVariable),n) . However, basedVariable 's pointer is not changed, and thus will continue to point to the freed area until explicitly changed. |
See the comments for MONITOR(6). If n is less than the total number of bytes allocated for basedVariable, then the number of records available in basedVariable has shrunk, but basedVariable continues to be usable, with a reduced number of records.
Although described as having the address of a BASED variable as a parameter, this isn't entirely true. It is also sometimes the address of a FIXED variable. | ||||||||||||||||||||||||
CALL MONITOR(8, dev, filenum); |
Set PDS DDNAME. |
Apparently, the use of MONITOR(8)
changed after the only documentation of it (in IR-182-1) was produced,
leaving us to try to infer its usage from context in the HAL/S-FC and MONITOR source code. The purpose seems to be to change the association of files to device numbers, à la DD's in JCL, but to do so dynamically during execution rather than statically at program startup. The dev parameter, an integer, would appear to be the logical device number as used in an XPL INPUT(dev) or OUTPUT(dev) (or in a HAL/S READ(dev) or WRITE(dev)).
The filenum parameter is trickier for me to understand, because I'm unclear how DD in JCL associates a dataset name with a file number, so take what I say with a grain of salt. My inference, subject to change, is that if you have a dataset identified with the DD name "INPUTn" (where n is a digit), then that is filenum=n; whereas a dataset named "OUTPUTn" is associated with filenum="80000000"|n. Recall that "80000000" is XPL-speak for 0x80000000. So the filenum is implicit in the dataset name, and the most-significant bit is used to indicate the direction of data flow.
For example, OUTPUT(4) invocations might normally output data to OUTPUT4, but you could associate it instead with OUTPUT2 via
CALL MONITOR(8, 4, "80000000" | 2);
and then reassociate it later with OUTPUT4 via
CALL MONITOR(8, 4, "80000000" | 4);
The description "Set PDS DDNAME" that I've given comes from the associated program comment in MONITOR.bal, the source code for the MONITOR procedure. It would seem to imply that it has something specifically to do with Partitioned Data Sets (PDS). It's difficult to see why that would be so; I think it merely reflects the specific usage in HAL/S-FC, which indeed relates to PDS. The XCOM-I runtime library allows it to be used for any PDS or sequential dataset. It is not applicable to random-access files.
Of course, there is no JCL, and consequently no DD statements, associated with an XPL or XPL/I file compiled with XCOM-I, nor with a HAL/S file compiled with HAL/S-FC. The description above has to be applied instead to command-line switches in a hopefully-obvious way. |
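The inferred filenum encoding can be expressed as a pair of tiny Python helpers. Bear in mind that the encoding itself is an inference from context, as noted above, so this sketch inherits that uncertainty; the helper names are hypothetical and do not exist in XCOM-I.

```python
OUTPUT_FLAG = 0x80000000  # written "80000000" in XPL source


def encode_filenum(n: int, is_output: bool) -> int:
    """Build the inferred filenum for dataset INPUTn or OUTPUTn."""
    if not 0 <= n <= 9:
        raise ValueError("n is expected to be a single digit")
    return (OUTPUT_FLAG | n) if is_output else n


def decode_filenum(filenum: int) -> str:
    """Return the inferred DD name, e.g. 'OUTPUT2' or 'INPUT5'."""
    direction = "OUTPUT" if filenum & OUTPUT_FLAG else "INPUT"
    return f"{direction}{filenum & 0x7FFFFFFF}"


to_output2 = encode_filenum(2, is_output=True)  # third argument of MONITOR(8, 4, ...)
name_back = decode_filenum(0x80000004)
```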
||||||||||||||||||||||||
F=MONITOR(9,op); |
Performs floating point evaluation as specified by value of op . Operands are obtained from work area whose address was set up via a MONITOR(5)
call. The first operand is taken from the first double word of the work
area and the second operand from the second double word. The result is
placed in the first double word of the work area. The return code is 0
if the operation succeeds, or 1 if the operation fails (under or
overflow).
The values of op are:
|
"arg1" refers to the DP floating-point number stored in the pair of work-area values DW(0),DW(1).
"arg2" refers to the DP floating-point number stored in the pair of work-area values DW(2),DW(3).
The DP floating-point result of the operation is stored back into DW(0),DW(1), although not all of the operations can necessarily produce results of full DP accuracy from DP operands.
The angular unit for trigonometric operations is the radian. Note that the values stored in these registers of the work area are in the binary format of "IBM hexadecimal floating-point", which is not used anywhere else in XPL memory (unless copied from the work area). |
||||||||||||||||||||||||
F=MONITOR(10,string); |
Performs character to DP floating-point conversion upon characters in string .
The return code is 0 if the result is valid, or 1 if conversion was not
possible. The result is placed in the first double word of the work
area provided by the MONITOR(5) call. |
In other words, string is interpreted as a DP floating-point number and stored in DW(0),DW(1) . Any of the usual representations for decimal numbers are accepted in the string, including the usual E notation for exponents. |
||||||||||||||||||||||||
CALL MONITOR(11); |
No-op. |
|||||||||||||||||||||||||
string=MONITOR(12,p); |
Converts floating-point number in the first double word of the work area to standard HAL/S character form. Value of p indicates whether the operand is SP (p =0) or DP (p =8). |
In other words, this is the inverse of MONITOR(10,string) (see above).
As far as the "standard HAL/S character form" is concerned, it's described in HAL/S documentation (Programming in HAL/S, p. 3) as follows:
|
||||||||||||||||||||||||
address=MONITOR(13,name); |
Performs DELETE of current option processor and then LOADs an option processor specified by name .
The option processor loaded is called and passed a pointer to the PARM
field in effect at the time of compiler invocation. The option processor
passes the PARM field and establishes an options table (see Chapter 9
[of document IR-182-1]) whose address is passed back as a return
value. If name is a null string, the pointer to the existing options table is returned. |
See the earlier discussion concerning program options.
In brief, each XPL or XPL/I program has some defined set of "Type 1"
options and "Type 2" options. The former are binary (on/off)
options, whereas the latter are options with integer or string values.
MONITOR(13,name) returns an address which is used by the XPL program to find options settings for it that have been set up by the run-time library. In C terms, it's the equivalent of argv[] but with more behind-the-scenes interpretation of the parameters. Those program options would originally have been provided by the Job Control Language (JCL) which initiated execution of the program, by means of a "PARM field" given on one of the JCL cards. For example, in the JCL
//XPL EXEC PGM=MONITOR,
the PARM field is the string 'SYTSIZE=1800,REFSIZE=20000,LISTING2,$I,$V,$U,$W', and the comma-delimited items are the individual options selected. Presumably, the operating system wouldn't have been able to directly interpret the meaning of this PARM field on its own, since the types of options would have been specific to the program being run. So an "option processor" specific to the allowed parameters must be loaded to perform that interpretation. In the case of the Intermetrics version of HAL/S-FC, the particular options processor for compiler pass 1 was name='COMPOPT ', whereas it was an empty string for passes 2 and 3, and 'LISTOPT ' for pass 4. (Note the trailing spaces to make the non-empty strings come out to 8 characters. These trailing spaces are not required in XCOM-I.) Again, just to be doubly clear, MONITOR(13) provides access to options settings not for XCOM or XCOM-I, but rather for the XPL program being compiled by XCOM-I, and specifically for HAL/S-FC. XCOM-I supports the pre-existing built-in options processors 'COMPOPT', 'LISTOPT' and 'MONOPT'. (MONOPT provides the parameters described in McKeeman, but they are passed to the XPL/I program in the Intermetrics manner rather than being confined to the submonitor as in McKeeman.) Plus, it is possible (as described in the discussion of program options) for the user to define their own custom options processor called 'USEROPT'. Whenever MONITOR(13) is called, it saves the name and the PARM field to a file called "monitor13.parms". If called with an empty name, or more typically as MONITOR(13,0), it simply reloads the contents of monitor13.parms and reuses those.
The address returned by MONITOR(13) is a pointer in memory to the beginning of a block of 6 consecutive FIXED values:
|
||||||||||||||||||||||||
F=MONITOR(14,n,a); |
For creating Simulation Data Files. The value of n selects a function; the value of a supplies supplementary data:
|
Unfortunately, the only documentation of this is from 1976, and it may be somewhat preliminary in nature. For example, from the source code of HAL/S-FC (see the OUTPUT_SDF procedure in PASS3), the function numbers (n) appear to be 0, 4, and 8 (rather than 0, 1, and 2 as it says in the description in the neighboring column).
It appears to me that it's used somewhat like this:
Thus you start with a presumably-empty PDS (which happens to be
attached to output device 5), and end up with a PDS populated with lots
of named members, each one of which consists of some integral number of
"pages". |
||||||||||||||||||||||||
I=MONITOR(15); |
Returns Revision Level and Catenation Number from last MONITOR(2)
call. Catenation number is obtained from PDS directory data and
Revision Level from user data field as specified in the HAL/SDL ICD. The
values are returned in the left and right halfwords of the result. |
I'm not fully up-to-speed as to what the
"revision level" and "catenation number" are, or how to obtain
them. Both terms refer to the properties of Partitioned Data Sets
(PDS). At present, the XCOM-I runtime library merely
returns the hexadecimal value F0F00000. As far as I know, the
values returned by this function are only for printing on compiler
reports, so the value returned may not be terribly significant. What follows is my (poor!) attempt to understand what is supposed to be going on. For this discussion, think of a PDS as being a directory on a disk drive, and think of the members of the PDS as being files within that directory.
Regarding potential future implementation of these concepts in the XCOM-I runtime library, I think it would involve prefixing the revision level to the filenames in which PDS member data is stored. |
||||||||||||||||||||||||
CALL MONITOR(16,n); |
Sets flags in byte to be returned as high
order byte of return code at end of compilation. Flags are passed as
rightmost byte of full word n . If high order bit of n is zero, flags are OR'ed into existing flags. If high order bit of n is one, flags replace existing flags. |
The flags are indeed accumulated in a global variable called flags16 internal to the XCOM-I runtime library, but are not currently used as an exit code. |
||||||||||||||||||||||||
CALL MONITOR(17,name); |
Causes name to be copied to third parm field (if any) passed to MONITOR by the program that invoked the compiler. See HAL/SDL ICD. |
The name is presently saved in a variable called programNamePassedToMonitor within the XCOM-I runtime library, but is not used for anything. |
||||||||||||||||||||||||
T=MONITOR(18); |
Returns elapsed CPU time since beginning of run in units of .01 seconds. |
|||||||||||||||||||||||||
F=MONITOR(19,addressList,sizeList); |
This is an extended form of MONITOR(6) (see above), accepting arrays of addresses of BASED variables and their block sizes. Unlike MONITOR(6) , the newly-allocated blocks of memory are not cleared to 0. |
XCOM-I does not implement this function. Besides, it's unclear how to represent a "list". |
||||||||||||||||||||||||
CALL MONITOR(20,
|
This is an extended form of MONITOR(7) (see above), accepting arrays of addresses of BASED variables and their block sizes. |
XCOM-I does not implement this function. | ||||||||||||||||||||||||
I=MONITOR(21); |
Returns remaining amount of memory (in bytes) that's free for allocation via MONITOR(6) . |
For Intermetrics XCOM, I believe, MONITOR(6) could use any contiguous block of free memory, including those that were candidates for garbage collection, whereas XCOM-I only allocates memory within the contiguous block at the end of the free-memory space. Consequently, this function would originally have found the largest free block wherever it was located, whereas XCOM-I merely reports the size of the final block of free memory. |
||||||||||||||||||||||||
|
Calls SDF access package. |
Note: This function is not yet implemented in the XCOM-I
runtime library. However, unlike other unimplemented functions
(whose use results in a runtime abend), this function can be used
without negative effect other than printing an informative message on
stderr. SDF stands for Simulation Data File. The "SDF access package" (SDFPKG) was a separate program, similar to the submonitor itself, with which the XPL/I program could communicate via MONITOR(22) . And like the submonitor, in the XCOM-I
framework, SDFPKG equivalents are built into the runtime library rather
than being provided by a separate SDFPKG program. Documentation can be found in TBD. There is not space for a full description here, so I'm only giving a brief summary; the documentation should be consulted for more details. The first form, , initializes the SDFPKG functionality. The parameter points to an array of 30 FIXED values which serves as a "communications area" holding data for subsequent calls. For calls of the second form, F=MONITOR(22,n1) , the n1 parameter specifies a function number ("mode"):
Additionally, the function number n1 may have high-order bits set, such as 0x10000006 or 0x80000007. The significance of that is TBD. The return code for all calls is 0 upon success, or non-zero for various failures. See section 11 of the documentation. |
||||||||||||||||||||||||
descriptor=MONITOR(23); |
Returns the descriptor of the 10-character string obtained from
the ID field of the File Control Block of the first phase of the
compiler. The ID field is maintained by the XPLZAP program and contains the identifying string printed on the header of each page of the HAL listing.
|
I think that what the original documentation was trying to get at is that HAL/S-FC's version code, printed on each of the report pages printed by XCOM (or now, XCOM-I), is given by this ID string, and that the ID string is provided to each pass of HAL/S-FC via a call to MONITOR(23). Of course, in the modern reimplementation, there is no XPLZAP program, and it wouldn't be compatible with XCOM-I if there were one. In XCOM-I, the ID string itself is by default 'REL32V0 ', but it can be changed via the XCOM-I command-line parameter --identifier. As for what a "descriptor" is, see the description of the CHARACTER datatype. In brief, the return value is not itself the ID string, but an integer value (i.e., of the FIXED datatype) that provides an index for finding the ID string in memory. |
||||||||||||||||||||||||
MONITOR(24) |
Read a block of a load module. |
TBD |
||||||||||||||||||||||||
MONITOR(25) |
Read a mass-memory load block. |
TBD |
||||||||||||||||||||||||
MONITOR(26) |
Read a MAF (memory analysis file) block |
TBD |
||||||||||||||||||||||||
MONITOR(27) |
Write a MAF block |
TBD |
||||||||||||||||||||||||
MONITOR(28) |
Link to dump analysis service routine |
TBD |
||||||||||||||||||||||||
MONITOR(29) |
Return current page number |
TBD |
||||||||||||||||||||||||
MONITOR(30) |
Return JFCB as string |
TBD |
||||||||||||||||||||||||
CALL MONITOR(31
|
Virtual-memory lookahead service. |
This function is seemingly only used internally by the virtual-memory system. In looking at the usage in actual XPL/I code, as well as the assembly-language source code for the submonitor (see VMEMSRV in MONITOR.bal),
which admittedly I can only follow very imperfectly, it appears
to me that this function is used in three distinct ways:
bufferAddress in MONITOR(31,bufferAddress,recnum) may
sometimes have its most-significant bit set to 1, thus turning it into a
negative number. I think in that case what happens is that the
function waits for any background read-operation already in progress to
complete, and then clears the sign bit of bufferAddress and commences a new read operation. The XCOM-I runtime library doesn't really need any of these subtleties, so it implements this function basically as immediate FILE operations, as appropriate. |
||||||||||||||||||||||||
F=MONITOR(32) |
Find out subpool minimum size |
The memory-allocation and -deallocation functions (MONITOR
6, 7, 19, and 20) are supposed to get their memory from something
called "SUBPOOL 22". SUBPOOLs apparently have block-sizes which
are either 2048 or 4096 bytes, depending on the operating system. MONITOR(32)
returns the appropriate block size for the operating system.
Given that the block size is bigger in later versions of the operating
system, I infer that bigger is better, within certain limits, and hence XCOM-I arbitrarily returns 4096. |
||||||||||||||||||||||||
MONITOR(33) |
Find out FILE max REC# and BLKSIZ |
TBD |
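To make the MONITOR(16) rule from the table above concrete, here is a minimal C sketch of the accumulate-or-replace logic as the documentation describes it. It is only an illustration under assumptions: the function name monitor16 is hypothetical, and although the text says the runtime library keeps a global called flags16, its actual type and handling there may differ from what is shown.

```c
#include <stdint.h>

/* Stand-in for the runtime library's accumulated flag byte.  The name
   flags16 comes from the text; the uint8_t type is an assumption. */
static uint8_t flags16 = 0;

/* Hypothetical sketch of CALL MONITOR(16,n).  The flags travel in the
   rightmost byte of n.  If the high-order bit of n is 0, they are OR'ed
   into the existing flags; if it is 1, they replace the existing flags. */
void monitor16(uint32_t n)
{
    uint8_t newFlags = (uint8_t)(n & 0xFFu);

    if (n & 0x80000000u)
        flags16 = newFlags;      /* high-order bit set: replace */
    else
        flags16 |= newFlags;     /* high-order bit clear: accumulate */
}
```

Calling monitor16(0x01) and then monitor16(0x04) would leave both flag bits set, while a later monitor16(0x80000002) would discard them and keep only the new flag.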
It is admittedly unlikely that many people will be writing new XPL or
XPL/I programs nowadays, and will instead be compiling only legacy XPL
or XPL/I programs ... assuming that I haven't already compiled all of
them first! (Which is a distinct possibility.) Since such legacy
programs will presumably all have been debugged decades before, there's
not as much need for a debugger as there is for computer languages in
which there are many active developers. Nevertheless, XPL or XPL/I
programs can be run under a debugger to a certain extent.
-ggdb, and incorporating it can be done simply by adding it to the make command, like so:
make EXTRA=-ggdb ...
Having done this, any debugger which can accept gdb-style debugging info, such as gdb itself or full-screen wrappers for gdb, can now be used for debugging. For other C compilers or other debuggers, perhaps different switches would be needed.
memory
.
Moreover, the format of this data in "memory" is that of the IBM
System/360 rather than the native format of the computer on which
debugging is being performed. While in my opinion these design
choices for XCOM-I were all necessary and unavoidable,
there's no denying that they make it trickier to use a debugger to see
how the values of variables change during execution.
CALL INLINE
feature to directly embed calls to these functions within your program,
without running a debugger at all, if it's more convenient for you to
do so.
void printMemoryMap(char *heading)
— This function prints out the entire current state of the memory map.
I.e., the addresses of all variables, including dynamically-allocated
ones, and the contents of all of those variables. Of course, for an XPL
program of any complexity, the printout is quite long, so this function is
presumably used sparingly. The heading
parameter is simply a message printed at the top, which can be helpful if you call printMemoryMap several times (perhaps via CALL INLINE) in the same program run. In a gdb console, you could run it via the command call printMemoryMap("..."), whereas you could instead embed it in your XPL source code via CALL INLINE('printMemoryMap("...")');.
char *getXPL(char *identifier)
— Returns (as a C string) the value of a single XPL variable whose name is given by the identifier
. The identifier
string can be any identifier expression that's syntactically correct in
XPL/I, provided that subscripts consist entirely of decimal digits,
possibly with a leading minus sign. This includes expressions like "V"
, "V(5)"
, "B(3).V"
, or "B(3).V(-5)"
. Recall that in XPL/I, subscripts can be applied to scalar variables. If you are querying a BASED RECORD
, then be sure to include the desired field, since while getXPL
knows how to print an individual field of a RECORD
, it does not know how to print a collection of fields such as a RECORD
. In a gdb console, you could use getXPL with a command like print getXPL(...). For example, print getXPL("C1(4)").
void printXPL(char *identifier)
— This provides the same functionality as getXPL
, except that it prints its output to stdout
rather than returning it as a string. In a gdb console, you could run it via the command call printXPL("..."), or you could instead embed it in your XPL source code via CALL INLINE('printXPL("...")');.
int bitBits
— By default, getXPL
and printXPL
print the data of a BIT
variable in hexadecimal notation. In contrast, legacy XPL source code has often logically partitioned the data in BIT
strings into subgroupings of 1, 2, or 3 bits (rather than 4 as for
hexadecimal), thus using literal constants (such as initializers in
declarations) that are in binary, base-4, or octal notation. In those
cases, it's difficult to relate hexadecimal strings returned by getXPL
with the literals shown in the XPL source code. The global variable bitBits
addresses this by allowing you to change the radix used for the BIT
data. By default bitBits
is 4 (hexadecimal), but can be changed to 1 (binary), 2 (base-4), or 3 (octal). In a gdb console, you could change it via a command like set bitBits=2.
In using these debugging functions, note that they all require mangled forms of variable names and parameters of PROCEDUREs. Mangled names consist of the names of the variables or parameters as DECLAREd in the XPL source code, but prefixed by the names of all of the parent PROCEDUREs. Perhaps an example would make this clearer. Suppose your XPL source code looked like the following:
DECLARE X FIXED, Y FIXED, Z FIXED;
PROC1:
PROCEDURE(X, Y);
   DECLARE X FIXED, Y FIXED, Z FIXED;
   PROC2:
   PROCEDURE(X, Y);
      DECLARE X FIXED, Y FIXED, Z FIXED;
      ...
   END PROC2;
END PROC1;
Then the mangled variable and parameter names we'd find in our memory map (and, just for reference, the PROCEDURE names), which are also the names we'd use in our debugging functions, would be:
X
Y
Z
PROC1
PROC1xX
PROC1xY
PROC1xZ
PROC1xPROC2
PROC1xPROC2xX
PROC1xPROC2xY
PROC1xPROC2xZ
Thus while we have lots of global and local variables and function parameters named X, Y, and Z, their mangled names are all distinct.
Aside: By the way, the lower-case 'x' characters appearing in the mangled names have nothing to do with the fact that one of our identifiers is 'X'. That's a coincidence. Rather, they're just convenient separators XCOM-I conventionally uses between scope names and the variable names. Recall that XPL identifiers are case-insensitive. XCOM-I translates them all internally to upper case. Hence, lower-case 'x' is not a character that can appear in unmangled identifiers or names of scopes.
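The mangling rule just described can be sketched in a few lines of C. This is only an illustration of the naming convention, not code from XCOM-I: the function mangleName and its parameters are hypothetical, and the sketch assumes the scope names and identifier have already been converted to upper case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build a mangled name by joining the names of the
   enclosing PROCEDUREs (outermost first) and the identifier itself with
   lower-case 'x' separators.  Returns 0 on success, or -1 if the output
   buffer is too small. */
int mangleName(char *out, size_t outSize,
               const char *scopes[], int numScopes,
               const char *identifier)
{
    size_t used = 0;
    int i;

    if (out == NULL || outSize == 0)
        return -1;
    out[0] = '\0';
    for (i = 0; i <= numScopes; i++)
    {
        /* The last part appended is the identifier itself. */
        const char *part = (i < numScopes) ? scopes[i] : identifier;
        size_t len = strlen(part);
        size_t sep = (i > 0) ? 1 : 0;

        if (used + sep + len + 1 > outSize)
            return -1;
        if (sep)
            out[used++] = 'x';
        memcpy(out + used, part, len);
        used += len;
        out[used] = '\0';
    }
    return 0;
}
```

With the scopes PROC1 and PROC2 and the identifier X, this produces PROC1xPROC2xX, which is the kind of string you would then hand to getXPL or printXPL. With no enclosing scopes, the identifier comes back unchanged, as for the global X, Y, and Z.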
Aside: In reality, such inlined code is numerical in nature, and consists of IBM 360 machine code ... whereas the term "assembly language" implies that the insertions are symbolic in nature rather than just numbers. I'll continue to call the insertions BAL, because they undoubtedly began as BAL before somebody assembled them to determine which numerical values to use. But it is certainly technically incorrect to do so.
Of course, one possible approach would be to replace offending inline code directly by C code in the XPL source-code files of the program being compiled, using the property of the XCOM-I built-in INLINE function,
CALL INLINE('...arbitrary C code of no more than 256 characters...');
In your use case, whatever that may be, perhaps that's adequate. However, the Virtual AGC Project aims to provide a way to work with legacy code (i.e., code from the past, rather than newly-written code), and that implies being able to compile such source code as-is, without syntactical or functional modification to source code. Which means that from our point of view, irreversible modification of legacy source code is unacceptable.
Aside: The extent to which modifications are acceptable is, of course, arguable. The Virtual AGC Project allows insertion of modern comments, as long as modern comments are clearly distinguishable from original program comments. It has occasionally allowed clearly-commented simple changes of a line or two of code, when it was thought impossible for modern compilers to duplicate some undocumented behavior of the original compilers ... although improvements to the modern compilers almost always eventually allowed reversal of such modifications. My personal judgment is that replacement of dozens or hundreds of lines of code by code that's not obviously functionally identical, written in another language, is taking it farther than I'd like. Better to leave the source code the same, so that it's verifiable by the user.
Don't worry! XCOM-I lets us satisfy these criteria. But to understand how, you need to know a little more about this IBM Basic Assembly Language (BAL) code that's causing the problem.
CALL INLINE statements that we need to work around if we want to compile and run a version of XCOM45 with XCOM-I:
SCAN_FINDS_END_OF:
PROCEDURE(TABLE) BIT(1);
   DECLARE TABLE FIXED;
   DECLARE TRT(1) FIXED INITIAL("DD003000", "10000000");
   /* THIS PROCEDURE ASSUMES CP = 0 WHEN IT IS CALLED */
   CALL INLINE("58", 3, 0, TEXT); /* LOAD STRING DESCRIPTOR */
   CALL INLINE("1B", 2, 2); /* CLEAR REG. 2 */
   CALL INLINE("19", 3, 2); /* CHECK FOR TEXT = NULL STRING */
   CALL INLINE("07", 8, 12); /* RETURN FALSE IF TEXT IS NULL */
   CALL INLINE("8D", 2, 0, 0, 8); /* SHIFT LENGTH FIELD TO REG. 2 */
   CALL INLINE("88", 3, 0, 0, 8); /* RESTORE STRING ADDRESS */
   CALL INLINE("58", 1, 0, TABLE); /* LOAD BASE ADDRESS OF TRANS_TABLE */
   CALL INLINE("41", 12, 0, TRT); /* LOAD ADDRESS OF TRT INSTRUCTION */
   CALL INLINE("44", 2, 0, 12, 0); /* EXECUTE TRT INSTRUCTION */
   CALL INLINE("05", 12, 0); /* LOAD BRANCH REGISTER */
   CALL INLINE("47", 7, 0, 12, 18); /* BRANCH ON CC~=0 */
   CALL INLINE("18", 1, 2); /* LOAD REG. 1 WITH LENGTH(\) - 1 */
   CALL INLINE("41", 0, 0, 0, 1); /* LOAD REG. 0 WITH 1 */
   CALL INLINE("1A", 1, 0); /* ADD TO OBTAIN IDENTIFIER LENGTH */
   CALL INLINE("1B", 3, 3); /* PUT 0 INTO RETURN REGISTER */
   CALL INLINE("47", 15, 0, 12, 24); /* BRANCH AROUND NEXT TWO INSTR*/
   CALL INLINE("1B", 1, 3); /* SUBTRACT TO OBTAIN IDENTIFIER LENGTH */
   CALL INLINE("41", 3, 0, 0, 1); /* LOAD 1 INTO RETURN REGISTER */
   CALL INLINE("50", 1, 0, CP); /* STORE IDENTIFIER LENGTH */
END SCAN_FINDS_END_OF;
Actually, to be clear, XCOM-I will compile XCOM45 just fine, whether we work around this embedded assembly-language stuff or not; the compiled program simply won't run correctly, because vital IBM 360 instructions will be missing in the midst of the program. As it happens, though, part of the process of creating the workaround depends on us compiling XCOM45 in advance, without any workarounds in place. So let's just do that. In the compilation process, the XPL procedure SCAN_FINDS_END_OF turns into a C file called SCANxSCAN_FINDS_END_OF.c that contains the following harmless but useless C representation of it:
/*
  File SCANxSCAN_FINDS_END_OF.c generated by XCOM-I, 2024-05-19 12:14:14.
*/
#include "runtimeC.h"
#include "procedures.h"
bit_t *
SCANxSCAN_FINDS_END_OF(void)
{
  ; // (0) CALL INLINE( 88, 3, 0, TEXT);
  ; // (1) CALL INLINE( 27, 2, 2);
  ; // (2) CALL INLINE( 25, 3, 2);
  ; // (3) CALL INLINE( 7, 8, 12);
  ; // (4) CALL INLINE( 141, 2, 0, 0, 8);
  ; // (5) CALL INLINE( 136, 3, 0, 0, 8);
  ; // (6) CALL INLINE( 88, 1, 0, TABLE);
  ; // (7) CALL INLINE( 65, 12, 0, TRT);
  ; // (8) CALL INLINE( 68, 2, 0, 12, 0);
  ; // (9) CALL INLINE( 5, 12, 0);
  ; // (10) CALL INLINE( 71, 7, 0, 12, 18);
  ; // (11) CALL INLINE( 24, 1, 2);
  ; // (12) CALL INLINE( 65, 0, 0, 0, 1);
  ; // (13) CALL INLINE( 26, 1, 0);
  ; // (14) CALL INLINE( 27, 3, 3);
  ; // (15) CALL INLINE( 71, 15, 0, 12, 24);
  ; // (16) CALL INLINE( 27, 1, 3);
  ; // (17) CALL INLINE( 65, 3, 0, 0, 1);
  ; // (18) CALL INLINE( 80, 1, 0, CP);
  return 0;
}
Basically, XCOM-I has simply commented out each CALL INLINE statement, while replacing it with an empty statement. So why did we bother compiling it at all? The answer is that we want the parenthesized numbers that have been added to the comments! The parenthesized numbers will be the same every time we compile XCOM45, unless you take it upon yourself to mess us up by removing some of the CALL INLINE statements or adding new ones in the meantime. Pro tip: Don't do that!
Aside: There is even one (!) instance in HAL/S-FC in which there is a RETURN INLINE statement as well, and it is treated by XCOM-I the same way as CALL INLINEs are treated: namely, it is numbered and commented out. As it happens, McKeeman does not give even the slightest hint that INLINE can return a value. In point of fact, I don't think the value returned by the RETURN INLINE statement in the case I mention even comes from the INLINE at all, but is instead generated by other means. Technically, I think that if you have a block of CALL INLINEs terminated by a RETURN INLINE, then whatever value happens to remain in the CPU's R0 register will be returned, regardless of where in the block of INLINEs R0 was set up. But regardless of the underlying details, it demonstrates that syntactically, both CALL INLINE and RETURN INLINE are possible in XPL/I source code.
Important note: If you must write a C-language patch-file for a block of INLINEs that include an XPL RETURN INLINE(...), you should use the C macro RETURN() rather than simply return, or RETURN(something) rather than return something. That's because the RETURN(...) macro takes into account extra code which XCOM-I may need to insert in conjunction with XPL RETURN statements. For example, if XCOM-I is run with its --reentry-guard switch, then XCOM-I inserts code at the entry point and exit points of each XPL PROCEDURE that allows the runtime library to detect illegal reentries. (Recall that recursion is illegal in XPL.) The RETURN(...) macro handles this for you transparently, and if a simple return is used instead, then the runtime library will falsely detect recursion the second time any individual PROCEDURE is executed.
In this particular example, since XCOM45 has a single block of CALL INLINE statements, we were guaranteed in advance that the block would start with number 0. In a more-complex example like HAL/S-FC, there would be a number of blocks of inlines, strewn throughout the source code, and in lieu of XCOM-I printing them all out for us, the numbering would be unpredictable. But because the numbers never change, we can use them as unique, persistent identifiers for the CALL INLINE statements.
CALL INLINE
statements is to create a series of numbered "patch files", written in
C: patch0.c, patch1.c, patch2.c, and so on. The patch files
are placed in the same folder that contains the source code of the XPL
program being compiled, though XCOM-I does have a command-line
option (--patch=PATHNAME
)
that allows a different folder to be specified. When XCOM-I encounters a CALL INLINE
statement, it looks for a patch file with the same number, and if
it finds one, it directly inserts the contents of the patch file into
the object code. If there's no matching patch file, then the CALL INLINE
simply remains a comment. In principle, you could have a different patch file for each CALL INLINE
, but it's usually more practical and more aesthetically pleasing to have a single patch file for each contiguous block of CALL INLINE
statements.
Note: In working with the source code for HAL/S-FC specifically, it's a little more complex than what I just indicated. As I've mentioned previously, there are two different methods of compiling HAL/S-FC source code: one to create a compiler intended to compile Primary Flight Software (PFS) via XCOM-I's --cond=P command-line switch, and one to create a compiler intended to compile Backup Flight Software (BFS) via XCOM-I's --cond=B command-line switch. Unfortunately, the numbering of the CALL INLINE statements is a bit different in these two cases, which makes it unwieldy to rely on the generic naming patchN.c for the patch files. My description above is what happens when neither --cond=P nor --cond=B is used. But if --cond=P is used, then XCOM-I actually looks for patch files with names like patchNp.c first, falling back to patchN.c if patchNp.c isn't found. And similarly, if --cond=B is used, it prefers patch files named patchNb.c, falling back to patchN.c. This method allows the generic names patchN.c to be used up to the point at which the numbering of the INLINEs begins to differ, switching to the patchNp.c/patchNb.c naming thereafter, without any ambiguity or conflicts between the two.
In this real-life example, since there's a single block of CALL INLINE statements, we need a single patch file, patch0.c. Now that we actually have a patch file, when XCOM-I again compiles XCOM45, the
/*
* This is a C-language "patch" for CALL INLINE #0 in xcom4.xpl, as compiled
* by XCOM-I.py.
*
* The patch spans the entirety of the `SCAN_FINDS` procedure embedded in
* the `SCAN` procedure. This comprises every `CALL INLINE` statement in
* the entire xcom4 program.
*
* What the procedure does is this: It's given the address of a "translation
* table" of one of the following types:
* `BLANKTABLE` (Blanks)
* `ALPHATABLE` (Identifier)
* `STRINGTABLE` (Quoted string)
* `COMMENTABLE` (Comment)
* Starting at position `CP=0` (already set up upon entry so that we know the
* first character is of the asked-for type) in the string `TEXT`, searches
* for the end of the pattern of the selected type, and updates `CP` to the
* length of the found string. It returns 0 on failure and 1 on success. An
* empty string counts as a failure.
*
* This was apparently coded in BAL because the IBM 360 has a translation-table
* instruction, TRT, that does this efficiently. But from my perspective,
* considering that there's no other embedded BAL code in the entire program,
* it seems as though it's a big price to pay, maintenance-wise and
* portability-wise, for a pretty small gain in efficiency. Yes, you can
* always make assembly-language go faster than a high-level language; this is
* news?
*/
static memoryMapEntry_t *mapCP = NULL, *mapTEXT = NULL, *mapTABLE,
*mapBLANKTABLE, *mapALPHATABLE, *mapSTRINGTABLE, *mapCOMMENTABLE;
char *s, *TEXT;
uint32_t TABLE;
if (mapCP == NULL)
{
mapCP = lookupVariable("CP");
mapTEXT = lookupVariable("TEXT");
mapTABLE = lookupVariable("SCANxSCAN_FINDS_END_OFxTABLE");
mapBLANKTABLE = lookupVariable("BLANKTABLE");
mapALPHATABLE = lookupVariable("ALPHATABLE");
mapSTRINGTABLE = lookupVariable("STRINGTABLE");
mapCOMMENTABLE = lookupVariable("COMMENTABLE");
}
TABLE = getFIXED(mapTABLE->address);
TEXT = descriptorToAscii(getCHARACTER(mapTEXT->address));
if (*TEXT == 0)
return fixedToBit(1, 0);
if (TABLE == mapBLANKTABLE->address)
{
for (s = TEXT; *s == ' ' || *s == '\t'; s++);
}
else if (TABLE == mapALPHATABLE->address)
{
for (s = TEXT;
isalnum(*s) || *s == '_' || *s == '@' || *s == '#' || *s == '$';
s++);
}
else if (TABLE == mapSTRINGTABLE->address)
{
for (s = TEXT; *s && *s != '\''; s++);
}
else if (TABLE == mapCOMMENTABLE->address)
{
for (s = TEXT; *s && *s != '*' && *s != '$'; s++);
}
else
abend("Requested translation table for SCAN_FINDS_END_OF not found");
if (*s == 0)
{
putFIXED(mapCP->address, s + 1 - TEXT);
return fixedToBit(1, 0);
}
putFIXED(mapCP->address, s - TEXT);
return fixedToBit(1, 1);
SCAN_FINDS_END_OF procedure is now translated to the following C code:
Notice that the patch is automatically placed within a C
/*
File SCANxSCAN_FINDS_END_OF.c generated by XCOM-I, 2024-05-25 14:05:21.
*/
#include "runtimeC.h"
descriptor_t *
SCANxSCAN_FINDS_END_OF(void)
{
{ // (0) CALL INLINE ( 88, 3, 0,TEXT);
/*
* This is a C-language "patch" for CALL INLINE #0 in xcom4.xpl, as compiled
* by XCOM-I.py.
*
* The patch spans the entirety of the `SCAN_FINDS` procedure embedded in
* the `SCAN` procedure. This comprises every `CALL INLINE` statement in
* the entire xcom4 program.
*
* What the procedure does is this: It's given the address of a "translation
* table" of one of the following types:
* `BLANKTABLE` (Blanks)
* `ALPHATABLE` (Identifier)
* `STRINGTABLE` (Quoted string)
* `COMMENTABLE` (Comment)
* Starting at position `CP=0` (already set up upon entry so that we know the
* first character is of the asked-for type) in the string `TEXT`, searches
* for the end of the pattern of the selected type, and updates `CP` to the
* length of the found string. It returns 0 on failure and 1 on success. An
* empty string counts as a failure.
*
* This was apparently coded in BAL because the IBM 360 has a translation-table
* instruction, TRT, that does this efficiently. But from my perspective,
* considering that there's no other embedded BAL code in the entire program,
* it seems as though it's a big price to pay, maintenance-wise and
* portability-wise, for a pretty small gain in efficiency. Yes, you can
* always make assembly-language go faster than a high-level language; this is
* news?
*/
static memoryMapEntry_t *mapCP = NULL, *mapTEXT = NULL, *mapTABLE,
*mapBLANKTABLE, *mapALPHATABLE, *mapSTRINGTABLE, *mapCOMMENTABLE;
char *s, *TEXT;
uint32_t TABLE;
if (mapCP == NULL)
{
mapCP = lookupVariable("CP");
mapTEXT = lookupVariable("TEXT");
mapTABLE = lookupVariable("SCANxSCAN_FINDS_END_OFxTABLE");
mapBLANKTABLE = lookupVariable("BLANKTABLE");
mapALPHATABLE = lookupVariable("ALPHATABLE");
mapSTRINGTABLE = lookupVariable("STRINGTABLE");
mapCOMMENTABLE = lookupVariable("COMMENTABLE");
}
TABLE = getFIXED(mapTABLE->address);
TEXT = descriptorToAscii(getCHARACTER(mapTEXT->address));
if (*TEXT == 0)
return fixedToBit(1, 0);
if (TABLE == mapBLANKTABLE->address)
{
for (s = TEXT; *s == ' ' || *s == '\t'; s++);
}
else if (TABLE == mapALPHATABLE->address)
{
for (s = TEXT;
isalnum(*s) || *s == '_' || *s == '@' || *s == '#' || *s == '$';
s++);
}
else if (TABLE == mapSTRINGTABLE->address)
{
for (s = TEXT; *s && *s != '\''; s++);
}
else if (TABLE == mapCOMMENTABLE->address)
{
for (s = TEXT; *s && *s != '*' && *s != '$'; s++);
}
else
abend("Requested translation table for SCAN_FINDS_END_OF not found");
if (*s == 0)
{
putFIXED(mapCP->address, s + 1 - TEXT);
return fixedToBit(1, 0);
}
putFIXED(mapCP->address, s - TEXT);
return fixedToBit(1, 1);
}
; // (1) CALL INLINE ( 27, 2, 2);
; // (2) CALL INLINE ( 25, 3, 2);
; // (3) CALL INLINE ( 7, 8, 12);
; // (4) CALL INLINE ( 141, 2, 0,0,8);
; // (5) CALL INLINE ( 136, 3, 0,0,8);
; // (6) CALL INLINE ( 88, 1, 0,TABLE);
; // (7) CALL INLINE ( 65, 12, 0,TRT);
; // (8) CALL INLINE ( 68, 2, 0,12,0);
; // (9) CALL INLINE ( 5, 12, 0);
; // (10) CALL INLINE ( 71, 7, 0,12,18);
; // (11) CALL INLINE ( 24, 1, 2);
; // (12) CALL INLINE ( 65, 0, 0,0,1);
; // (13) CALL INLINE ( 26, 1, 0);
; // (14) CALL INLINE ( 27, 3, 3);
; // (15) CALL INLINE ( 71, 15, 0,12,24);
; // (16) CALL INLINE ( 27, 1, 3);
; // (17) CALL INLINE ( 65, 3, 0,0,1);
; // (18) CALL INLINE ( 80, 1, 0,CP);
return 0;
}
{...} block, which is important if the CALL INLINE(...) happens to have been preceded by something like IF ... THEN or ELSE.
Let's start with the executive summary:
In this section, we revisit the problem of inline machine code discussed in the preceding section, describing a tool that may be of assistance in writing patch files for IBM 360 inline code. In brief, if you run XCOM-I with the command-line switch --guess=-1
, it will create a set of files (guess*.c) that are XCOM-I's best guess of what the patch-files (patch*.c) should be like. (You can also use --guess=N1,N2,N3,...
to generate just the files for specific patch numbers, whereas -1 generates every
patch.) You can then manually tweak these "guessed" files with
(hopefully!) much-reduced effort to get patch-files you're satisfied
with. There are three ways in which this guessing facility is
known to fail, and thus to require manual fixes:
That's the summary, which may be everything you actually need to know. Here's the longer explanation.
int32_t GR[16]; // CPU general-purpose registers.
double FR[16]; // CPU floating-point registers.
uint8_t CC; // CPU condition code register. Only the least two bits are significant.
Aside: If, unlike me, you're actually familiar with IBM 360 programming, I'm sure you'll call the CPU's general-purpose registers R0, R1, R2, and so on. IBM documentation calls them that. It also calls the floating-point registers, which are completely separate and independent of the general-purpose registers, by those very same names: R0, R1, R2, .... Perhaps it's just me, but I found that confusing. And because I was so stupid as to be confused by this apparently obvious and trivial point, it has cost me a considerable amount of time. Consequently, I now make a point of calling the general-purpose registers GRn and the floating-point registers FRn. Yes, it's a departure from tradition. No, in this particular case I don't care that I've departed from tradition. A foolish consistency is the hobgoblin of little minds.
Meanwhile, memory can always be accessed and/or converted from IBM 360 formats to native C formats, also via XCOM-I runtime-library features:
uint8_t memory[0x1000000]: for reading and writing 8-bit bytes from/to memory.
int32_t COREWORD(uint32_t address) and void COREWORD2(uint32_t address, int32_t value): for reading and writing 32-bit words from/to memory, respectively.
int16_t COREHALFWORD(uint32_t address) and void COREHALFWORD2(uint32_t address, int16_t value): for reading and writing 16-bit half-words from/to memory, respectively.
double fromFloatIBM(uint32_t msw, uint32_t lsw) and void toFloatIBM(uint32_t *msw, uint32_t *lsw, double value): for converting pairs of 32-bit words to/from C double-precision floats, respectively.
void setCC(void) and void setCCd(void): for adjusting the condition-code register after integer operations or floating-point operations, respectively.
Additional globals are provided by the runtime library for holding commonly-needed temporary results:
For floating-point operations, native C formats and arithmetical operators are used. Conversion from/to IBM hexadecimal format occurs only when floating-point values are read from the memory array or written into the memory array. This is why the CPU's floating-point registers are modeled as the Cint32_t address360A, address360B, dsw360, lsw360;
int64_t scratch; // Results of integer operations; input to setCC (see above).
double scratchd; // Results of floating-point operations; input to setCCd (see above).
int64_t dummy360; // Use for anything you like.
double dummy360d; // Use for anything you like. double epsilon360; // A small number, used for tolerance in some floating-point comparisons.
double
type. The alternative would have been to model them as pairs of uint32_t
values, and to provide C functions that directly implemented exact IBM
360 arithmetic (like addition, subtraction, and so on) directly upon
such pairs. I admit that wisdom of my decision is arguable, but
have no present plans to change the implementation.

Aside: Actually, there is one case in which this decision causes problems, and that involves the IBM 360 instructions which perform "unnormalized" floating-point arithmetic. Recall that floating-point numbers have two parts, the exponent and the significand. For example, in 3.14156×10^10, the exponent is 10 and the significand is 3.14156. Almost all floating-point arithmetic on the IBM 360 is in "normalized" form, which means that the exponent and significand are adjusted so as to produce the maximum precision (number of bits) in the significand, given the number of bytes of storage allocated for the number in memory and the nature of the format used for storage. Thus in principle, 3.14156×10^10 is more precise than 0.03142×10^12, and is less precise than 314.15654×10^8. In the case of IBM hexadecimal floating-point, the exponent field is 7 bits wide, represents a power of 16 (rather than 10), and is offset by 64. I.e., if the exponent field holds (say) E, then the number is scaled by 16^(E-64), so the powers represented are 16^-64 through 16^+63. The significand, meanwhile, is 56 bits wide and always positive, since there's a separate sign bit. So in principle, an IBM hexadecimal double-precision floating-point number could represent exact positive or negative 56-bit integers. That'll be important in a minute.

Supplementing the instructions for normalized arithmetic, the IBM 360 includes instructions for unnormalized addition and subtraction as well, which means they perform the indicated operations, but the results are taken as-is, without any normalization. Whereas the choice to use C double for all arithmetic in the XCOM-I framework means that all arithmetical results have maximum precision all the time.

Doesn't sound like a problem to you? Well, in practice CALL INLINE code uses a single unnormalized instruction, namely AW, which adds two floating-point numbers (normalized or unnormalized), producing an unnormalized result. It always does so in precisely the same manner, in order to perform the trick of converting a positive floating-point number (in IBM hexadecimal format) to a binary integer. In assembly language, that trick works as follows:
- Somewhere (in HAL/S-FC, it's in the floating-point working area DW(6) and DW(7), pointed to by the variable ADDR_FIXER) a representation of the number 0.0 is stored as the pair of words 0x4E000000,0x00000000. (Note that there's no unique representation of 0.0 in IBM hexadecimal floating point, because the exponent is indeterminate when the significand is 0, but this particular choice of exponent is important.)
- AW is used to perform an unnormalized addition of the ADDR_FIXER representation of 0.0 with a non-negative floating-point number that represents an integer.
- The unnormalized result of the addition has an exponent field that's 0x4E and a significand that's a non-negative 56-bit binary integer. Specifically, the least-significant word is the least-significant 32 bits of the integer.
- The least-significant word, being the desired FIXED data, is immediately stored someplace via the IBM 360 instruction STD for whatever further use the program has for that integer.

You probably don't care, but the reason this trick works is that IBM floating-point addition functions by unnormalizing the operand with the smaller exponent to match the exponent of the other operand, and then performing a binary addition of the significands. But no 56-bit integer can have an exponent in IBM hexadecimal format that's larger than 0x4E, so choosing the representation of 0.0 to have an exponent of 0x4E guarantees that it's the other operand that becomes unnormalized, and the way in which it's unnormalized is to be shifted rightward by exactly the right number of bits to turn into a binary representation of the integer.

Regardless of how it works, the reason this causes us a problem is that the native-C arithmetic we perform doesn't track any of this, so the AW is wasted, and doesn't produce an integer usable by the STD instructions that follow the AW instructions. In other words, after the AW occurs, the CALL INLINE for the STD instruction that's supposed to store the 2's-complement integer has no 2's-complement integer to store!

The workaround for this in the XCOM-I runtime library is that the floating-point registers are supplemented by an additional array that tracks whether the register should be treated as normalized or unnormalized, and then interprets the value stored in that floating-point register accordingly when an STD instruction is performed that stores the value in the register to memory:

  double FR[16];
  uint8_t unnormFR[16]; // 0 to treat FRn as normalized, 1 as unnormalized.

Thus an AW instruction on FR[n] sets unnormFR[n], while an STD instruction storing FR[n] resets unnormFR[n].

Yes, it's a bit of a hack, and if you were writing new CALL INLINE statements you could easily defeat it, but for right now it seems more economical than reworking the entire floating-point arithmetic system for a handful (fewer than 10) of uses of AW. My advice to you is this: If you're writing new XPL code, don't try to subvert my hack with perverted usage of the AW instruction in CALL INLINE statements; AW (and CALL INLINE, for that matter) is perverted enough already, and doesn't need your help.

Let's get the question of IBM 360 calling conventions out of the way. The conventional usage of the IBM 360 CPU's general registers in calling subroutines, as I understand it, is summarized in the bullet list below. But I've added a couple of apparently non-traditional extensions or differences (highlighted in green) that I've inferred from some of the CALL INLINE statements I've encountered in the legacy code; take them with a grain of salt!

As for how a translation from IBM 360 machine code to C uses the
various concepts described above, consider the very common statement CALL INLINE("58", n, v), where n is an integer 0-15 and v is the name of an XPL variable. This inline has IBM 360 opcode 0x58, which has the assembly-language mnemonic L, and means to load register GR[n] with the contents of variable v. Such an inline can be translated into C as:

  GR[n] = COREWORD(ADDR(v));

Or consider CALL INLINE("41", n, v), which has the mnemonic LA, to store the address of the variable into the register:

  GR[n] = ADDR(v);

Or CALL INLINE("50", n, v), mnemonic ST, to store the value of a register into a variable:

  COREWORD2(ADDR(v), GR[n]);

In the 19-instruction patch used in the preceding section as an example, 6 of the instructions used are already of one or the other of these three types. Other instructions may have much-more-complex implementations in C than these, of course.
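For readers who'd like to see those three one-liners actually run, here is a minimal, self-contained sketch in C. It is not XCOM-I's runtime library: the COREWORD and COREWORD2 bodies below are my own stand-ins (assuming only that words are stored big-endian, as on the IBM 360), and ADDR_V and ADDR_W are hypothetical fixed addresses standing in for what ADDR(v) would return for two XPL variables.

```c
#include <assert.h>
#include <stdint.h>

#define MEMORY_SIZE 0x1000000u   /* 16 MB, the size quoted in the text */

uint8_t memory[MEMORY_SIZE];     /* emulated IBM 360 memory */
int32_t GR[16];                  /* emulated general registers */

/* Stand-in for COREWORD: read a big-endian 32-bit word from memory. */
int32_t COREWORD(uint32_t address) {
    assert(address <= MEMORY_SIZE - 4u);
    uint32_t w = ((uint32_t)memory[address] << 24) |
                 ((uint32_t)memory[address + 1] << 16) |
                 ((uint32_t)memory[address + 2] << 8) |
                 (uint32_t)memory[address + 3];
    return (int32_t)w;           /* reinterpret the bits as 2's complement */
}

/* Stand-in for COREWORD2: write a big-endian 32-bit word to memory. */
void COREWORD2(uint32_t address, int32_t value) {
    assert(address <= MEMORY_SIZE - 4u);
    uint32_t w = (uint32_t)value;
    memory[address]     = (uint8_t)(w >> 24);
    memory[address + 1] = (uint8_t)(w >> 16);
    memory[address + 2] = (uint8_t)(w >> 8);
    memory[address + 3] = (uint8_t)w;
}

/* Hypothetical addresses of two XPL FIXED variables V and W. */
enum { ADDR_V = 0x1000, ADDR_W = 0x1004 };

/* Emulates:  L 5,V   then  LA 6,W   then  ST 5,W  */
void demo_load_store(void) {
    COREWORD2(ADDR_V, -2);        /* give V an initial value          */
    GR[5] = COREWORD(ADDR_V);     /* CALL INLINE("58", 5, V), i.e. L  */
    GR[6] = ADDR_W;               /* CALL INLINE("41", 6, W), i.e. LA */
    COREWORD2(ADDR_W, GR[5]);     /* CALL INLINE("50", 5, W), i.e. ST */
}
```

After calling demo_load_store(), GR[5] holds V's value, GR[6] holds W's address, and W in emulated memory holds a copy of V. The point of routing everything through COREWORD and COREWORD2 is that the byte order in the memory array stays that of the IBM 360 regardless of the byte order of the machine running the C code.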
Because most CALL INLINE statements (though unfortunately not all of them) can be handled using these simple maneuvers, XCOM-I has a built-in tool that can create guessed versions of patch-files for you, in which most of the INLINEs have been translated, but with the ones that were too difficult for XCOM-I to handle left in the draft files for you to manually perfect.

To use it, run XCOM-I with the --cond=P or --cond=B options, plus the --guess=... switch as I summarized it earlier. When doing so, in addition to the normal compilation outputs, files named guessN.c, guessNp.c, or guessNb.c are output as well. You can then make manual changes to the guess-files and rename them as patch-files when you're satisfied.

... CALL INLINEs, and thus in principle a total of 81 patch files. In generating guess-files for these programs, XCOM-I flags 11 of those guess-files with the comment ***FIXME***, which you can easily search for in order to fix the associated issue. Problems not detectable by --guess, though rare, unfortunately are neither marked nor easy to find or fix.

Not perfect, of course, but not bad either. By my reckoning, close to 90% of the guessed files seem to be perfect as-is. And of the files which are not perfect, probably around 90% of the CALL INLINEs translated in those files are still perfect.

Aside: Regarding the process of manually fixing up the EX instructions not supported by XCOM-I --guess, it's actually not that hard once you understand the trick involved. Admittedly, right now it will seem simpler to me than to you, given our relative positions on this particular learning curve.

An EX instruction is syntactically of the form

  EX R1,D2(X2,B2)

where R1 is the number of a CPU general register, and D2(X2,B2) points to an IBM 360 machine code instruction. While it looks confusing, it's always easy to figure out which instruction is targeted, because in the XPL source code it will be set off by itself and there will be no way to reach it in C to execute it directly. Sometimes, the XPL source code will contain a binary representation of that target instruction, which makes the task a lot harder for you, but in the XPL I've seen it will almost always be represented as a CALL INLINE, which makes it much easier. The reason it's easy is that XCOM-I --guess will have made a guessed patch file for it, and will have printed the assembly-language syntax and the proposed C translation of it for you in the patch file.

To make what I'm saying concrete, consider an example taken from PASS1 of HAL/S-FC. Here's a small excerpt of the XPL source-code file HALINCL/VMEM.xpl:

     ELSE DO;                                                  00102100
        ADDRTEMP = ADDR(MOVECHAR);                             00102200
        CALL INLINE("18",0,4);           /* LR 0,4 */          00102300
        CALL INLINE("58",2,0,INTO);      /* L 2,INTO */        00102400
        CALL INLINE("58",3,0,FROM);      /* L 3,FROM */        00102500
        CALL INLINE("48",1,0,LEGNTH);    /* LH 1,LEGNTH */     00102600
        CALL INLINE("06",1,0);           /* BCTR 1,0 */        00102700
        CALL INLINE("58",4,0,ADDRTEMP);  /* L 4,ADDRTEMP */    00102800
        CALL INLINE("44",1,0,4,0);       /* EX 1,0(0,4) */     00102900
        CALL INLINE("18",4,0);           /* LR 4,0 */          00103000
        RETURN;                                                00103100
     END;                                                      00103200
  END;                                                         00103300
  MOVECHAR:                                                    00103400
     CALL INLINE("D2",0,0,2,0,3,0);      /* MVC 0(0,2),0(3) */ 00103500

Those blocks of CALL INLINE statements correspond to patches 25 and 32 respectively. Thus XCOM-I --guess produces guessed patch files guess25p.c for lines 00102300-00103000 and guess32p.c for line 00103500. The latter is the target instruction for the EX instruction at line 00102900. It's visually obvious (since it's separated!), but you can tell it from reading the code too: Line 00102200 puts the address of the label MOVECHAR into the variable ADDRTEMP, while line 00102800 moves it into the CPU general register GR4, and then at line 00102900 the EX instruction says that the base register GR4 holds the address of the target instruction. Here's how XCOM-I has done its best in guess25p.c to translate the EX instruction,

  p25_24: ;
    // (31) CALL INLINE("44", 2, 0, 3, 0); /* EX 2,0(0,3) */
    address360B = GR[3] + 0;
    // Type RX, p. 7-74: EX 2,0(0,3)
    // ***FIXME*** Unsupported opcode EX

and in guess32p.c the target instruction,

  p32_0: ;
    // (32) CALL INLINE("D2", 0, 0, 1, 1, 1, 0); /* MVC 1(0,1),0(1) */
    address360A = GR[1] + 1;
    address360B = GR[1] + 0;
    // Type SS, p. 7-83: MVC 1(0,1),0(1)
    mvc(address360A, address360B, 0);

In fact, in my experience, most EX instructions have targeted MVC instructions, as this example does, so this is great for covering the typical case!

Now, the action of an EX instruction is to OR bits 8-15 of the target machine-code instruction with the low-order 8 bits of the general register the EX instruction specifies (which amounts to replacing those bits when, as here, they are 0 in the target), and then to execute that modified target instruction (without actually jumping to it). All you have to figure out is what in the C translation of the target instruction corresponds to bits 8-15 of the machine code. It will usually be something simple. For a target MVC instruction, I can tell you that it's just the final parameter of the mvc(...) function in the C translation. Thus, the change needed to the guessed file guess25p.c, in which the "***FIXME***" is replaced by actually-executable code, is this:

  p25_24: ;
    // (31) CALL INLINE("44", 2, 0, 3, 0); /* EX 2,0(0,3) */
    address360B = GR[3] + 0;
    // Type RX, p. 7-74: EX 2,0(0,3)
    address360A = GR[1] + 1;
    address360B = GR[1] + 0;
    // Type SS, p. 7-83: MVC 1(0,1),0(1)
    mvc(address360A, address360B, GR[2]);

Aside: As an example where XCOM-I fails to produce a perfect guessed result not due to an EX instruction, I can point to the single block of CALL INLINEs in the legacy program XCOM45. A number of its instructions involve use of the seemingly-uninitialized register GR12, as well as usage of the register GR3, that make sense (to me!) only if registers GR3 and GR12 are viewed as having a non-traditional conventional role in the subroutine linkage for calling their parent procedure (namely, SCAN_FINDS_END_OF): i.e., GR3 contains the return value of the procedure (rather than GR0 as documented by IBM), and GR12 contains the size of the procedure's code (other than the linkage code), in bytes.
As another example, I'd point to the one case I know of (in the INTEGER_VALUED procedure of program HAL/S-FC) in which a RETURN INLINE statement is used rather than a CALL INLINE statement. The INLINE XPL built-in is not specified in such a way as to return a value, but the calling procedure (INTEGER_VALUE) does check the returned value. Moreover, the INLINEs in INTEGER_VALUED do in fact manipulate the GR3 register in such a way as to contain either a 0 or a 1, without there being any other evident reason to do so, so it's easy to suppose once again that GR3 is indeed the return value.
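Coming back for a moment to the EX fix-up discussed earlier: the following self-contained C sketch emulates the EX-targets-MVC idiom from the VMEM.xpl excerpt (EX 1,0(0,4) executing MVC 0(0,2),0(3)). Treat it as an illustration under stated assumptions rather than as XCOM-I's code. In particular, the mvc() here is my own stand-in, which I assume takes the raw 8-bit length code from bits 8-15 of the instruction and, like the hardware, moves one byte more than that code; the addresses 0x2000 and 0x3000 are hypothetical values for FROM and INTO.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEMORY_SIZE 0x1000000u

uint8_t memory[MEMORY_SIZE];          /* emulated IBM 360 memory */
int32_t GR[16];                       /* emulated general registers */
uint32_t address360A, address360B;    /* scratch operand addresses */

/* Hypothetical stand-in for the runtime library's mvc(): moves
 * (len_code + 1) bytes from src to dest, one byte at a time and left
 * to right, which is how the MVC instruction itself behaves. */
void mvc(uint32_t dest, uint32_t src, uint32_t len_code) {
    uint32_t n = (len_code & 0xFFu) + 1u;
    assert(dest + n <= MEMORY_SIZE && src + n <= MEMORY_SIZE);
    for (uint32_t i = 0; i < n; i++) {
        memory[dest + i] = memory[src + i];
    }
}

/* Emulates the register setup of the excerpt, then the EX itself. */
void demo_ex_mvc(void) {
    memcpy(&memory[0x2000], "HAL/S", 5);  /* source string, 5 bytes     */
    GR[2] = 0x3000;                       /* L  2,INTO   (hypothetical) */
    GR[3] = 0x2000;                       /* L  3,FROM   (hypothetical) */
    GR[1] = 5;                            /* LH 1,LEGNTH                */
    GR[1] -= 1;                           /* BCTR 1,0: length minus 1   */

    /* EX 1,target, where the target is MVC 0(0,2),0(3): the target's
     * length code (0) is ORed with the low-order byte of GR1. */
    address360A = (uint32_t)GR[2] + 0;
    address360B = (uint32_t)GR[3] + 0;
    mvc(address360A, address360B, 0u | ((uint32_t)GR[1] & 0xFFu));
}
```

After demo_ex_mvc() runs, the five bytes at the hypothetical INTO address match the source string and the byte after them is untouched. The BCTR that precedes the EX in the legacy code is what turns a byte count into the length-minus-one code that MVC expects, which is why the register can be dropped straight into the final mvc() argument.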