(This page and ASM101S itself are under construction.)

Introduction
References
The Master Sequence Controller (MSC)
The Bus Control Elements (BCE)
ASM101S
A Brief Tour of the Assembly Process
Macro-Library Files
Installation of ASM101S
Where Are the AP-101 Assembly-Language Files
Potential Differences from Expectations
Structure of ASM101S, for Maintainers
Afterthought: HLASM

Introduction

ASM101S is our "modern" version of the macro assembler for the assembly language of the IBM AP-101S or AP-101B computers used as "General Purpose Computers" (GPC) in the Space Shuttle. To the best of my knowledge, the original assembler actually used for Shuttle software development has not survived, so ASM101S is an entirely-new creation. It is a Python 3 program that should work on any computer having the Python 3 language installed.

Regarding AP-101B vs AP-101S, the AP-101B was used in earlier Shuttle flights, while the AP-101S was used in later flights. I have almost no documentation for the AP-101B, but believe that insofar as software is concerned, the main difference is that a handful of new instruction types were made available in the AP-101S that hadn't existed in the AP-101B. Therefore, ASM101S itself makes no distinctions between the two languages: If you feed in source code using any of the newer instructions, you'll get object code suitable for the AP-101S only. If you feed in source code not having any of the newer instructions, you'll get object code suitable for both the AP-101S and AP-101B.

Aside: That's not to say that the differences were insignificant, but merely that the differences are transparent to the assembler. Here's a summary of some of the known differences:

Feature	AP-101B	AP-101S
Power:	780W	560W
Weight:	117 pounds	64 pounds
Memory:	104K words (416K bytes)	256K words (1024K bytes)
Memory Protection (per 16-bit half-word):	1 parity bit and 1 store-protect bit	6 ECC bits and 3 store-protect bits
Speed:	420K operations/second	>1000K operations/second
Battery backup:	n/a	Rechargeable NiCAD
Built-in test equipment:	n/a	Temperature; charger; battery; soft error counter
MTBF:	5K hours	24K hours
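
As a quick sanity check on the memory row, the byte counts in the table are consistent with four bytes (i.e., two 16-bit half-words) per word. The snippet below is merely illustrative arithmetic; the constant and function names are my own inventions, not anything from ASM101S.

```python
# Illustrative arithmetic only. The 4-bytes-per-word figure is an assumption
# inferred from the table entry "104K words (416K bytes)".
BYTES_PER_WORD = 4

def kwords_to_kbytes(kwords):
    """Convert a memory size in K words to a size in K bytes."""
    return kwords * BYTES_PER_WORD

print("AP-101B:", kwords_to_kbytes(104), "K bytes")
print("AP-101S:", kwords_to_kbytes(256), "K bytes")
```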

Because there's little distinction between the different models in a software sense, I'll generally just refer to the "AP-101" rather than specifying "AP-101S" or "AP-101B". Realize, though, that there were also models of the IBM AP-101 computer other than the AP-101B or AP-101S, for uses other than the Shuttle, and there's no particular reason to believe that ASM101S could assemble their source code without some updates.

Despite the lack of surviving documentation about the original AP-101 assembler, by examining the surviving AP-101 assembly-language source code I've concluded that the syntax of the AP-101 assembly language must have closely mimicked that of the standard macro assembler for the IBM System/360 Basic Assembly Language (BAL). This observation has been central to the development of ASM101S.

References

The Virtual AGC HAL/S compiler page
The Virtual AGC AP-101S linker page
"Space Shuttle Model AP-101S Principles of Operations with Shuttle Instruction Set". Manual for the CPU instruction set. Also known as the "POO" (Principles of Operation).
"OS Assembler Language - OS Release 21". Manual for the System/360 macro assembler upon which ASM101S is modeled.
A collection of notes about the AP-101S/AP-101B.
"AP 101S Assembly Language Introduction".
"Space Shuttle Advanced System/4 Pi Input/Output Processor (IOP)":

Parts I and II: Principles of Operation for PCI/PCO and MSC.
Part III: Principles of Operation for BCE.

"Space Shuttle Advanced System/4 Pi Model AP-101 Central Processor Unit Technical Description". Note that this predates the AP-101S model and doesn't precisely match it.

The Master Sequence Controller (MSC)

AP-101 assembly language includes not only pseudo-ops and AP-101 CPU instructions as such, but also instructions for the Master Sequence Controller (MSC). You might categorize the MSC as being a separate computer within the main computer. The CPU and MSC instructions, for these two essentially separate types of computers, share memory and are intermixed indiscriminately in AP-101 assembly-language source code. The POO explains the MSC thusly:

"The Master Sequence Controller (MSC) is a micro programmed computer specifically tailored for I/O Management within the Space Shuttle General Purpose Computer (GPC). As such, it has extensive and programmable capabilities for monitoring and controlling the basic I/O operations performed by upwards to 24 Bus Control Elements (BCE's) which are implemented in the baseline GPC. These capabilities include setting up, scheduling, and initiating BCE programs, monitoring the status of BCD operations, and communicating overall completion of these operations to the CPU."

MSC instructions can be distinguished from CPU instructions in that they begin with the character "@".

See Appendix II of the AP-101S POO or the seemingly-identical Part II of the IOP POO for more information.

The Bus Control Elements (BCE)

Similarly, the POO tells us that

"The Bus Control Element (BCE) is a microprogrammed controller specifically tailored for management of I/O traffic on one of the Space Shuttle system busses. Within each IOP [Input/Output Processor] there is one BCE for each system bus, for a total of 24 BCE's. Each of these BCE's is capable of independent program execution, data buffering to and from memory, and communication with the MSC."

Or in other words, besides the MSC discussed in the preceding section, there are 24 additional processors within the AP-101, yet again with their own distinct instruction set, yet again sharing memory and intermixed in the assembly-language source code with CPU instructions and MSC instructions.

BCE instructions can be distinguished from CPU and MSC instructions in that they begin with the character "#".
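
The prefix rule just described for MSC and BCE instructions could be sketched in Python roughly as follows. This is only an illustration of the rule, not code taken from ASM101S; the function name is hypothetical, and the "@DEMO" and "#DEMO" mnemonics in the usage lines are made-up placeholders rather than real MSC or BCE instructions.

```python
def classify_mnemonic(mnemonic):
    """Guess the processor an operation belongs to from its first character.

    Anything not starting with "@" or "#" is lumped together here, which
    includes CPU instructions but also pseudo-ops and macro invocations.
    """
    if mnemonic.startswith("@"):
        return "MSC"
    if mnemonic.startswith("#"):
        return "BCE"
    return "CPU or other"

# "@DEMO" and "#DEMO" are placeholders invented for this illustration.
for op in ("SR", "@DEMO", "#DEMO"):
    print(op, "->", classify_mnemonic(op))
```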

See Appendix III of the AP-101S POO or the seemingly-identical Part III of the IOP POO for more information.

ASM101S

(To be clear, ASM101S is not entirely functional. I am simply documenting it as I proceed with development. Therefore, some of what's described might not be available yet, though of course where I show specific examples of output from the assembler, it is functional enough to produce the results shown.)

The modern assembler is invoked as

ASM101S.py [OPTIONS] SOURCE1.asm SOURCE2.asm ...

Some of the more-significant available OPTIONS are:

--library=LIBRARY specifies a path to a folder containing files of macro definitions needed for processing the source-code files. You'll see more on this topic later on. Though "optional", there are very few (if any) legacy assembly-language programs that you'll be able to assemble without it.
--sysparm=BFS tells the assembler that a portion of the Space Shuttle's "backup flight software" is being assembled. By default, the assembler assumes instead that you're assembling a portion of the primary flight software. If you're assembling non-Shuttle software, it doesn't really matter which is which. Note that the "BFS" here is literal.
--object=OBJECT.obj specifies the name of the object-code file generated by the assembler. If this option isn't present, OBJECT will match the name of the last source-code file specified on the command line.
--compare=FILENAME specifies the name of an existing assembly-listing file against which you wish to compare the results of the current assembly. Byte-by-byte comparison of the generated object code is performed, and mismatches are noted directly in the assembly-listing produced by the assembler. This feature is most-useful when contemporary assembly listings are available, as is the case (for example) for all of the modules in the AP-101S runtime library of the HAL/S compiler (HAL/S-FC).
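
To make the default for --object concrete, here is a minimal sketch of how such a default name might be derived. I'm assuming that "match the name of the last source-code file" means the same base name with an .obj extension, and that any folder portion of the path is dropped; the helper name is hypothetical and this is not necessarily how ASM101S does it internally.

```python
from pathlib import Path

def default_object_name(sources):
    """Hypothetical helper: derive OBJECT.obj from the last source file named.

    Assumes the default keeps the base name, swaps the extension for .obj,
    and discards any leading folders.
    """
    return Path(sources[-1]).with_suffix(".obj").name

print(default_object_name(["SOURCE1.asm", "SOURCE2.asm"]))
```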

Since you are allowed to specify multiple assembly-language source-code files on the command line, it's possible in principle to assemble an entire program at once. In practice, it would be more usual to assemble each source-code file separately, and then to use a linker to merge all of those object-code files into a single AP-101S load file for the program as a whole.

You can see a list (and explanations!) of all of the available command-line OPTIONS by using the command

ASM101S.py --help

The action of the assembler is roughly the following:

(Optionally.) Load the macro definitions from the assembly-language source-code files (*.asm) found in the LIBRARY folder. (More on this is in the next section.)
Load all of the SOURCEx.asm files specified on the command line, in the order that they are specified. Macro definitions should precede any source-code files using those macros, but the ordering of macro files among themselves is not significant.
Process the entire set of source code which has been loaded.
Output an assembly report (on stdout) and if there were no fatal errors, an AP-101 object-code file that contains the results of the assembly process.
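
As a rough illustration of the macro-loading steps above, the following sketch scans source lines for macro definitions and collects the macro names. It is not the actual ASM101S code. It assumes the System/360 assembler conventions that a definition begins with a MACRO statement, is followed by a prototype statement whose operation field is the macro's name, and ends with MEND; that a statement occupies columns 1 through 71; and that a line beginning with "*" or ".*" is a comment. Both function names are hypothetical.

```python
def operation_field(line):
    """Return the operation field of a source line, or None for comments/blanks."""
    text = line[:71].rstrip()  # assumed: columns 72-80 are not statement text
    if not text or text.startswith("*") or text.startswith(".*"):
        return None
    tokens = text.split()
    if text[0] == " ":
        return tokens[0]          # no name field: first token is the operation
    return tokens[1] if len(tokens) > 1 else None

def find_macro_names(lines):
    """Collect the names of the macros defined in a list of source lines."""
    names = []
    inside_macro = False
    expecting_prototype = False
    for line in lines:
        op = operation_field(line)
        if op is None:
            continue
        if expecting_prototype:
            names.append(op)      # the prototype's operation is the macro name
            expecting_prototype = False
        elif not inside_macro and op == "MACRO":
            inside_macro = True
            expecting_prototype = True
        elif inside_macro and op == "MEND":
            inside_macro = False
    return names

sample = [
    "         MACRO",
    "         WORKAREA",
    "SAVE6    DS    D              TO SAVE REGISTERS F6,F7",
    "SWITCH   DS    F              TO SAVE R4 ACROSS INTRINSIC CALL",
    "         MEND",
    "ACOS     AMAIN ACALL=YES",
]
print(find_macro_names(sample))
```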

Recognize that at least as far as Space Shuttle flight software is concerned, the full flight software consists of a large number of assembly language modules, along with even more modules written in the high-level language HAL/S. So as noted above in a slightly-different way, the object file emitted by the assembler is unlikely to be a full executable program — or load file in this particular jargon — but rather just one of many object files which need to be linked by an AP-101 linker utility before becoming an executable program.

Beyond the assembler itself, there is also a bash shell script that performs a kind of regression test on ASM101S. (The script is useful in Linux or Mac OS, but usually not in Windows, since the bash command shell isn't commonly available in Windows without extra effort on your part.) It assembles legacy AP-101S assembly-language source-code files for which a contemporary assembly listing made by the original AP-101S assembler has survived, and performs a byte-by-byte comparison of ASM101S's assemblies vs the original assemblies. It is run simply as

regressionASM101S.sh

At this writing, 205 source-code files are included in the regression test, comprising around 17K lines of source code. Yes, that's a relatively small test, but I'm not responsible for the lack of surviving test material. Unfortunately, I cannot claim that ASM101S is a fast program, so this process is not as fast as I might hope, nor are status messages provided as long as the tests are successful, so you may find yourself confused that nothing appears to be happening. If you run the test instead as

regressionASM101S.sh -v

it will at least display the names of the files it is checking.
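
The idea of a byte-by-byte comparison can be sketched in a few lines of Python. This is just an illustration of the concept under my own assumptions, not the logic actually used by ASM101S or by the regression script, and the function name is hypothetical.

```python
def find_mismatches(expected, actual):
    """Return the offsets at which two byte strings differ.

    Offsets present in only one of the two inputs count as mismatches too.
    """
    longest = max(len(expected), len(actual))
    return [
        i for i in range(longest)
        if i >= len(expected) or i >= len(actual) or expected[i] != actual[i]
    ]

original = bytes.fromhex("0CE4B2E48000")
rebuilt = bytes.fromhex("0CE4B2E58000")
print(find_mismatches(original, rebuilt))
```

An empty list would mean that the two assemblies agree.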

A Brief Tour of the Assembly Process

As a more-concrete example than the abstract description in the preceding section, consider assembly of the "ACOS" module from the AP-101S runtime library of the HAL/S compiler. We'll learn more details about the runtime library and what's required for assembling it in the sections that follow. For our purposes at the moment, I'll content myself with saying that if you follow the installation instructions (given later) you can perform this assembly with these steps:

cd "virtualagc/yaShuttle/Source Code/PASS.REL32V0/RUNASM"
ASM101S.py --library=../RUNMAC ACOS.asm

(Or at least, that's how you'd do it in Linux or Mac OS. In Windows you'd have to use '\' in place of '/'.)

Here's what the AP-101S assembly-language source code for the ACOS module looks like prior to assembly:

 TITLE 'ACOS -- SINGLE PRECISION INVERSE SINE-COSINE FUNCTION'          00000100
*
* WHEN DATA_REMOTE IS IN EFFECT, THE LDM INSTRUCTIONS ARE NEEDED
* AROUND THE CALL TO THIS RTL ROUTINE BECAUSE R1/R3 IS USED AS
* A BASE REGISTER. (CR12620)
*
*  ACOS:  INVERSE SINE-COSINE(SINGLE)                                   00000200
*                                                                       00000300
*        1. INPUT AND OUTPUT VIA F0.                                    00000400
*        2. FOR 0<=X<=1/2, COMPUTE ASIN(X) BY A RATIONAL FUNCTION.      00000500
*        3. FOR 1/2<X<=1, USE ASIN(X)=PI/2-2*ASIN(SQRT((1-X)/2)).       00000600
*        4. FOR X<0, USE ASIN(X)=-ASIN(-X).                             00000700
*        5. ACOS(X)=PI/2-ASIN(X).                                       00000800
*        6. ERROR GIVEN IF |X|>1.                                       00000900
*                                                                       00001000
*IBM CR #SS8244- RETURN PI FOR ARCCOS ARGUMENT < -1; 8/14/81   A.CLOSE  00001010
* REGISTER R4 IS USED AS A SWITCH-                                      00001100
*   BIT 0 IS USED TO SIGNAL WHETHER THIS PROCEDUER IS                   00001200
*   CALLED AS ACOS(ON) OR ASIN(OFF). BIT 1 IS USED TO SIGNAL            00001300
*   IF ARG. IS POS.(OFF) OR NEG.(ON).BIT 2 IS USED TO SIGNAL            00001400
*   ABS. VALUE OF ARG.,IF GREATER THAN 1/2(ON) ELSE (OFF)               00001500
*                                                                       00001600
         MACRO                                                          00001700
         WORKAREA                                                       00001800
SAVE6    DS    D              TO SAVE REGISTERS F6,F7                   00001900
SWITCH   DS    F              TO SAVE R4 ACROSS INTRINSIC CALL          00002000
         MEND                                                           00002100
ACOS     AMAIN ACALL=YES                                                00002200
*COMPUTES ARC-COSINE(X) OF SINGLE PRECISION SCALAR                      00002300
         INPUT F0             SCALAR SP                                 00002400
         OUTPUT F0            SCALAR SP RADIANS                         00002500
         WORK  R1,R2,R3,R4,R5,R6,R7,F1,F2,F3,F4,F6                      00002600
         SR    R4,R4          SIGNAL ACOS ENTRY                         00002700
         OHI   R4,X'8000'                                               00002800
         B     MERGE                                                    00002900
ASIN     AENTRY                                                         00003000
* COMPUTES ARC-SINE(X) OF SINGLE PRECISION SCALAR                       00003100
         INPUT F0             SCALAR SP                                 00003200
         OUTPUT F0            SCALAR SP RADIANS                         00003300
         WORK  R2,R3,R4,F2,F3,F4,F6                                     00003400
         SR    R4,R4          SIGNAL ASIN ENTRY                         00003500
*                                                                       00003600
MERGE    STED  F6,SAVE6       SAVE REGISTERS F6,F7                      00003700
         LER   F6,F0          GET ARGUMENT IN F6                        00003800
         BNM   POS            AND TEST SIGN                             00003900
         OHI   R4,X'4000'     SIGNAL NEG. ARG.                          00004000
         LECR  F0,F0          GET |X|                                   00004100
*                                                                       00004200
POS      CE    F0,HALF        SKIP TO 'SMALL' IF                        00004300
         BNH   SMALL          |X|<=1/2                                  00004400
*                                                                       00004500
*  Z=SQRT((1-|X|)/2) IF |X|^>|1/2                                       00004600
*                                                                       00004700
         OHI   R4,X'2000'     SIGNAL |X|>1/2                            00004800
         LECR  F0,F0                                                    00004900
         AE    F0,ONE         1-|X| IN F0                               00005000
         BNP   EXPRESS        BRANCH IF |X|>=1                          00005100
*                                                                       00005200
         LER   F6,F0                                                    00005300
         ME    F6,HALF        Z**2 IN F6                                00005400
         AER   F0,F0          4*Z**2 IN F0                              00005500
*                                                                       00005600
         ST    R4,SWITCH      SAVE R4 ACROSS INTRINSIC CALL             00005700
         ABAL  SQRT           GET 2Z IN F0 BY CALL TO                   00005800
         L     R4,SWITCH      RESTORE R4                                00005900
         B     POLY           SQRT, AND RETURN TO POLY                  00006000
*                                                                       00006100
SMALL    CE    F0,UNFLO                                                 00006200
         BNH   TESTS                                                    00006300
         MER   F6,F6          X**2 IN F6                                00006400
*                                                                       00006500
* COMPUTE ASIN(Z) HERE BY RATIONAL FUNCTION                             00006600
*                                                                       00006700
POLY     LE    F4,C2                                                    00006800
         AER   F4,F6                                                    00006900
         LE    F2,D2                                                    00007000
         DER   F2,F4          (C2+Z**2)/D2                              00007100
         AE    F2,C1                                                    00007200
         AER   F2,F6          C1+Z**2+(C2+Z**2)/D2                      00007300
         ME    F6,D1                                                    00007400
         DER   F6,F2                                                    00007500
         LE    F3,ROUND       ROUNDING NUMBER IN F3,                    00007600
         LER   F2,F0          AS LOW HALF OF 2Z.                        00007700
         MER   F0,F6                                                    00007800
         AEDR  F0,F2          ROUNDING OCCURS HERE                      00007900
*                                                                       00008000
*  REVERSE REDUCTIONS TO GIVE ACTUAL ANSWER                             00008100
*                                                                       00008200
TESTS    TRB    R4,X'A000'  IF ACOS ENTRY                               00008300
         BNM   TSTNEG         OR |X|>1/2 BUT NOT BOTH,                  00008400
         LECR  F0,F0          SUBTRACT RESULT                           00008500
         AED   F0,PIOV2       FROM PI/2                                 00008600
*                                                                       00008700
TSTNEG   TRB    R4,X'4000'  COMPLEMENT                                  00008800
         BNO   EXIT           IF ARGUMENT WAS NEGATIVE.                 00008900
         LER   F0,F0          CHECK FOR ARG=0(HARDWARE BUG)             00009000
         BZ    NOLECR                                                   00009100
         LECR  F0,F0                                                    00009200
NOLECR   TRB   R4,X'C000'     AND ADD PI TO                             00009300
         BNO   EXIT           RESULT IF ACOS ENTRY                      00009400
         AE    F0,PI          AND ARGUMENT<0.                           00009500
*                                                                       00009600
EXIT     LED   F6,SAVE6       RESTORE F6,F7                             00009700
         AEXIT                AND EXIT                                  00009800
*                                                                       00009900
EXPRESS  BZ    TESTS          SKIP POLYNOMIAL IF |X|=1                  00010000
         AERROR 10            |ARG|>1                                   00010100
         LE    F0,PIOV2                                                 00010200
         TRB   R4,X'8000'     ASIN CALL?                                00010300
         BZ    TSTNEG         YES, CHECK SIGN                           00010400
         TRB   R4,X'4000'     X=-1?                                     00010500
         BZ    RET0           NO,RETURN 0                               00010501
         LE    F0,PI          YES, RETURN PI                            00010502
         B     EXIT                                                     00010503
RET0     SER   F0,F0                                                    00010504
         B     EXIT                                                     00010600
*                                                                       00010700
         DS    0F                                                       00010800
PI       DC    X'413243F7'    PI                                        00010900
PIOV2    DC    X'411921FB'    PI/2                                      00011000
ROUND    DC    X'5FFFFFFF'                                              00011100
UNFLO    DC    X'3E100000'     16**(-3)                                 00011200
ONE      DC    X'41100000'    1.0                                       00011300
HALF     DC    X'40800000'    0.5                                       00011400
D1       DC    X'C08143C7'    -0.5049404                                00011500
C1       DC    X'C13B446A'    -3.7042025                                00011600
D2       DC    X'C11406BF'    -1.2516474                                00011700
C2       DC    X'C11DB034'    -1.8555182                                00011800
         ACLOSE                                                         00011900
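
As an aside, the hexadecimal constants at the end of the listing above can be decoded with a few lines of Python. The sketch assumes that these single-precision values use the System/360-style hexadecimal floating-point layout (1 sign bit, a 7-bit excess-64 power-of-16 exponent, and a 24-bit fraction), which is at least consistent with the source-code comments for ONE, HALF, and UNFLO. The function name is my own.

```python
import math

def decode_hex_float(word):
    """Decode a 32-bit hexadecimal floating-point word (assumed S/360 layout)."""
    sign = -1.0 if word & 0x80000000 else 1.0
    exponent = ((word >> 24) & 0x7F) - 64            # power of 16, excess-64
    fraction = (word & 0x00FFFFFF) / float(1 << 24)  # 24-bit fraction in [0, 1)
    return sign * fraction * 16.0 ** exponent

for name, word in (("PI", 0x413243F7), ("ONE", 0x41100000),
                   ("HALF", 0x40800000), ("D1", 0xC08143C7)):
    print(name, decode_hex_float(word))
```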

Simple, right? Well ... perhaps not so much! In fact, it's more complex than it appears at first sight. You'll notice that strewn throughout the source code are various "instructions" that I've highlighted in red to make them stand out. These are actually invocations of "macros" (more on this below), each of which may be expanded into multiple lines of source code at assembly-time. Or perhaps not: The INPUT macro, in this example, turns out to "expand" to nothing at all. In contrast, others of the macros may themselves invoke other macros, which may in turn invoke other macros, and so on. For example, the ACLOSE macro appearing at the bottom of the listing above invokes the macro ERRPARMS. The macros themselves (in this example) provide the machinery needed to interface assembly-language subroutines to HAL/S code calling those subroutines, but macros can serve many other purposes in other source-code files.

Not that we're in a position to understand it fully without a lot of study, but here's what the source code for that very same macro ACLOSE looks like:

         MACRO                                                          00000100
         ACLOSE                                                         00000200
         GBLA  &ENTCNT                                                  00000300
         GBLB  &INPUT(20),&OUTPUT(20)                                   00000400
         GBLC  &NAMES(20)                                               00000500
         ERRPARMS                                                       00000600
&I       SETA  1                                                        00000700
.LOOP    AIF   (&INPUT(&I)).INOK                                        00000800
         MNOTE 1,'INPUT NOT SPECIFIED FOR &NAMES(&I)'                   00000900
.INOK    AIF   (&OUTPUT(&I)).OUTOK                                      00001000
         MNOTE 1,'OUTPUT NOT SPECIFIED FOR &NAMES(&I)'                  00001100
.OUTOK   ANOP                                                           00001200
&I       SETA  &I+1                                                     00001300
         AIF   (&I LE &ENTCNT).LOOP                                     00001400
         END                                                            00001500
         MEND                                                           00001600

I've highlighted this entirely in red because even though many macro definitions include actual AP-101S instructions and pseudo-ops, this particular macro definition consists entirely of statements in the macro language. About the only thing here that's immediately understandable — and only because I mentioned it earlier! — is that ERRPARMS is the invocation of another macro.
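
For readers more comfortable with Python than with the macro language, my reading of the loop in ACLOSE is roughly the following. Treat it as an interpretation rather than a definitive translation: I'm assuming that SETA assigns an arithmetic variable, that AIF branches to the given sequence symbol when its condition is true, and that MNOTE emits a message. The ERRPARMS invocation and the END statement are omitted, and the function and parameter names are hypothetical.

```python
def aclose_checks(entcnt, input_flags, output_flags, names):
    """Mimic ACLOSE's loop: report entry points lacking INPUT or OUTPUT.

    The lists are indexed from 1, like the macro's arrays; index 0 is unused.
    As in the macro, the test against entcnt comes at the bottom of the loop,
    so entry 1 is always checked.
    """
    notes = []
    i = 1                                   # &I SETA 1
    while True:
        if not input_flags[i]:              # AIF (&INPUT(&I)).INOK
            notes.append(f"INPUT NOT SPECIFIED FOR {names[i]}")
        if not output_flags[i]:             # AIF (&OUTPUT(&I)).OUTOK
            notes.append(f"OUTPUT NOT SPECIFIED FOR {names[i]}")
        i += 1                              # &I SETA &I+1
        if not i <= entcnt:                 # AIF (&I LE &ENTCNT).LOOP
            break
    return notes

# Made-up state: two entry points, with OUTPUT missing for the second.
print(aclose_checks(2, [False, True, True], [False, True, False],
                    ["", "ACOS", "ASIN"]))
```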

Now that we recognize that the assembly listing of ACOS may not look too much like the original source code, the following is an excerpt from the assembly report produced when ASM101S assembles ACOS. In the interest of saving a little space, I've removed such assembler-generated items as the symbol table. At the same time, I've highlighted the macro expansions in red:

         ACOS -- SINGLE PRECISION INVERSE SINE-COSINE FUNCTION                                                 PAGE    2
  LOC  OBJECT CODE   ADR1 ADR2      SOURCE STATEMENT                                               ASM101S 0.00 10/20/24
                                  2 *                                                                               
                                  3 * WHEN DATA_REMOTE IS IN EFFECT, THE LDM INSTRUCTIONS ARE NEEDED                
                                  4 * AROUND THE CALL TO THIS RTL ROUTINE BECAUSE R1/R3 IS USED AS                  
                                  5 * A BASE REGISTER. (CR12620)                                                    
                                  6 *                                                                               
                                  7 *  ACOS:  INVERSE SINE-COSINE(SINGLE)                                   00000200
                                  8 *                                                                       00000300
                                  9 *        1. INPUT AND OUTPUT VIA F0.                                    00000400
                                 10 *        2. FOR 0<=X<=1/2, COMPUTE ASIN(X) BY A RATIONAL FUNCTION.      00000500
                                 11 *        3. FOR 1/2<X<=1, USE ASIN(X)=PI/2-2*ASIN(SQRT((1-X)/2)).       00000600
                                 12 *        4. FOR X<0, USE ASIN(X)=-ASIN(-X).                             00000700
                                 13 *        5. ACOS(X)=PI/2-ASIN(X).                                       00000800
                                 14 *        6. ERROR GIVEN IF |X|>1.                                       00000900
                                 15 *                                                                       00001000
                                 16 *IBM CR #SS8244- RETURN PI FOR ARCCOS ARGUMENT < -1; 8/14/81   A.CLOSE  00001010
                                 17 * REGISTER R4 IS USED AS A SWITCH-                                      00001100
                                 18 *   BIT 0 IS USED TO SIGNAL WHETHER THIS PROCEDUER IS                   00001200
                                 19 *   CALLED AS ACOS(ON) OR ASIN(OFF). BIT 1 IS USED TO SIGNAL            00001300
                                 20 *   IF ARG. IS POS.(OFF) OR NEG.(ON).BIT 2 IS USED TO SIGNAL            00001400
                                 21 *   ABS. VALUE OF ARG.,IF GREATER THAN 1/2(ON) ELSE (OFF)               00001500
                                 22 *                                                                       00001600
                                 23          MACRO                                                          00001700
                                 24          WORKAREA                                                       00001800
                                 25 SAVE6    DS    D              TO SAVE REGISTERS F6,F7                   00001900
                                 26 SWITCH   DS    F              TO SAVE R4 ACROSS INTRINSIC CALL          00002000
                                 27          MEND                                                           00002100
                                 28 ACOS     AMAIN ACALL=YES                                                00002200
                                 29+*********************************************************************** 01-AMAIN
                                 30+*                                                                       01-AMAIN
                                 31+*        PRIMARY ENTRY POINT                                            01-AMAIN
                                 32+*                                                                       01-AMAIN
                                 33+*********************************************************************** 01-AMAIN
00000                            34+ACOS     CSECT                                                          01-AMAIN
00000                            35+STACK    DSECT                                                          01-AMAIN
                                 36+*        DS    18H            STANDARD STACK AREA DEFINITION            01-AMAIN
00000                            37+         DS    F              PSW (LEFT HALF)                           01-AMAIN
00002                            38+         DS    2F             R0,R1                                     01-AMAIN
00006                            39+ARG2     DS    F              R2                                        01-AMAIN
00008                            40+         DS    F              R3                                        01-AMAIN
0000A                            41+ARG4     DS    F              R4                                        01-AMAIN
0000C                            42+ARG5     DS    F              R5                                        01-AMAIN
0000E                            43+ARG6     DS    F              R6                                        01-AMAIN
00010                            44+ARG7     DS    F              R7                                        01-AMAIN
                                 45+*        END OF STANDARD STACK AREA                                     01-AMAIN
00012                            46+SAVE6    DS    D              TO SAVE REGISTERS F6,F7                   02-WORKAREA
00016                            47+SWITCH   DS    F              TO SAVE R4 ACROSS INTRINSIC CALL          02-WORKAREA
00018                            48+STACKEND DS    0F             END OF COMBINED STACK AREA                01-AMAIN
00000                            49+ACOS     CSECT                                                          01-AMAIN
                                 50+         USING STACK,0        ADDRESS STACK AREA                        01-AMAIN
00000 E0FB 0018      0018        51+         IAL   0,STACKEND-STACK SET STACK SIZE                          01-AMAIN
00002 B624 0000      0009 0000   52+         NIST  9(0),0         CLEAR ON ERROR INFO (LCL DATA PTR)        01-AMAIN
                                 54 *COMPUTES ARC-COSINE(X) OF SINGLE PRECISION SCALAR                      00002300
                                 55          INPUT F0                                                       00002400
                                 56+F0       EQU   0                                                        01-INPUT
                                 58          OUTPUT F0                                                      00002500
                                 60          WORK  R1,R2,R3,R4,R5,R6,R7,F1,F2,F3,F4,F6                      00002600
                                 61+R1       EQU   1                                                        01-WORK
                                 62+R2       EQU   2                                                        01-WORK
                                 63+R3       EQU   3                                                        01-WORK
                                 64+R4       EQU   4                                                        01-WORK
                                 65+R5       EQU   5                                                        01-WORK
                                 66+R6       EQU   6                                                        01-WORK
                                 67+R7       EQU   7                                                        01-WORK
                                 68+F1       EQU   1                                                        01-WORK
                                 69+F2       EQU   2                                                        01-WORK
                                 70+F3       EQU   3                                                        01-WORK
                                 71+F4       EQU   4                                                        01-WORK
                                 72+****** WARNING: F6 MUST BE PRESERVED ACROSS CALLS                       01-WORK
                                 73+F6       EQU   6                                                        01-WORK
00004 0CE4                       75          SR    R4,R4          SIGNAL ACOS ENTRY                         00002700
00005 B2E4 8000           8000   76          OHI   R4,X'8000'                                               00002800
00007 DF14           000D 0005   77          B     MERGE                                                    00002900
                                 78 ASIN     AENTRY                                                         00003000
                                 79+*********************************************************************** 01-AENTRY
                                 80+*                                                                       01-AENTRY
                                 81+*        SECONDARY ENTRY POINT                                          01-AENTRY
------------------------------------------------------------------------------------------------------------------------
         ACOS -- SINGLE PRECISION INVERSE SINE-COSINE FUNCTION                                                 PAGE    3
  LOC  OBJECT CODE   ADR1 ADR2      SOURCE STATEMENT                                               ASM101S 0.00 10/20/24
                                 82+*                                                                       01-AENTRY
                                 83+*********************************************************************** 01-AENTRY
00008                            84+ASIN     DS    0H                                                       01-AENTRY
                                 85+         ENTRY ASIN                                                     01-AENTRY
00008 E0FB 0018      0018        86+         IAL   0,STACKEND-STACK SET STACK SIZE                          01-AENTRY
0000A B624 0000      0009 0000   87+         NIST  9(0),0         CLEAR ERROR VECTOR POINTER                01-AENTRY
                                 89 * COMPUTES ARC-SINE(X) OF SINGLE PRECISION SCALAR                       00003100
                                 90          INPUT F0                                                       00003200
                                 92          OUTPUT F0                                                      00003300
                                 94          WORK  R2,R3,R4,F2,F3,F4,F6                                     00003400
                                 95+****** WARNING: F6 MUST BE PRESERVED ACROSS CALLS                       01-WORK
0000C 0CE4                       97          SR    R4,R4          SIGNAL ASIN ENTRY                         00003500
                                 98 *                                                                       00003600
0000D 3EF8 0012      0012        99 MERGE    STED  F6,SAVE6       SAVE REGISTERS F6,F7                      00003700
0000F 7EE0                      100          LER   F6,F0          GET ARGUMENT IN F6                        00003800
00010 DD0C           0014 0003  101          BNM   POS            AND TEST SIGN                             00003900
00011 B2E4 4000           4000  102          OHI   R4,X'4000'     SIGNAL NEG. ARG.                          00004000
00013 78E8                      103          LECR  F0,F0          GET |X|                                   00004100
                                104 *                                                                       00004200
00014 48FF 0054      006A 0054  105 POS      CE    F0,HALF        SKIP TO 'SMALL' IF                        00004300
00016 DE3C           0026 000F  106          BNH   SMALL          |X|<=1/2                                  00004400
                                107 *                                                                       00004500
                                108 *  Z=SQRT((1-|X|)/2) IF |X|^>|1/2                                       00004600
                                109 *                                                                       00004700
00017 B2E4 2000           2000  110          OHI   R4,X'2000'     SIGNAL |X|>1/2                            00004800
00019 78E8                      111          LECR  F0,F0                                                    00004900
0001A 50F7 004C      0068 004C  112          AE    F0,ONE         1-|X| IN F0                               00005000
0001C DEC8           004F 0032  113          BNP   EXPRESS        BRANCH IF |X|>=1                          00005100
                                114 *                                                                       00005200
0001D 7EE0                      115          LER   F6,F0                                                    00005300
0001E 66F7 004A      006A 004A  116          ME    F6,HALF        Z**2 IN F6                                00005400
00020 50E0                      117          AER   F0,F0          4*Z**2 IN F0                              00005500
                                118 *                                                                       00005600
00021 342C           0016       119          ST    R4,SWITCH      SAVE R4 ACROSS INTRINSIC CALL             00005700
                                120          ABAL  SQRT                                                     00005800
                                121+         EXTRN #QSQRT                                                   01-ABAL
00022 E4F7                      122+         DC    X'E4F7'          CALL INTRINSIC INDIRECTLY               01-ABAL
00023 3800                      123+         DC    Y(#QSQRT+X'3800')  THROUGH QCON                          01-ABAL
00024 1C2C           0016       124          L     R4,SWITCH      RESTORE R4                                00005900
00025 DF10           002A 0004  125          B     POLY           SQRT, AND RETURN TO POLY                  00006000
                                126 *                                                                       00006100
00026 48FF 003E      0066 003E  127 SMALL    CE    F0,UNFLO                                                 00006200
00028 DE48           003B 0012  128          BNH   TESTS                                                    00006300
00029 66E6                      129          MER   F6,F6          X**2 IN F6                                00006400
                                130 *                                                                       00006500
                                131 * COMPUTE ASIN(Z) HERE BY RATIONAL FUNCTION                             00006600
                                132 *                                                                       00006700
0002A 7CF7 0046      0072 0046  133 POLY     LE    F4,C2                                                    00006800
0002C 54E6                      134          AER   F4,F6                                                    00006900
0002D 7AF7 0041      0070 0041  135          LE    F2,D2                                                    00007000
0002F 6AE4                      136          DER   F2,F4          (C2+Z**2)/D2                              00007100
00030 52F7 003C      006E 003C  137          AE    F2,C1                                                    00007200
00032 52E6                      138          AER   F2,F6          C1+Z**2+(C2+Z**2)/D2                      00007300
00033 66F7 0037      006C 0037  139          ME    F6,D1                                                    00007400
00035 6EE2                      140          DER   F6,F2                                                    00007500
00036 7BF7 002C      0064 002C  141          LE    F3,ROUND       ROUNDING NUMBER IN F3,                    00007600
00038 7AE0                      142          LER   F2,F0          AS LOW HALF OF 2Z.                        00007700
00039 60E6                      143          MER   F0,F6                                                    00007800
0003A 50EA                      144          AEDR  F0,F2          ROUNDING OCCURS HERE                      00007900
                                145 *                                                                       00008000
                                146 *  REVERSE REDUCTIONS TO GIVE ACTUAL ANSWER                             00008100
                                147 *                                                                       00008200
0003B B3E4 A000           A000  148 TESTS    TRB   R4,X'A000'  IF ACOS ENTRY                                00008300
0003D DD0C           0041 0003  149          BNM   TSTNEG         OR |X|>1/2 BUT NOT BOTH,                  00008400
0003E 78E8                      150          LECR  F0,F0          SUBTRACT RESULT                           00008500
0003F 50FF 0021      0062 0021  151          AED   F0,PIOV2       FROM PI/2                                 00008600
                                152 *                                                                       00008700
00041 B3E4 4000           4000  153 TSTNEG   TRB   R4,X'4000'  COMPLEMENT                                   00008800
00043 DE20           004C 0008  154          BNO   EXIT           IF ARGUMENT WAS NEGATIVE.                 00008900
00044 78E0                      155          LER   F0,F0          CHECK FOR ARG=0(HARDWARE BUG)             00009000
00045 DC04           0047 0001  156          BZ    NOLECR                                                   00009100
00046 78E8                      157          LECR  F0,F0                                                    00009200
00047 B3E4 C000           C000  158 NOLECR   TRB   R4,X'C000'     AND ADD PI TO                             00009300
00049 DE08           004C 0002  159          BNO   EXIT           RESULT IF ACOS ENTRY                      00009400
0004A 50F7 0014      0060 0014  160          AE    F0,PI          AND ARGUMENT<0.                           00009500
                                161 *                                                                       00009600
------------------------------------------------------------------------------------------------------------------------
         ACOS -- SINGLE PRECISION INVERSE SINE-COSINE FUNCTION                                                 PAGE    4
  LOC  OBJECT CODE   ADR1 ADR2      SOURCE STATEMENT                                               ASM101S 0.00 10/20/24
0004C 7EF8 0012      0012       162 EXIT     LED   F6,SAVE6       RESTORE F6,F7                             00009700
                                163          AEXIT AND                                                      00009800
                                164+*********RETURN TO CALLER********************************************** 01-AEXIT
0004E                           165+         DS    0H                                                       01-AEXIT
0004E 97E8                      166+$RET1    SRET  7,0        RETURN TO CALLER                              01-AEXIT
                                167+*********************************************************************** 01-AEXIT
                                169 *                                                                       00009900
0004F DC56           003B 0015  170 EXPRESS  BZ    TESTS          SKIP POLYNOMIAL IF |X|=1                  00010000
                                171          AERROR 10                                                      00010100
                                172+*********ISSUE SEND ERROR SVC****************************************** 01-AERROR
00050 C9FB 0074      0074       173+         SVC   AERROR1       ISSUE SEND ERROR SVC                       01-AERROR
                                174+*********SEND ERROR SVC RETURNS CONTROL FOR STANDARD FIXUP************* 01-AERROR
00052 78F7 000E      0062 000E  175          LE    F0,PIOV2                                                 00010200
00054 B3E4 8000           8000  176          TRB   R4,X'8000'     ASIN CALL?                                00010300
00056 DC5A           0041 0016  177          BZ    TSTNEG         YES, CHECK SIGN                           00010400
00057 B3E4 4000           4000  178          TRB   R4,X'4000'     X=-1?                                     00010500
00059 DC0C           005D 0003  179          BZ    RET0           NO,RETURN 0                               00010501
0005A 78F7 0004      0060 0004  180          LE    F0,PI          YES, RETURN PI                            00010502
0005C DF46           004C 0011  181          B     EXIT                                                     00010503
0005D 58E0                      182 RET0     SER   F0,F0                                                    00010504
0005E DF4E           004C 0013  183          B     EXIT                                                     00010600
                                184 *                                                                       00010700
00060                           185          DS    0F                                                       00010800
00060 413243F7                  186 PI       DC    X'413243F7'    PI                                        00010900
00062 411921FB                  187 PIOV2    DC    X'411921FB'    PI/2                                      00011000
00064 5FFFFFFF                  188 ROUND    DC    X'5FFFFFFF'                                              00011100
00066 3E100000                  189 UNFLO    DC    X'3E100000'     16**(-3)                                 00011200
00068 41100000                  190 ONE      DC    X'41100000'    1.0                                       00011300
0006A 40800000                  191 HALF     DC    X'40800000'    0.5                                       00011400
0006C C08143C7                  192 D1       DC    X'C08143C7'    -0.5049404                                00011500
0006E C13B446A                  193 C1       DC    X'C13B446A'    -3.7042025                                00011600
00070 C11406BF                  194 D2       DC    X'C11406BF'    -1.2516474                                00011700
00072 C11DB034                  195 C2       DC    X'C11DB034'    -1.8555182                                00011800
                                196          ACLOSE                                                         00011900
                                197+         LTORG                                                          02-ERRPARMS
                                198+****************ERROR PARAMETER AREA*********************************** 02-ERRPARMS
00074                           199+#LACOS   CSECT                                                          02-ERRPARMS
                                200+****  ACOS SENDS THE FOLLOWING ERROR                                    02-ERRPARMS
                                202+****  ERROR NUMBER 10 IN GROUP 4                                        02-ERRPARMS
00074 0014                      204+AERROR1  DC    H'20'          SVC CODE FOR SEND ERROR                   02-ERRPARMS
00075 040A                      205+         DC    Y(4*256+10) 8 BIT GROUP AND NUMBER                       02-ERRPARMS
                                206+****************END OF ERROR PARAMETER AREA**************************** 02-ERRPARMS
                                207+         END                                                            01-ACLOSE

As I hinted earlier, one nice feature (for us!) of the HAL/S AP-101S runtime library is that we have contemporary assembly-listings created by the original assembler, which is of great help in verifying that ASM101S produces correct results. ASM101S tries to mimic the assembly listings produced by that original assembler, but not obsessively so. If you're interested — and if you have a really big display or multiple monitors! — clicking this link hopefully opens up the original assembly listing in a new window or tab (depending on your browser's configuration), so that you can visually compare the contemporary and new listings side by side.
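If you'd rather not rely on eyeballs alone, the location and object-code columns of two listings can also be compared mechanically. The following is only a sketch and not part of ASM101S: the function names are hypothetical, and the column layout (a 5-digit hexadecimal location, one space, then object code in groups of 4 hexadecimal digits) is simply assumed from the listing shown above.

```python
import re

# Assumed layout: 5 hex digits of location, one space, then object code
# written as 4-hex-digit groups optionally separated by single spaces.
CODE_LINE = re.compile(r"^([0-9A-F]{5}) ([0-9A-F]{4}(?: ?[0-9A-F]{4})*)(?= |$)")

def object_code(listing_lines):
    """Return (location, code) pairs for the lines that generate object code."""
    pairs = []
    for line in listing_lines:
        match = CODE_LINE.match(line)
        if match:
            pairs.append((match.group(1), match.group(2).replace(" ", "")))
    return pairs

def first_mismatch(lines_a, lines_b):
    """Return the first differing pair of entries, or None if all agree."""
    a, b = object_code(lines_a), object_code(lines_b)
    for entry_a, entry_b in zip(a, b):
        if entry_a != entry_b:
            return entry_a, entry_b
    if len(a) != len(b):
        return ("length", len(a)), ("length", len(b))
    return None
```

Lines without object code (comments, DS, EQU, page headings) are skipped, so differences in listing cosmetics between the two files don't produce false alarms.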

Macro-Library Folders

AP-101 assembly language is a macro language. The Shuttle software developers made constant use of these macro capabilities, to the point where it's difficult even to find an AP-101 assembly-language file that is not dependent on macros. Thus, we have to understand various things about this macro capability from the very beginning, rather than concentrating merely on translation of AP-101 instructions into object code and leaving the topic of macros for future consideration. You can read about it for yourself in the IBM 360 assembler language manual, in which the discussion of macros occupies the entire 2nd half of the document. Learning the syntax and other usage details of macros is at least as important for understanding the flight software as understanding the AP-101 instruction set, and is likely more difficult.

But I won't talk here about the technical details of macro definitions or expansions. Rather, here I merely need to talk about the organizational principles of the associated source-code files.

When assembling an assembly-language file whose code depends on macros, there are three different ways in which the definitions of those macros may be made available to the code using them:

Macro definitions may explicitly appear within the assembly-language file in which they're used, in which case the scope of those macro definitions is that file alone.
Macro definitions may appear in a "macro library", and the macros defined in the library are accessible to any assembly-language file using that library. Speaking generally, the HAL/S compiler's runtime library (which is written in AP-101 assembly language) has a macro library known as RUNMAC, whereas the Shuttle flight software has a macro library known as MLIB80. But each release of the flight software has a version of MLIB80 specific to that release.
Macro definitions may appear, along with non-macro code, in AP-101 assembly-language files "included" by other AP-101 assembly-language files via an assembly-language pseudo-op called COPY. (Note: COPY'd files supposedly cannot contain macro definitions in System/360, but can do so in AP-101.)

In the current implementation of ASM101S, macro libraries are literally just folders of assembly-language files. Recall from the preceding section that ASM101S is invoked with one or more optional command-line arguments of the form --library=LIBRARY. LIBRARY is just a path to a macro-library folder. Macro definitions in any libraries specified in this manner are loaded by the assembler along with the specific source-code file(s) being assembled, thus automatically making all of the macro definitions in that library available during the assembly process.

Relative to where AP-101 assembly-language source code is stored in the source-code tree, ASM101S would typically be used with either the option --library=../RUNMAC or --library=../MLIB80, assuming the current working directory was the one storing the source-code files being assembled.
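As an illustration of how a repeatable --library option can be collected, here is a minimal sketch using Python's standard argparse module. I make no claim that ASM101S parses its command line this way; the function name and the source-file name ACOS.asm in the usage comment are hypothetical.

```python
import argparse

def parse_command_line(argv):
    """Collect zero or more --library folders plus the source files."""
    parser = argparse.ArgumentParser(prog="ASM101S")
    # --library may be given several times; each use appends one folder.
    parser.add_argument("--library", action="append", default=[],
                        metavar="LIBRARY", help="path to a macro-library folder")
    # Assumption: source files are given as positional arguments.
    parser.add_argument("sources", nargs="*", metavar="SOURCE")
    return parser.parse_args(argv)

# Usage (hypothetical file name):
#   args = parse_command_line(["--library=../RUNMAC", "ACOS.asm"])
#   args.library is the list of library folders, args.sources the source files.
```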

There's a slight problem, though, in that for some reason, all of the assembly-language files included via COPY pseudo-ops are also located within the macro libraries, intermixed with the files intended to contain only macros. But we do not want any of the code from these COPY'able files (even if there are some macro definitions within them) to be automatically made available during assembly. Rather, we want their code to be made available only when they're COPY'd! Or to put it differently, categories #2 and #3 of files containing macro definitions, as discussed above, must be mutually exclusive.

It's unclear to me why the Shuttle developers chose to house these mutually-exclusive categories of files together in the same directory, or as they thought of it, the same "Partitioned Data Set" (PDS), nor how they handled this ambiguity in their assembler. As for ASM101S, though, it handles the ambiguity as follows:

If a file in the macro library contains only macro definitions (and comments and blank lines), it is automatically loaded by the assembler (category #2).
If a file in the macro library contains anything other than macro definitions (and comments and blank lines), it is not automatically loaded, and can only be loaded via COPY (category #3).

ASM101S does not attempt to determine these distinctions for itself. Rather, the files in the macro library (or libraries) must have been preprocessed in such a manner as to determine which of the two categories each file in the library falls into. Each macro library is assumed to contain a file called MACROFILES.txt containing this information, and ASM101S simply uses the categorization provided by MACROFILES.txt. The format of MACROFILES.txt is that it lists the names of all of the macro-definition files, one per line. Full-line comments (having a semicolon in column 1) are also allowed.
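Reading a file in the MACROFILES.txt format just described takes only a few lines. This is a sketch rather than the code ASM101S actually uses; the function names are hypothetical, and I've assumed that blank lines are ignored and that each listed name can be joined directly onto the library path.

```python
from pathlib import Path

def read_macrofiles(library):
    """Return the names listed in LIBRARY/MACROFILES.txt, in file order."""
    names = []
    text = Path(library, "MACROFILES.txt").read_text()
    for line in text.splitlines():
        if line.startswith(";"):      # full-line comment: semicolon in column 1
            continue
        name = line.strip()
        if name:                      # assumption: blank lines are ignored
            names.append(name)
    return names

def autoload_paths(libraries):
    """Paths of all macro-definition files to load from the given libraries."""
    return [Path(lib, name) for lib in libraries for name in read_macrofiles(lib)]
```

Any file in the library folder that is not named in MACROFILES.txt would then be left alone until a COPY pseudo-op asks for it.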

Aside: A utility program (makeMACROFILES.py) is provided to create MACROFILES.txt. Admittedly, insofar as legacy code related to Shuttle flight software is concerned, this is probably of little interest to you, the end user, since all such preprocessing is likely to have been performed prior to you seeing any of the assembly-language source-code files anyway. But if you do happen to acquire flight software or other AP-101 software from sources other than Virtual AGC — send it to me! — then I suppose you might need to do the preprocessing yourself.
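To give a feeling for the kind of preprocessing involved, here is a rough sketch of how a file might be classified as "macro definitions only". It is not the logic of makeMACROFILES.py, which I'm not reproducing here. It assumes System/360-style conventions: definitions bracketed by MACRO and MEND, comments beginning with * or .* in column 1, and statement text confined to columns 1-71. It ignores continuation lines and nested definitions, and the function name is hypothetical.

```python
def is_macro_only(lines):
    """True if every statement lies inside a MACRO ... MEND definition."""
    inside_macro = False
    saw_macro = False
    for line in lines:
        text = line[:71].rstrip()          # drop columns 72-80 (continuation, sequence)
        if not text.strip() or text.startswith("*") or text.startswith(".*"):
            continue                        # blank line or comment
        fields = text.split()
        # A name in column 1 pushes the operation into the second field.
        if text[0] == " ":
            operation = fields[0]
        else:
            operation = fields[1] if len(fields) > 1 else ""
        if inside_macro:
            if operation == "MEND":
                inside_macro = False
        elif operation == "MACRO":
            inside_macro = saw_macro = True
        else:
            return False                    # a statement outside any macro
    # Assumption: a file with no macro definition at all is not a macro file.
    return saw_macro and not inside_macro
```

A file for which this returns True would be listed in MACROFILES.txt (category #2), while anything else would be reachable only via COPY (category #3).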

Installation of ASM101S

For Linux, Mac OS, or Windows. If the HAL/S compiler (HALSFC) has been installed per the instructions, then ASM101S will automatically be available as well.

If for some inexplicable reason you want to have ASM101S just for itself, without the HAL/S compiler (or any of the AP-101 source-code files) provided by the normal installation, you could instead just download the file ASM101S.py. You simply need Python 3 to run it.

Aside: If you choose the latter installation method, I can only assume that you already have some AP-101 source-code files that you want to assemble. You might consider sending them to me.

Where Are the AP-101 Assembly-Language Files?

HAL/S-FC runtime library:

Shuttle FSW OI-30.17 (presently restricted, sorry!):

Source-code files
Assembly listings
Macro library

Shuttle FSW OI-34.06 (presently restricted, sorry!):

Source-code files
Disassemblies
Macro library

AP-101 assembly-language source-code files can also be obtained somewhat indirectly by the trick of compiling HAL/S source-code files, and then extracting assembly language from the reports produced by the HAL/S compiler's code-generation pass. The page covering the HAL/S compiler shows you some examples of how to run the HAL/S compiler, but to summarize it briefly, suppose you have a HAL/S source-code file called SOURCE.hal in the current working directory. To compile it, you might use the command

HALSFC SOURCE.hal "" "LIST"

This operation creates a new folder containing the results of the compilation, as well as all of the intermediate files HALSFC creates during the compilation process. This folder will be the newest one with a name of the form "*.results", such as "HALSFC Wed Aug 21 07:14:36 AM CDT 2024.results". The file that's of interest to us in that folder will be the one called "pass2.rpt".
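If you find yourself doing this repeatedly, picking out the newest "*.results" folder can be automated. The sketch below is an assumption-laden convenience, not something shipped with HALSFC: the function name is hypothetical, and "newest" is judged by the folder's modification time rather than by parsing the timestamp embedded in its name.

```python
from pathlib import Path

def newest_pass2_report(directory="."):
    """Return the path of pass2.rpt in the most recently modified *.results folder."""
    folders = [p for p in Path(directory).glob("*.results") if p.is_dir()]
    if not folders:
        return None                   # nothing has been compiled here yet
    newest = max(folders, key=lambda p: p.stat().st_mtime)
    return newest / "pass2.rpt"
```

The function only builds the path; it does not check that pass2.rpt actually exists inside the folder.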

The pass2.rpt file cannot be directly assembled by ASM101S, because it contains a lot of stuff other than just assembly-language source code. But a script called "extractAP101S.py" has been provided that can extract just the AP-101S assembly language from pass2.rpt into a file that can indeed be directly assembled:

extractAP101S.py <pass2.rpt >SOURCE.asm
ASM101S SOURCE.asm

It happens that pass2.rpt itself is structured much like the assembly listings produced by ASM101S, in that it includes (among other things) not only the source code but also the binary form of the object code and some tables. This similarity can be exploited as an additional validity test for ASM101S, if you're as inclined to doubt its validity as I am.

Potential Differences from Expectations

Given that the connection between the AP-101 assembly language and the System/360 assembly language is undocumented (in surviving documentation) and is based only upon my own inferences, it's not surprising that there are some discrepancies between theory and practice, or between what I've implemented in ASM101S vs what's documented for IBM 360 assembly language. I'll explain those differences in the subsections below.

Assembly Listings

By an "assembly listing", I mean a printout from the assembler itself, typically showing how each line of source code has been transformed into binary codes, and providing useful extra information such as symbol tables and other cross references. You've seen an example (for the ACOS module) earlier.

Unfortunately, in spite of the claims to the contrary I've made in earlier sections, there are no surviving assembly listings produced by the original AP-101S assembler that I'm aware of, or even substantial fragments of such listings. (If you notice any, be sure to call my attention to them!) Therefore, without any of the original assembly listings to mimic, assembly listings as produced by ASM101S are unlikely to match those of the original assembler exactly, though of course I expect the same binary codes to be produced at the same addresses, since if not, the entire exercise of creating ASM101S in the first place would be pointless. But even if I had such original assembly listings, one wouldn't expect them to be any guide as to the wording or format of warning or error messages produced by the assembler, since any Space Shuttle flight software source code available for assembly would presumably be error-free, at least to the point that no warning or error messages are likely to appear in any assembly listings.

With that said, there is some assembly-listing-like material available, and that is what I have referred to in earlier sections. Among the files presently publicly visible, I refer to the folder called RUNLST in our source-code repository, which naively appears to contain assembly listings generated by assembling the files in the repository's RUNASM folder. RUNASM contains the AP-101S assembly-language source code for the runtime library used with AP-101S object code created by the HAL/S compiler, HAL/S-FC; it is assembled in conjunction with the macro-library folder RUNMAC, assisted by the interface-file folder ZCONASM.

Upon close inspection, however, the contents of RUNLST cannot actually have been produced directly by the original AP-101S assembler. And similarly for materials not presently publicly visible. I assume, rather, that listings produced by the original assembler were stored somehow, probably in a so-called partitioned data set (PDS), and that the listings in RUNLST were produced by running some kind of report generator on those stored listings. Here's a fragment of the listing RUNLST/ACOS that we've seen earlier:

						.
						.
						.
                                 28 ACOS     AMAIN ACALL=YES                                                00002200
                                 29+***********************************************************************
                                 30+*
                                 31+*        PRIMARY ENTRY POINT
                                 32+*
                                 33+***********************************************************************
00000                            34+ACOS     CSECT                                                          01-AMAIN
00000                            35+STACK    DSECT                                                          01-AMAIN
                                 36+*        DS    18H            STANDARD STACK AREA DEFINITION
00000                            37+         DS    F              PSW (LEFT HALF)                           01-AMAIN
00002                            38+         DS    2F             R0,R1                                     01-AMAIN
00006                            39+ARG2     DS    F              R2                                        01-AMAIN
00008                            40+         DS    F              R3                                        01-AMAIN
0000A                            41+ARG4     DS    F              R4                                        01-AMAIN
0000C                            42+ARG5     DS    F              R5                                        01-AMAIN
0000E                            43+ARG6     DS    F              R6                                        01-AMAIN
00010                            44+ARG7     DS    F              R7                                        01-AMAIN
                                 45+*        END OF STANDARD STACK AREA
00012                            46+SAVE6    DS    D              TO SAVE REGISTERS F6,F7                   02-00025
00016                            47+SWITCH   DS    F              TO SAVE R4 ACROSS INTRINSIC CALL          02-00026
00018                            48+STACKEND DS    0F             END OF COMBINED STACK AREA                01-AMAIN
00000                            49+ACOS     CSECT                                                          01-AMAIN
0000000                          50+         USING STACK,0        ADDRESS STACK AREA                        01-AMAIN
00000 E0FB 0018      0018        51+         IAL   0,STACKEND-STACK SET STACK SIZE                          01-AMAIN
00002 B624 0000      0009 0000   52+         NIST  9(0),0         CLEAR ON ERROR INFO (LCL DATA PTR)        01-AMAIN
                                 54 *COMPUTES ARC-COSINE(X) OF SINGLE PRECISION SCALAR                      00002300
                                 55          INPUT F0             SCALAR SP                                 00002400
0000000                          56+F0       EQU   0                                                        01-INPUT
                                 58          OUTPUT F0            SCALAR SP RADIANS                         00002500
						.
						.
						.

To anybody who is familiar with assembly language, this certainly looks like an assembly listing produced by an assembler, so why do I say that it's not? The first clue is the line numbering: There's a line 52 and a line 54, but no line 53. And there are lines 56 and 58, but no line 57. Admittedly, it's not 100% certain why that is, but having tried to track it down, it appears to me that both of those gaps correspond to uses of the SPACE pseudo-op appearing in expansions of the AMAIN and INPUT macros respectively. According to the assembly-language manual, "The SPACE instruction is used to insert one or more blank lines in the listing." And if so, where are the blank lines that should have been inserted?
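Gaps of this kind are easy to hunt for mechanically. The sketch below assumes, based only on the listings shown on this page, that the statement number is right-justified in columns 31-35 of each listing line; the function names are hypothetical.

```python
def statement_number(line):
    """Return the statement number of a listing line, or None if it has none."""
    field = line[30:35].strip()   # assumed: number right-justified in columns 31-35
    return int(field) if field.isdigit() else None

def numbering_gaps(listing_lines):
    """Return statement numbers skipped between consecutive numbered lines."""
    gaps = []
    previous = None
    for line in listing_lines:
        number = statement_number(line)
        if number is None:
            continue              # page headings, rules, blank lines
        if previous is not None and number > previous + 1:
            gaps.extend(range(previous + 1, number))
        previous = number
    return gaps
```

Fed the fragment above (with any tabs expanded to spaces so that the columns line up), this should report 53 and 57 as the missing statement numbers.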

Another clue, not apparent from the fragment above, is in the number of lines per page of the printout. Originally, an assembly listing would have been output to a line printer having (nominally) ~55 lines per page, whereas the files in RUNLST have about 80 lines per page. Nor are there any embedded form-feed characters or other means to advance to the top of the next page before a page heading is printed. So I would again infer that these are not the original assembly listings. (On the other hand, I wasn't there, and I don't really know what printers were available to the developers, so arguing merely from the number of lines on the page isn't as conclusive as I might like.)

Regardless, the files of RUNLST are the best guide available as to the format of assembly listings, and hence ASM101S mimics that format to the extent feasible (i.e., to the extent not too pathetically obsessive), plus the addition of convenient indications of page breaks.

So when I refer throughout this webpage to existing/surviving/legacy/contemporary "assembly listings", keep in mind that I'm referring to legacy reports such as those in RUNLST and not literally to assembly listings produced by the original assembler.

Character Set

The AP-101 character set does not match that of the System/360 assembler. The latter is the EBCDIC character set, or rather the variation of EBCDIC listed in Appendix A of the System/360 assembly-language manual.

On the other hand, as far as I can tell, the AP-101 assembly-language character set is not defined anywhere, and can only be inferred indirectly.

Examining character strings appearing in object files output by the HAL/S compiler HAL/S-FC, I find that quoted strings in HAL/S are in fact encoded in the ASCII-like character set of the Space Shuttle's Display Electronics Unit (DEU). Whereas all other text in HAL/S source code, such as symbol names, is encoded in EBCDIC. There are no corresponding surviving object files produced by assembly (rather than compilation), but my tentative operating principle is that all text in object files produced by the assembler is encoded in EBCDIC. For more explanation, as well as a listing of the DEU character set and its encoding, see the discussion of character encoding on the HALLINK101S page.

But wait, there's more! Any Space Shuttle flight-software source code available from Virtual AGC will have been "anonymized" by replacing personal names or initials with randomized identifiers beginning with either the ASCII caret (^) or backslash (\) characters, and thus either of these characters may appear in HAL/S or AP-101S source code even though they do not appear in the EBCDIC character set.

Finally, for technical reasons, ASM101S wants to reserve some ASCII character absent from the EBCDIC character set to represent breaks between punch-cards and their continuation cards (if any). The ASCII brace characters { and } meet these criteria. Therefore, ASM101S reserves them for its own internal purposes, and they should not be used in any newly-written AP-101 assembly-language source code, if such a thing ever exists. They are not used in any extant AP-101 assembly-language source code available to me.

Instruction Aliases

In IBM 360 Basic Assembly Language (BAL), various aliases exist for the branch instructions BCR and BC. These are described in Figure 4-1 of the assembler-language manual. While it is tempting to say that Figure 4-1 should be accepted as-is for AP-101S assembly language, that's unfortunately impossible: Conditional-branch instructions encode a "mask" to be applied to the CPU's condition codes, but the mask is 4 bits wide for System/360 and only 3 bits wide for AP-101S.

Something has to give! But Figure 4-1 does serve as a starting point for reverse-engineering AP-101 aliases for conditional-branch instructions. Here's my own list of AP-101S mnemonics for these branch instructions, grouped by condition-code mask. Except where otherwise indicated, they're all aliased to the BC instruction; but where marked in parentheses, they're instead aliased to BCR or BVC.

NOP,NOPR(BCR) — No Operation.
BH,BO,BP — Branch on High, Branch Over, Branch on Plus
BL,BM,BN — Branch on Low, Branch on Minus, Branch on Negative
BNE,BNZ — Branch on Not Equal, Branch on Not Zero
BE,BZ — Branch on Equal, Branch on Zero
BNL,BNM,BHE,BNN — Branch on Not Low, Branch on Not Less Than, Branch on Higher or Equal, Branch on Not Minus
BNH,BNP,BLE,BNO,BNC(BVC) — Branch on Not High, Branch on Not Plus, Branch on Less-or-Equal, Branch Not Over, Branch on No Carry.
B,BR(BCR) — Unconditional Branch

Note: While the mnemonics and condition masks in the list above are accurate (I hope!), the textual descriptions are less certain and should be taken with a grain of salt.

Aside: While I call this "aliasing to BC", in analogy to System/360, that's not exactly what's going on in AP-101S. In fact, in AP-101S each of these branching instructions (except NOPR and BR) is encoded as one of three different instructions (BC, BCB, or BCF), with the particular instruction chosen being dependent on the direction and distance of the branch being attempted. Similarly, a BCT instruction (branch on count) may instead generate the shorter machine code BCTB (branch on count backward) for short enough loops.

LHI: Besides the branch-instruction aliases, Shuttle flight-software code uses the operator LHI, but without any AP-101 instruction or any macro definition corresponding to it. There is such an instruction in IBM 360 assembly language. The AP-101S POO notes in its discussion of the LA instruction that there is a particular configuration of operands for which LA will be "functionally equivalent to a LOAD HALFWORD IMMEDIATE instruction". My guess is that the original assembler therefore accepted the mnemonic LHI but silently transformed it into the appropriate LA instruction. ASM101S treats it in that manner as well.

SHI: Similarly, flight software uses the non-existent SHI instruction. The program comments at those points clearly indicate that this is a kind of subtract-immediate instruction, presumably Subtract Halfword Immediate. Unlike the case of LHI, there is no corresponding SHI instruction for System 360. Nevertheless, we might suppose that the case is still similar, in that this could be an alias for (perhaps) a particular configuration of operands for some other AP-101 instruction. Fortunately, we have plenty of examples of assembly listings for code using SHI. Consider this example:

B0E5  FFFE                   SHI    R5,2

The value 0xFFFE is a halfword with the value -2, which leads us to suspect that this is actually an addition. There is indeed an Add Halfword Immediate instruction (AHI), and "AHI R5,-2" would indeed assemble as shown.
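The arithmetic behind that observation can be checked with a few lines of Python. This is only an illustration of 16-bit two's-complement encoding; the helper names to_halfword and from_halfword are hypothetical and are not taken from ASM101S.

```python
def to_halfword(value: int) -> int:
    """Return the 16-bit two's-complement bit pattern of a signed integer."""
    if not -0x8000 <= value <= 0x7FFF:
        raise ValueError("value does not fit in a signed halfword")
    return value & 0xFFFF


def from_halfword(bits: int) -> int:
    """Interpret a 16-bit pattern as a signed integer."""
    return bits - 0x10000 if bits & 0x8000 else bits


# The immediate operand seen in the listing for "SHI R5,2".
print(f"{to_halfword(-2):04X}")   # FFFE
print(from_halfword(0xFFFE))      # -2
```

In other words, subtracting 2 and adding the halfword 0xFFFE are the same operation on a 16-bit quantity, which is consistent with SHI being assembled as an AHI with a negated operand.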

LACR: There is no corresponding System/360 instruction to guide our thinking. However, there are lots of examples in AP-101 assembly listings, such as those for the CTOI.txt file of the HAL/S-FC runtime library. LACR is seen to be a register-to-register operation. For (say) general-registers N and M, it assembles to the bit pattern 11101nnn 11101mmm. This is the same pattern that the LOAD ARITHMETIC COMPLEMENT (LCR) instruction assembles to. Therefore, LACR is nothing more than a synonym for LCR.
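As a small illustration of the bit pattern just described, here is a sketch that builds the halfword 11101nnn 11101mmm from two register numbers. The function name is hypothetical, and the sketch says nothing about which operand of LACR corresponds to n and which to m; it only reproduces the pattern quoted above.

```python
def encode_lacr_pattern(n: int, m: int) -> int:
    """Build the halfword 11101nnn 11101mmm for general registers n and m."""
    for reg in (n, m):
        if not 0 <= reg <= 7:
            raise ValueError("general registers are numbered 0 through 7")
    high_byte = (0b11101 << 3) | n   # 11101nnn
    low_byte = (0b11101 << 3) | m    # 11101mmm
    return (high_byte << 8) | low_byte


print(f"{encode_lacr_pattern(5, 2):04X}")  # EDEA
```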

PC: Similarly, this undocumented instruction is found from available assembly listings to assemble as a synonym for MVH (move halfword). There's no rationale obvious to me for the specific mnemonic "PC" for this operation.

Unused Pseudo-ops

Not all pseudo-ops described in the System/360 assembler manual appear in surviving AP-101 assembly-language source code. I've chosen to believe that rather than the omissions being coincidental, those pseudo-ops are instead specific to System/360 and thus had been entirely omitted from AP-101 assembly-language. Admittedly, that inference is probably wrong in the case of certain of the pseudo-ops. Nevertheless, they have not been implemented in ASM101S.

The omitted pseudo-ops are:

Obviously, this list is subject to change, if legacy AP-101 assembly-language source code using any of these pseudo-ops is discovered.

Mystery Pseudo-ops

The SPOFF and SPON pseudo-ops — if they are pseudo-ops — seem typically to be used in pairs: SPOFF is used to disable something unknown, then an instruction or two later, SPON is used to re-enable whatever it was that SPOFF disabled. They are not pseudo-ops in IBM 360 assembly language, and hence must be specific to AP-101S.

Fortunately, we have a few contemporary assembly listings in which these pseudo-ops appear in the source code, and thus their effect can be observed somewhat. They do not generate any binary, hence they are definitely not instructions of any kind. Furthermore, they do not affect whether or not the source code they enclose is assembled, nor whether that source code appears in the assembly listing.

I would tentatively conclude that at least for the moment they can simply be ignored, and that's what ASM101S does with them for now.

Forbidden Pseudo-Ops in `COPY`'d Files

The System/360 assembler manual tells us that assembly-language files included in other assembly-language files via the COPY pseudo-op cannot contain various other pseudo-ops, two of which are MACRO and MEND. That implies that a COPY'd file cannot contain any macro definitions. Nevertheless, Space Shuttle flight software has file inclusions that violate this restriction. Specifically, the files MLIB80/MACSMITH.asm and MLIB80/MACROS.asm do contain macro definitions, and yet are themselves COPY'd into other assembly-language files. Consequently, this restriction (at least insofar as MACRO and MEND are concerned) does not apply in AP-101 assembly-language.

Macro-Definition Prototypes

The assembler manual tells us that

"The macro instruction prototype statement (hereafter called the prototype statement) specifies the mnemonic operation code and the format of all macro instructions that refer to the macro definition. It must be the second statement of every macro definition."

For example, in a macro definition such as

         MACRO
         MYMACRO   &ARG1,&ARG2
         .
         .
         .
         MEND

no other statements must appear between the first two lines shown here.

In contradiction to the claim in the manual, though, there are instances in flight-software code in which there are comments between these lines, as in

         MACRO
.* THIS IS A COMMENT
.* THIS IS ANOTHER COMMENT
         .
         .
         .
.* THERE WERE  A WHOLE LOT OF COMMENTS, SEE?
         MYMACRO   &ARG1,&ARG2
         .
         .
         .
         MEND

I guess we'd infer from this, and very reasonably, that comments are not "statements", but more importantly, that the macro prototype is not necessarily the second line in a macro definition.
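A minimal sketch of what this implies for a parser: rather than taking the line immediately after MACRO as the prototype, skip over any full-line comments first. The function below is hypothetical and is not the ASM101S implementation; it assumes that full-line comments begin with "*" or ".*" in column 1, and (a further assumption of mine) that blank lines may be skipped as well.

```python
def find_prototype(lines):
    """Return the index of the prototype line of the first macro definition.

    `lines` is a list of source lines.  Returns None if no MACRO line, or no
    prototype following it, is found.
    """
    in_macro = False
    for index, line in enumerate(lines):
        if not in_macro:
            # MACRO appears in the operation field, with a blank name field.
            in_macro = line[:1] == " " and line.split()[:1] == ["MACRO"]
        elif line.strip() and not line.startswith(("*", ".*")):
            return index
    return None


source = [
    "         MACRO",
    ".* THIS IS A COMMENT",
    ".* THIS IS ANOTHER COMMENT",
    "         MYMACRO   &ARG1,&ARG2",
    "         MEND",
]
print(find_prototype(source))  # 3
```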

Aside: I don't know if anybody will read these words, ever, but my sixth sense tells me that some folks who do might be smugly saying to themselves right now that "of course full-line comments are not 'statements' in any language, so what's this fool on about?" As it happens, on p. 69 of the assembler manual, we find a section actually entitled "Comments Statements", which proceeds to define the term comments statement as being precisely the thing we're discussing right now.

Silly Suffixes

Aside: AP-101 CPU instructions fall into 5 categories, depending on the pattern of operands they accept. These 5 categories are designated RR, RS, SRS, SI, and RI. The differences between these relate to the number of operands and the means of addressing them, but the specifics aren't important for our discussion here.

All AP-101 CPU instructions of type RS can optionally have suffixes "@", "#", or "@#" added to their mnemonics. For example, just as there is an SCAL instruction of type RS, there are also SCAL@, SCAL#, and SCAL@# instructions of type RS.

To be picky about it, this usage is indeed documented, but it took me so long to figure out that I thought I should take explicit notice of it here anyway.

The AP-101S POO tells us that

"... [@] [#] indicates that the use of indirect addressing and/or autoindexing is optional. For example, [instruction mnemonic] M specifies direct addressing without autoindexing, while M# specifies direct addressing with autoindexing."

And in case it's not obvious to you what the POO means by "indirect addressing" and/or "autoindexing", there is much greater detail in the POO's explanation of the general characteristics of RS-type instructions, though you won't be any wiser about the origin or rationale of the term "autoindexing" after reading the explanation than you were beforehand.

Aside: As usual, I suppose, the implications of this are more complex than might be supposed at first glance. Not to mention probably being more than you want to know! Technically, if the @ suffix is present, a bit-field known as "IA" in the encoded machine instruction is set to 1 (vs 0 in the absence of the suffix). Similarly, a bit-field known as "I" in the encoded instruction is set to 1 in the presence of the # suffix. The "more-complex" implication is that the "I" field may be set to 1 sometimes even in the absence of the # suffix. This happens, for example, in a branch backward to an earlier address (vs a branch forward to a later address). In such a case, the assembler sets the "I" bit on its own, without the suffix #, because one of the special addressing modes I alluded to earlier is the case where you have an RS-type instruction
OPCODE R1,D2(X2,B2)
in which

The index register (X2) being used is general register 0 (or absent from the operand)

And the base register (B2) being used is general register 3 (or absent from the operand)

And the "IA" bit-field is 0 (i.e., OPCODE has no suffix @)

And the "I" bit-field is 1

If so, then at execution time the displacement D2 is subtracted rather than added to the updated instruction counter. In particular, it allows backward branches from the current location.
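To make the add-versus-subtract behavior concrete, here is a rough sketch of the relationship between the "I" bit, the displacement, and the branch target. Both function names are hypothetical, and the sketch deliberately ignores details not covered in this section, such as the units of the displacement and the width of the displacement field.

```python
def encode_displacement(updated_ic: int, target: int):
    """Return (d2, i_bit) for a branch relative to the updated instruction counter.

    A backward branch gets i_bit = 1 and a positive displacement to subtract;
    a forward branch gets i_bit = 0 and a displacement to add.
    """
    if target < updated_ic:
        return updated_ic - target, 1
    return target - updated_ic, 0


def branch_target(updated_ic: int, d2: int, i_bit: int) -> int:
    """Recover the branch target: D2 is subtracted when the I bit is 1."""
    return updated_ic - d2 if i_bit else updated_ic + d2


d2, i_bit = encode_displacement(0x20, 0x18)
print(d2, i_bit)                              # 8 1
print(hex(branch_target(0x20, d2, i_bit)))    # 0x18
```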

Mystery Instructions or Macros

The following operators appear in flight software source code, and I am so far unable to determine if they are supposed to be instructions, macros, or pseudo-ops:

LB is used in the same manner as the instruction LA (load address), but it is a mystery what it signifies. I presently have no examples of assembly listings containing it from which I can deduce anything. One possibility is that it's simply an error never detected during Shuttle software development because the only known usage is in the definition of a macro (LD) which in fact is never invoked by any of the other source code. In other words, perhaps ASM101S shouldn't have been getting uppity by trying to parse that macro in the first place!
AREAD
BC$
STH$

Syntax of Various Fields

Recall that the four fields potentially present in a line of assembly language (whether instructions, pseudo-ops, macro invocations, etc.) are the name field (beginning in column 1), the operation field, the operand field, and the comment field. It turns out that parsing these fields is quite tricky, particularly the operand field.

I won't bore you with the details as to why this is so, but simply say that except for the comment field, each of these fields has been given its own simple BNF-style grammar in ASM101S, and sometimes multiple separate grammars that are applied for different contexts. This is, of course, transparent to the user of ASM101S, and is only significant to someone wishing to maintain the assembler.

Aside: "BNF", of course, stands for Backus-Naur form. Technically, the grammars are actually written in the modified EBNF (Extended Backus-Naur form) supported by the TatSu parser module for the Python language. See the Python source-code file fieldParser.py for the grammars themselves.

Nevertheless, even having added this level of complexity to the parser, it's not necessarily the case that the syntax parsed by ASM101S matches that parsed by the original assembler. For example, arithmetic expressions as specified by the System/360 assembly-language manual are constrained in various ways (e.g., they cannot begin with '+' or '-', cannot have more than 16 terms, and cannot have more than 5 levels of parentheses), but have not been endowed with the same constraints in ASM101S. On the other hand, I haven't necessarily bothered to implement theoretically-possible syntax that isn't present in actual flight software. Consequently, it's likely that ASM101S accepts a more-complex syntax in some contexts than did the original assembler, and vice-versa. Of course, ASM101S can be upgraded as needed to support such missing syntax, if it turns out to be desirable, whereas the original assembler cannot.

EQU and CPU Registers

The AP-101 CPU has 8 general registers, typically referred to symbolically in assembly language as R0 through R7, as well as 8 floating-point registers, typically referred to as F0 through F7. This is the same situation as in System/360 assembly language, except that in System/360 there are more of each kind of register. For example, an assembly-language instruction that performs an integer addition from register R7 to register R3 would look like this in either of the two assembly languages:

AR	R3,R7

But there's a catch. The assembly-language manual explains that

"All symbols that specify register numbers ... must be assumed to be equated elsewhere to absolute values."

In other words, the register-name symbols R3 and R7 in this example are not tokens or syntactical elements of the assembly language, and the pure syntax for the instruction example shown above should actually be this:

AR	3,7

The only reason that the former instruction would be accepted by the assembler, the manual is explaining, is that the full example should have read something like this:

R3      EQU     3
R7      EQU     7
	.
	.
	.
        AR      R3,R7

In turn, this means that in the macro libraries loaded by the assembler, we should find various EQUates similar to the ones above, for the general registers and floating-point registers. And indeed, for the macro libraries used for the Space Shuttle primary flight software (PASS), and backup flight software (BFS), we find exactly such declarations in the PASS module MLIB80/MACSMITH or the BFS module MLIB80/EQU, along with numerous other EQUates of a similar nature:

         .
         .
         .
F0       EQU   0              FP 0 = FLOATING POINT REGISTER            
F1       EQU   1                 1                                      
F2       EQU   2                 2                                      
F3       EQU   3                 3                                      
F4       EQU   4                 4                                      
F5       EQU   5                 5                                      
F6       EQU   6                 6                                      
F7       EQU   7                 7                                      
G0       EQU   0   SET 1      GR 0 = GENERAL REGISTER                   
G1       EQU   1                 1                                      
G2       EQU   2                 2                                      
G3       EQU   3                 3                                      
G4       EQU   4                 4                                      
G5       EQU   5                 5                                      
G6       EQU   6                 6                                      
G7       EQU   7                 7                                      
R0       EQU   0   SET 2      GR 0 = GENERAL REGISTER                   
R1       EQU   1                 1                                      
R2       EQU   2                 2                                      
R3       EQU   3                 3                                      
R4       EQU   4                 4                                      
R5       EQU   5                 5                                      
R6       EQU   6                 6                                      
R7       EQU   7                 7                                      
         .
         .
         .

Unfortunately, that's not the full story. Besides the flight software as such, AP-101 assembly-language files also exist in the runtime library provided by HAL/S-FC, the HAL/S compiler. Those assembly-language files reference the CPU general registers and floating-point registers just as any of the flight-software files do, except that there are no EQUates for those registers in any of those source-code files, nor in the macro library used by those files.

It is, of course, possible that the reason these EQUates are missing is that our HAL/S-FC runtime-library source code is incomplete. Unfortunately, there is no way to know whether that is correct or not. Another possibility is that the System/360 assembly-language manual is incorrect, and that the assembler does by default recognize the general registers Rn and floating-point registers Fn, and possibly other symbols, without explicit EQUates.

Lacking any palatable alternatives here, ASM101S assigns default values to the various register symbols, but allows those defaults to be overridden by explicit EQUates, if such are encountered.
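The idea can be sketched as a symbol table that is preloaded with defaults which any later EQU simply overwrites. This is an illustration only, not the actual ASM101S data structure: the class and method names are hypothetical, and I'm assuming here that the defaults cover just R0 through R7 and F0 through F7.

```python
# Hypothetical defaults: R0..R7 and F0..F7 each map to their register number.
DEFAULT_REGISTER_SYMBOLS = {
    f"{prefix}{number}": number for prefix in "RF" for number in range(8)
}


class SymbolTable:
    """Toy symbol table in which explicit EQUates override the defaults."""

    def __init__(self):
        self.values = dict(DEFAULT_REGISTER_SYMBOLS)

    def equ(self, name: str, value: int) -> None:
        """Record `name EQU value`, replacing any default of the same name."""
        self.values[name] = value

    def lookup(self, name: str) -> int:
        return self.values[name]


table = SymbolTable()
print(table.lookup("R3"))   # 3, from the defaults
table.equ("R3", 5)          # an explicit (if perverse) EQU wins
print(table.lookup("R3"))   # 5
```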

Type Attributes, `T'`

The System/360 assembly-language manual tells us that prefixing a symbolic variable (such as &A) with the notation T' returns an assembly-time string consisting of a single character that corresponds to the type of data the variable contains. For example, if &A were a character-string variable as declared via the GBLC or LCLC pseudo-op, then the assembler's preprocessor would replace T'&A by the single character C at assembly-time.

The manual lists 27 such "types", corresponding to the "letters" A-Z and $. (In the worldview of the assembler, 29 characters are defined as being "letters": A through Z, #, $, and @.) But it isn't clear at the present time how many of these types will be supported in ASM101S, since only the following seem to appear in actual flight-software source code:

C — character constant
N — self-defining term
O — omitted operand
# — (not mentioned in the manual)

It isn't entirely clear to me what # indicates. My current very tentative interpretation is this:

If applied to the formal parameter of a macro, it means that the formal parameter is a position-dependent parameter. I.e., that it is not a named parameter.
If applied to a character-string variable, the variable is assumed to contain the name of a formal parameter of a macro, and the formal parameter is position-dependent.

The `D'` Attribute

AP-101S assembly-language source code uses an attribute operator D', which is not defined in the assembly-language manual. From the way it is used, I infer that, when applied to an identifier, it returns "true" (1) if the identifier has been previously defined within the source code being assembled and "false" (0) if not. A typical usage would be something like

	AIF     (D'MYSYM).OKAY
	EXTRN	MYSYM
.OKAY   ...

Thus if the identifier is not defined, it allows the code to detect that condition and to mark the identifier as being declared externally.

`AIF` and `AGO`

The AIF and AGO pseudo-ops provide "goto" functionality (respectively conditionally or unconditionally) at assembly time (rather than at runtime). The System/360 assembly-language manual makes it clear that these "goto" operations can operate only within the same macro depth, and further, if within a macro, only within the same macro. For example, in the "pseudo-instruction"

AGO .MYSEQ

the locations of the pseudo-instruction itself and of the sequence symbol .MYSEQ could be both outside of any macro, or they could be within the same macro definition. But it could not be the case (say) that the pseudo-instruction was within a macro definition and the sequence symbol was within a macro invoked by that macro.

One important case about which the assembly-language manual says nothing, I think, is the case in which a source-code file is being imported via a COPY pseudo-op. Is it possible for the AGO or AIF pseudo-instruction to be in a file containing a COPY pseudo-op while the target sequence symbol is in the file being COPY'd? Or vice-versa?

ASM101S does not allow the case just mentioned. In other words in ASM101S, for any file being imported via COPY, any AGO/AIF pseudo-instruction and its target sequence symbol must reside within the same COPY'd file.

I do not presently know if this usage occurs within Shuttle flight software or not.

Arithmetical Peculiarities and Evaluation of Expressions

Certain arithmetical quirks are inherent in System/360 assembly language, and I must presume that these peculiarities carry over into AP-101S assembly language as well. Therefore, ASM101S retains these peculiarities rather than eliminating them.

The peculiarities I regard as worth noting are these:

As far as I can tell, there are no negative-number literals, nor are there any legal arithmetical expressions that are superficially equivalent to them. For example, you could not assign the symbolic variable &A the value -5 via a line of pseudo-code such as "&A SETA -5", because "-5" is neither a legal literal nor a legal arithmetic expression. (Nor would +5 be legal.) The assembly-language manual seems to advise using workarounds such as "&A SETA 0-5".
Division by 0 is allowed, and always results in 0. For example, 5/0 is equal to 0.
Arithmetic expressions are always integer, so the "/" operator is integer division. For example, 5/2 evaluates to 2. The manual does not explain what the result of an operation like (0-5)/2 would be; either -3 or -2 is plausible. Until the correct behavior is somehow determined, ASM101S uses the Python convention (i.e., it uses the Python // operator), which would result in -3 in this case.
Operator precedence: Consider an expression consisting of "terms" that are separated by the operators +, -, *, or /. ("Terms" is quoted here to distinguish the System/360 usage from the normal mathematical usage, in which terms are added to or subtracted from each other, while factors are multiplied or divided by each other.) These operations are performed in left-to-right order, except that multiplications or divisions are performed prior to additions or subtractions. In particular, in an expression like 3*5/2, division does not have a higher precedence than multiplication, so it is evaluated as (3*5)/2 rather than as 3*(5/2).
The final result of evaluation of any arithmetic expression is always truncated to 24 bits (2's complement), but intermediate results are 32 bits.
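The division and truncation rules in the list above can be summarized in a short sketch. The function names are hypothetical and this is not the ASM101S source; it merely applies the stated rules (division by 0 gives 0, Python's // for integer division, and a final truncation to 24-bit 2's complement). The 32-bit width of intermediate results is not modeled.

```python
def divide(a: int, b: int) -> int:
    """Assembly-time division: x/0 is 0; otherwise Python floor division."""
    return 0 if b == 0 else a // b


def truncate24(value: int) -> int:
    """Truncate a final result to a signed 24-bit 2's-complement value."""
    value &= 0xFFFFFF
    return value - 0x1000000 if value & 0x800000 else value


print(divide(5, 0))            # 0
print(divide(5, 2))            # 2
print(divide(0 - 5, 2))        # -3, the Python convention
print(divide(3 * 5, 2))        # 7, i.e. (3*5)/2 rather than 3*(5/2)
print(truncate24(0x1000001))   # 1
```

Note that 3*(5/2) would instead give 6 under these rules, which is why the left-to-right ordering of * and / matters.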

On the other hand, ASM101S does remove some of the constraints of System/360 assembly-language arithmetical restrictions, namely:

Expressions can contain any number of terms, rather than being limited to a maximum of 16.
Expressions can have any depth of parentheses, rather than being limited to 5 levels.

Aside: Regarding peculiarities of my own making, as opposed to those of the language itself or the original assembler, I'm obliged to admit that I don't quite understand how to perfectly handle assembly-time evaluation of arithmetic expressions involving program labels: i.e., involving the addresses of symbols rather than the values of constants.

Instead, ASM101S uses an imperfect trick, making use of the facts that the address space of the AP-101S is limited to 24 bits and that the number of allowed control sections in a program (at least in System/360) is limited to 255. The address of a program label (prior to linking) is precisely an ordered pair of the form (control section, offset into control section), but performing arithmetical computations is easiest when these values can somehow be converted to single numbers rather than ordered pairs. The trick is to assign each control section a unique but randomized 64-bit value whose least-significant 24 bits are all 0, and to convert addresses of symbols to sums of these 64-bit values plus 24-bit offsets into the control sections. (I don't mean that the codes for the symbols are actually random, but rather that they are selected in a way that makes it unlikely to produce their values by common types of calculations.) In this way, calculations like SYMBOL+OFFSET or SYMBOL1-SYMBOL2 (for symbols in the same section) produce the expected results, and indeed, produce correct results for all correct expressions. Unfortunately it remains possible to combine symbols in an incorrect manner from two different control sections and get a result that appears to be in yet a third control section, which is incorrect. This potential is part of the reason for using 64-bit pseudo-addresses (and distributing the unique numerical codes for the control sections throughout a 40-bit space) rather than 32-bit pseudo-addresses (and distributing the unique numerical codes in an 8-bit space): It reduces to a very low level the probability of producing "fake" control sections in calculations.
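Here is a rough sketch of the pseudo-address idea, for illustration only. The class and method names are hypothetical, and I've substituted Python's secrets.randbits for whatever method ASM101S actually uses to choose its section codes; all that matters for the illustration is that each code is unique, nonzero, and has its low 24 bits clear.

```python
import secrets

OFFSET_BITS = 24
OFFSET_MASK = (1 << OFFSET_BITS) - 1


class SectionCodes:
    """Map (control section, offset) pairs to single 64-bit pseudo-addresses."""

    def __init__(self):
        self._codes = {}

    def code(self, section: str) -> int:
        """Return the section's code: a 40-bit value shifted above the offset."""
        if section not in self._codes:
            candidate = 0
            while candidate == 0 or candidate in self._codes.values():
                candidate = secrets.randbits(40) << OFFSET_BITS
            self._codes[section] = candidate
        return self._codes[section]

    def address(self, section: str, offset: int) -> int:
        return self.code(section) + (offset & OFFSET_MASK)

    def decode(self, value: int):
        """Return (section, offset), or None if the section code is unknown."""
        base = value & ~OFFSET_MASK
        for section, code in self._codes.items():
            if code == base:
                return section, value & OFFSET_MASK
        return None


codes = SectionCodes()
a = codes.address("ACOS", 0x10)
b = codes.address("ACOS", 0x04)
s = codes.address("STACK", 0x00)
print(a - b)                  # 12: a plain number, the section codes cancel
print(codes.decode(a + 2))    # ('ACOS', 18)
print(codes.decode(a + s))    # None: adding two labels yields no real section
```

With only 8 bits of section code, a nonsensical sum such as the last one would have a real chance of landing on a third, legitimate section; spreading the codes over 40 bits makes that very unlikely.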

According to the System/360 assembly-language manual, although EXTRN symbols can appear in expressions, they cannot be paired. This implies, I think, that they can be handled interoperably with the description in the preceding paragraph, by using unique but randomized 64-bit values with the lower 24 bits all 0 in place of those symbols.

I thought at first that the same trick could be used to handle calculations involving other not-yet-defined symbols. Unfortunately, such an attempt would be guaranteed to produce incorrect results in calculations like KNOWN-UNKNOWN, even if KNOWN and UNKNOWN both turned out to be members of the same control section. Therefore, the addresses of all symbols in the current file must be ascertained in a separate pass before computations of expressions involving such symbols are performed.

Relational Expressions Involving Strings

Among the types of expressions computed by the assembler at assembly-time for use with pseudo-ops such as SETB or AIF are the boolean expressions, of which one sub-type is relational expressions involving string values.

A relational expression is used to determine that two values (either two numbers or two strings) are equal (EQ), not-equal (NE), less-than (LT), less-than-or-equal (LE), greater-than (GT), or greater-than-or-equal-to (GE) each other. For example, the relational expression

3 LT 4

returns the value "true" (which in System/360 assembly language is numerically equivalent to 1) since 3 is less than 4.

System/360 assembly language shares the obnoxious (in my opinion!) property of string comparisons in the XPL language that a shorter string is always "less than" a longer string. For example,

'Z' LT 'AA'

returns "true".

Unfortunately, as far as I can tell, the System/360 assembly-language manual does not explicitly state the collation sequence to be used for comparing strings of equal length, though it seems to me to be implied that it is based on the EBCDIC encoding of the characters.

As for the AP-101 assembler, however, the conclusion is not so straightforward, and in fact is TBD.

If we use the "cross reference" (symbol table) printed in legacy assembly listings printed by the original assembler — an approach of dubious validity —, we find relationships like this:

digit < alphabetic < '$' < '#' < '='

This is consistent with neither an EBCDIC collation nor a DEU/ASCII collation:

(EBCDIC) '$' < '#' < '=' < alphabetic < digit
(DEU/ASCII) '#' < '$' < digit < '=' < alphabetic

Thus in the end we really don't know what collation sequence is appropriate. ASM101S temporarily pretends that the collation sequence is ASCII since that's the easiest to implement.
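The comparison rule described in this section, with the shorter string always sorting first and ASCII order used only as the tie-breaker among strings of equal length, can be sketched as follows. The function names are hypothetical. Python compares strings by code point, which coincides with ASCII order for ASCII characters.

```python
def string_lt(a: str, b: str) -> bool:
    """'a LT b': a shorter string is always less than a longer one."""
    if len(a) != len(b):
        return len(a) < len(b)
    return a < b   # equal lengths: fall back on ASCII (code-point) order


def collation_key(s: str):
    """Sort key giving the same ordering as string_lt."""
    return (len(s), s)


print(string_lt("Z", "AA"))    # True
print(string_lt("AA", "Z"))    # False
print(sorted(["AA", "Z", "B", "AB"], key=collation_key))  # ['B', 'Z', 'AA', 'AB']
```

If a different collation sequence is eventually established for the original assembler, only the equal-length branch would need to change.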

Character Expressions

Character expressions consist of text delimited by single-quotes, as for example 'HELLO', plus various additional flourishes that you can read about in the System/360 assembly-language manual but which I won't bother to rehash here.

One flourish which must be mentioned is the so-called substring notation, which can be used to extract a substring from a string, as in:

'HELLO'(start,length)

This means that the substring to be extracted begins at index start and is length characters in width.

One trivial detail which the manual doesn't seem to think worth explicitly mentioning, as far as I have been able to ascertain so far, is whether the indexing of the string characters is 0-based or whether it is 1-based. As curious as it seems, sticklers for detail might think this information could be valuable from time to time. If you read far enough into the manual, there are eventually a couple of examples which indirectly demonstrate that indexing of the string is 1-based.
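For illustration, here is how the 1-based substring notation maps onto Python's 0-based slicing. This is only a sketch; the function name is hypothetical, and out-of-range values of start or length are not checked.

```python
def substring(text: str, start: int, length: int) -> str:
    """Emulate 'text'(start,length), where start is 1-based."""
    first = start - 1  # convert the 1-based index to Python's 0-based index
    return text[first:first + length]
```

With this convention, '&R'(1,1) applied to R4 gives R, and '&R'(2,1) gives 4.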

Declaration of "SET Symbols"

Before describing the specific AP-101S versus System/360 issue associated with the items known as "SET symbols", let me summarize some of what the System/360 assembly-language manual has to say about them.

In System/360 assembly language there is the concept of symbols relevant only to the assembler in a preprocessing pass, in distinction to symbols representing addresses in the runtime memory of the assembled program. These symbols are distinguished in that their names are prefixed by the character '&'. Thus MYVAR might be a variable representing a memory location, whose contents can be modified by the assembly-language program when it is run, while &MYVAR might represent an assembly-time variable, assigned a value that can be manipulated during the assembly process, but that is not known or modifiable by the assembled program.

These assembler-only variables can be classified in a number of ways, one of which is that they can be of one of three mutually-exclusive types:

"Symbolic parameters" are the formal parameters found in macro definitions. They are assigned values at the time of the invocation of the macro, but cannot otherwise be changed.
"System variables" are assigned values by the assembler itself, conceivably different on each use of the variable, and cannot be changed by software. They're distinguished by the fact that they always begin not just with "&" but with the 4-character string "&SYS".
"SET symbols" can be explicitly created, assigned values, and reassigned values by software at will.

Here, we're concerned only with the last category, namely the SET symbols.

SET symbols can be categorized another way, namely by their datatypes, which cannot be changed once established. The three types are:

Integer
Boolean
Character string

Yet a third way that they can be characterized is as:

GBL: "global" variables accessible identically in all software, whether inside or outside of macro definitions and invocations.
LCL: "local" variables accessible only within a single invocation of the macro definition in which they are defined. (Or else if declared outside of all macro definitions, accessible only outside of all macro definitions or invocations.)

Prior to the first use of any SET symbol, it must be declared via one of the macro-language instructions GBLA, GBLB, GBLC, LCLA, LCLB, or LCLC. Any of these instructions also assigns an initial value to the symbol, either 0, False (0), or '' (empty string), depending on the datatype. For example, the instruction "LCLB &BOO" declares a local boolean SET symbol called &BOO and assigns it the default value False (numerically, 0).

After declaration, the value of a SET symbol may be changed (within its global or local scope, as appropriate) via one of the macro-language instructions SETA, SETB, or SETC.

Okay, that was the background, but here's the AP-101S specific issue: In actual AP-101S assembly-language source code, there are SET symbols modified by SETA, SETB, or SETC (or used in other manners) without any declaration via GBLA, GBLB, GBLC, LCLA, LCLB, or LCLC whatsoever (prior or otherwise), which is a possibility denied by the System/360 assembly-language manual.

For example, consider the INPUT macro, provided as part of the AP-101S runtime library by the original source code of the HAL/S compiler HAL/S-FC. It has four SET symbols that are used without declarations (&I, &LAST, &R, and &N), highlighted in green in the listing below:

         MACRO                                                          00000100
         INPUT &X                                                       00000200
         GBLA  &ENTCNT                                                  00000300
         GBLB  &INPUT(20),&LIB                                          00000400
         AIF   (N'&SYSLIST EQ 0).EMPTY                                  00000500
&INPUT(&ENTCNT) SETB 1                                                  00000600
         AIF   ('&X' EQ 'NONE').SPACE                                   00000700
&I       SETA  1                                                        00000800
&LAST    SETA  N'&SYSLIST                                               00000900
.LOOP    AIF   (K'&SYSLIST(&I) NE 2).BADREG                             00001000
&R       SETC  '&SYSLIST(&I)'                                           00001100
         AIF   ('&R'(1,1) NE 'F' AND '&R'(1,1) NE 'R').BADREG           00001200
         AIF   ('&R' EQ 'R0').BADREG                                    00001300
         AIF   (&LIB AND ('&R' EQ 'R1' OR '&R' EQ 'R3')).INVREG1        00001400
         AIF   (NOT &LIB AND '&R' EQ 'R4').INVREG2                      00001500
         AIF   (D'&R).NEXT                                              00001600
&N       SETC  '&R'(2,1)                                                00001700
&R       EQU   &N                                                       00001800
.NEXT    ANOP                                                           00001900
&I       SETA  &I+1                                                     00002000
         AIF   (&I LE &LAST).LOOP                                       00002100
.SPACE   SPACE                                                          00002110
         MEXIT                                                          00002200
.BADREG  MNOTE 4,' ILLEGAL REGISTER SPECIFICATION - &SYSLIST(&I)'       00002300
         AGO   .NEXT                                                    00002400
.INVREG1 MNOTE 4,'&R INVALID INPUT FOR PROCEDURE ROUTINE'               00002500
         AGO   .NEXT                                                    00002600
.INVREG2 MNOTE 4,'R4 INVALID INPUT FOR INTRINSIC'                       00002700
         AGO   .NEXT                                                    00002800
.EMPTY   MNOTE 4,'OPERAND REQUIRED'                                     00002900
         MEND                                                           00003000

What are we to make of this?

Upon considerable reflection, my inference is that the AP-101S has a built-in convenience feature, either not present or not documented in the System/360 assembler, namely this:

When a variable that has not previously been explicitly declared (by GBLx or LCLx) is the target of a SETx instruction, it is declared automatically by the assembler as if via LCLx.
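The inferred rule can be sketched as follows. This is a simplified model under stated assumptions, not the actual ASM101S data structure: the class and method names are hypothetical, arrays are ignored, and each SET symbol is stored as a [type, value] pair.

```python
DEFAULTS = {"A": 0, "B": False, "C": ""}  # initial values for SETA/SETB/SETC types


class MacroScope:
    """SET symbols visible during one macro invocation (hypothetical model)."""

    def __init__(self, globals_table: dict):
        self.globals_table = globals_table  # shared by all scopes
        self.visible_globals = set()        # globals declared here via GBLx
        self.locals_table = {}

    def gbl(self, kind: str, name: str) -> None:
        self.globals_table.setdefault(name, [kind, DEFAULTS[kind]])
        self.visible_globals.add(name)

    def lcl(self, kind: str, name: str) -> None:
        self.locals_table[name] = [kind, DEFAULTS[kind]]

    def set(self, kind: str, name: str, value) -> None:
        if name in self.visible_globals:
            entry = self.globals_table[name]
        else:
            if name not in self.locals_table:
                # The inferred convenience feature: an undeclared target of
                # SETx is declared automatically, as if via LCLx.
                self.lcl(kind, name)
            entry = self.locals_table[name]
        if entry[0] != kind:
            raise TypeError(f"{name} is not of type {kind}")
        entry[1] = value
```

In the INPUT macro above, a SETA to the undeclared &I would thus create a local integer symbol, while a SETA to &ENTCNT would still update the global declared by GBLA.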

Aside: If this inference is correct, it might seem naively that there's no need for the instructions LCLA, LCLB, or LCLC at all, since a SETA, SETB, or SETC could always be used instead. Upon closer inspection that's not true, since LCLx (like GBLx) can additionally be used to declare SET symbols as arrays, which a SETx instruction with this convenience feature could not. And even in the non-arrayed case, there are certainly instances in existing code in which LCLx is indeed used explicitly even though the described convenience feature would not require it. For example, consider this macro from the AP-101S runtime-library source code, which unlike the problematic macro listed above corresponds exactly to the System/360 assembly-language manual's pronouncements:
         MACRO                                                          00000100
&NAME    AERROR &NUM,&GROUP=4                                           00000200
         GBLA  &ERRCNT,&ERRNUMS(10),&ERRGRPS(10)                        00000300
         LCLA  &I                                                       00000400
         AIF   (&NUM GT 62).BADNUM                                      00000500
&I       SETA  &ERRCNT                                                  00000600
.DUPLOOP AIF   (&I LE 0).NEWERR                                         00000700
         AIF   (&NUM EQ &ERRNUMS(&I) AND &GROUP EQ &ERRGRPS(&I)).DUP    00000800
&I       SETA  &I-1                                                     00000900
         AGO   .DUPLOOP                                                 00001000
.NEWERR  ANOP                                                           00001100
&ERRCNT  SETA  &ERRCNT+1                                                00001200
&I       SETA  &ERRCNT                                                  00001300
&ERRNUMS(&I) SETA &NUM                                                  00001400
&ERRGRPS(&I) SETA &GROUP                                                00001500
.DUP     ANOP                                                           00001600
*********ISSUE SEND ERROR SVC****************************************** 00001700
&NAME    SVC   AERROR&I       ISSUE SEND ERROR SVC                      00001800
*********SEND ERROR SVC RETURNS CONTROL FOR STANDARD FIXUP************* 00001900
         MEXIT                                                          00002000
.BADNUM  MNOTE 12,'ERROR NUMBER GREATER THAN 62'                        00002100
         MEND                                                           00002200
As for the origin of such a convenience feature in the first place, I'd note that in addition to being "convenient", the complexity of some AP-101S macros could make some of those macros very difficult or impossible to implement otherwise. According to System/360 rules, all GBLx and LCLx instructions must appear not merely before SETx instructions involving the SET symbols they declare, but indeed prior to everything else. For example, GBLx instructions must appear immediately after the prototype line of a macro definition, with nothing intervening except comments, while LCLx instructions in turn must appear immediately after that. Thus if a macro definition depends on the flexibility of allowing a SET symbol to be declared in alternate ways under different circumstances, such as arrayed vs non-arrayed or integer vs character, the rules of the System/360 assembler likely would not allow it, because the alternate declarations would all have to appear, unconditionally, in the prescribed location. The rules of implicit declaration via SETx instructions, by contrast, basically allow non-arrayed local declarations to appear anywhere. So the convenience feature of implicit declaration, if it truly exists, could have arisen from necessity rather than from a desire for mere convenience. Not that "mere" convenience is to be sneered at. But that's just speculation on my part, with the answer lost in the mists of time past.

DC and DS Pseudo-Op Formats, and "Literals"

The System/360 assembly-language manual describes a quite-complex format for the operands of the DS and DC pseudo-ops used for allocating or initializing data memory. (The description takes about 11 pages, which is over 6% of the manual.) However, I see no point in implementing those features of this format which are not actually used in Space Shuttle flight-software source code. At present, I believe that the following features of the DC/DS format do not need to be supported in ASM101S:

Type codes L, P, Z, S, V, Q
Scale modifier
Exponent modifier

The System/360 manual uses the term literal to refer to an operand (for instructions) in a manner different from the way I have been using it (and different from what I think is current common usage). I have been using the term to describe strings of characters such as 1234 or 'HELLO WORLD', or perhaps X'3F7C'. System/360 (and presumably AP-101S) assembly language doesn't consider these to be "literals". Rather, the following would be considered "literals": =F'1234', =C'HELLO WORLD', or =X'3F7C'. The distinction, aside from the prefixed equals sign and other syntactic elements, is that the former are used directly by DC pseudo-ops or in some cases coded into instructions, whereas the latter are instead assembled into special areas of memory known as "literal pools" and only their addresses are coded into instructions.

The reason this is relevant here is that ASM101S doesn't support the full range of literals accepted by the System/360 assembler, and instead limits itself just to those forms of literals appearing in Space Shuttle flight software, plus any others I just feel like implementing. (The latter being a tiny number indeed!) Those supported forms of literals are:

=B'...'		Binary
=C'...'		Character
=D'...'		Double-precision floating point
=E'...'		Single-precision floating point
=F'...'		Fullword fixed-point
=H'...'		Halfword fixed-point
=X'...'		Hexadecimal
=Y(...)		Nearby address
=Z(...)		Remote address

Regarding the =F'...' and =H'...' datatypes, see here to understand the significant differences between how the AP-101S and System/360 assemblers treat them.

In addition to these bare forms, the System/360 assembly-language manual also describes a "length modifier" for them, in which the initial letter specifying the datatype is optionally followed by "Ln", where n is a decimal number indicating the number of bytes of memory allocated. Similarly, for fixed-point datatypes (i.e., =F and =H) the "scale modifier" ("Sn") is supported. If both are present, the scale modifier must follow the length modifier.
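As a rough sketch of the syntax just described (a type letter, an optional "Ln", an optional "Sn" that must follow the length modifier, then the quoted value), a regular expression along these lines could split a literal into its parts. The pattern and function name are hypothetical, the =Y(...) and =Z(...) forms are not covered, and signed scale values and embedded quotes are ignored.

```python
import re

# Type letter, optional length modifier, optional scale modifier, quoted value.
LITERAL_PATTERN = re.compile(
    r"=(?P<type>[BCDEFHX])"
    r"(?:L(?P<length>\d+))?"
    r"(?:S(?P<scale>\d+))?"
    r"'(?P<value>.*)'$"
)


def parse_literal(text: str):
    """Return (type, length, scale, value); absent modifiers are None."""
    match = LITERAL_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unsupported literal: {text}")
    length = int(match.group("length")) if match.group("length") else None
    scale = int(match.group("scale")) if match.group("scale") else None
    if scale is not None and match.group("type") not in "FH":
        # Scale modifiers are accepted only for the fixed-point types.
        raise ValueError(f"scale modifier not supported for: {text}")
    return match.group("type"), length, scale, match.group("value")
```

For example, =XL2'F' splits into type X, length 2, and value F, while =FS32'60E6' splits into type F, scale 32, and value 60E6.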

Aside: The System/360 manual also describes variations on the allowed modifiers, such as "L(e)" (where e is an arithmetic expression), scale modifiers for floating-point datatypes (=E and =D), an "exponent modifier" for fixed-point and floating-point datatypes, and so on. ASM101S doesn't support any of these variations since they don't appear in Shuttle flight software source code.

While ASM101S supports these length and scale modifiers, the length modifier in AP-101S assembly language does not appear to be used as described in System/360. Consider this AP-101S instruction, which appears in the MSTR module of the HAL/S-FC runtime-library:

00013 27F7 0011      0026 0011   57          N     R7,=XL2'F'                                               00002200
					     .
					     .
					     .
00024                            75+         LTORG                                                          02-ERRPA
00024 FFFF000F                   76                =X'FFFF000F'
00026 0000000F                   77                =XL2'F'
00028 FFFFFFF0                   78                =X'FFFFFFF0'

The length modifier "L2" in the literal =XL2'F' clearly indicates that even though the provided constant value (F) is only one hexadecimal digit, it must assemble to 2 bytes, i.e. to hexadecimal 000F. And yet, in reality, we find that it has assembled to 4 bytes, as hex 0000000F. In point of fact, since an =X literal by default assembles to an even number of bytes (according to the System/360 manual), the length modifier should not even have been necessary in the first place, since it would merely be explicitly expressing the default.

What are we to make of this? Unfortunately — or perhaps fortunately, depending on your point of view! — there are not enough length modifiers appearing in surviving AP-101S assembly-listing reports for us to form much of an opinion on the matter just by examining those reports. In point of fact, the length modifier I've given in the example above is the only length modifier I've found in surviving assembly listings. Perhaps in AP-101S, length modifiers express the number of halfwords in the assembled value rather than the number of bytes? That's the way ASM101S treats them at present.
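Under the working assumption that a length modifier counts halfwords rather than bytes, the =XL2'F' example can be reproduced with a sketch like this one. The function name is hypothetical, and values too large for the requested length are simply rejected by to_bytes rather than truncated.

```python
def assemble_hex_literal(digits: str, length_halfwords: int) -> bytes:
    """Assemble =XLn'digits', treating n as a count of halfwords (ASSUMPTION)."""
    value = int(digits, 16)
    n_bytes = 2 * length_halfwords  # one halfword is 2 bytes
    # Right-justify the value and pad on the left with zeros.
    return value.to_bytes(n_bytes, "big")
```

With this interpretation, assemble_hex_literal("F", 2) yields the 4 bytes 00 00 00 0F, matching the literal pool in the MSTR listing above.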

As a less-unruly example, scale modifiers appear to operate on an easier-to-understand basis than length modifiers. Thus, the literal =FS32'60E6' superficially appears to assemble to 60000000 (decimal, i.e. 0x03938700), but because of the scale modifier it is actually shifted rightward by 32 bit-positions, i.e. divided by 2³². So the value it assembles to is instead approximately 0.013969839. But that still doesn't mean that everything is exactly as in System/360. In AP-101S, the fixed-point literals have the interesting property (see the link given above about =F and =H datatype differences from System/360) that if they appear to be integers, then they assemble into 2's-complement integers, while if they appear to be fractional values with absolute value less than 1.0 they instead assemble into an alternate format maximizing the significance in that numerical range. That's what happens in this example. The literal assembles to 0x01C9C380, which, amusingly, is just 60000000/2.
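The arithmetic for =FS32'60E6' can be checked with a short sketch. The function name is hypothetical, and the fractional branch ASSUMES that the alternate format stores the value multiplied by 2³¹; that assumption is inferred only from this one example and may not hold in general.

```python
from fractions import Fraction


def fixed_point_fullword(integer_value: int, scale: int) -> int:
    """Assemble a fullword fixed-point value with scale modifier Sn (sketch)."""
    value = Fraction(integer_value, 2 ** scale)  # shift right by `scale` bits
    if value.denominator == 1:
        # Looks like an integer: plain 2's-complement fullword.
        return int(value) & 0xFFFFFFFF
    if abs(value) < 1:
        # ASSUMPTION: fractional values are stored as value * 2**31.
        return int(value * 2 ** 31) & 0xFFFFFFFF
    raise ValueError("mixed integer/fraction values are not handled here")


scaled = Fraction(60 * 10 ** 6, 2 ** 32)  # the value of =FS32'60E6'
```

Here float(scaled) is approximately 0.013969839, and fixed_point_fullword(60000000, 32) gives 0x01C9C380, i.e. 30000000.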

The =Z(...) format in AP-101S is unrelated to the =Z'...' ("zoned decimal format") of System/360, which is not supported by ASM101S and is best forgotten entirely. The =Z(...) of AP-101S has three parameters, thusly:

=Z(ARG1,ARG2,ARG3)

Naturally, as you may expect, =Z(...) is entirely undocumented. It is used in precisely one place, namely in the FCMNINIT module of each available version of Shuttle flight software. In flight software OI-30.17, that usage looks like so in the assembly listing produced by the original assembler:

                                 434          EXTRN FPMXQETB,FPMXQELE,FCMALLFS                               026120BQ
					      .
					      .
					      .
 000DE EAF3 0000      0000       438          LA    R2,FPMXQETB         DESTINATION - START OF XQE TABLE     026140BQ
 000E0 E2FB 001E      001E       439          IAL   R2,FPMXQECT*2       30 HWS TO INIT - (15 ENTRIES)        026145BQ
 000E2 1BF7 00EE      01D2 00EE  440          L     R3,=Z(,FPMXQETB+2,0)  SOURCE ZCON ADDRESS                026150BQ
					      .
					      .
					      .
 001D2                           734          LTORG                                                          050300BG
 001D2 00020000                  735                =Z(,FPMXQETB+2,0)

Since the symbol FPMXQETB is declared as an EXTRN, its address and other characteristics are unknown at assembly time, though in fact FPMXQETB is the start of something called the "XQE table", aligned at a fullword address, a fact presumably known to the human programmer if not to the assembler program.

I think that =Z(...) is used to form "ZCON" addresses. The ZCON compiler option for the HAL/S compiler (HAL/S-FC), in the words of the "HAL/S-FC User's Manual", has the effect of "[causing] calls to out-of-line routines (external references) to be performed via long indirect address constants". It appears to me that the term "ZCON" probably stands for "Z constant", as a contrast to "YCON" for constants formed via =Y(...), which are just 16-bit displacements. The distinction is that LOCAL data (in the HAL/S sense) can be accessed by efficient YCON addressing, whereas REMOTE data (again in the HAL/S sense) is accessed via less-efficient ZCON addressing.

It further appears to me (see Figure 2-14 of the AP-101S POO) that a 16-bit base address fills the top halfword of such an indirect address constant, while an 11-bit displacement (padded with 0's to 16 bits) fills the bottom halfword. My best guess is that =Z(ARG1,ARG2,ARG3) is perhaps better represented as =Z(TBD,BASE,DISPLACEMENT).

Thus in the assembly-listing excerpts shown above, FPMXQETB+2 is the BASE address, though the external symbol is simply assembled as having address 0, since its address is unknown to the assembler (later to be fixed up by the linker), causing FPMXQETB+2 to be assembled as just 2. Similarly, the DISPLACEMENT is 0.
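Under the guess that the literal is really =Z(TBD,BASE,DISPLACEMENT), the packing into a fullword can be sketched as below. The function name is hypothetical, the first argument is ignored because its meaning is unknown, and the 11-bit mask on the displacement is an assumption drawn from the description above.

```python
def zcon_word(base: int, displacement: int) -> int:
    """Pack a ZCON: 16-bit base in the top halfword, displacement in the bottom."""
    top = base & 0xFFFF
    bottom = displacement & 0x07FF  # ASSUMPTION: 11 bits, zero-padded to 16
    return (top << 16) | bottom


# An EXTRN symbol is assembled as address 0, so FPMXQETB+2 becomes just 2.
assembled = zcon_word(0 + 2, 0)
```

The result, formatted as 8 hexadecimal digits, is 00020000, which agrees with the literal pool entry in the OI-30.17 listing.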

Aside: I don't claim that all of the mysteries of =Z(...) have been solved by the meandering inferences above. They do explain, more or less, how the assembler should turn =Z(,FPMXQETB+2,0) into 0002 0000. But what happens after the object code leaves the assembler and is processed by the linker? By process of elimination, the assembly-listing excerpts above came from flight software version OI-30.17, because those are the only contemporary assembly listings available to me. But the associated source code is identical in flight software version OI-34.06, for which I have no contemporary assembly listing, but for which I do have a contemporary disassembly of the linked object code, thus giving us some clues about how the linker treats these =Z(...) literals. What we find in the linked OI-34.06 is that FPMXQETB ends up at address 008B6A, and that the =Z(,FPMXQETB+2,0) constant itself becomes 8B6C 0001. The upper halfword is precisely what we would have expected! But what about the lower halfword? Why has the linker turned 0000 into 0001? That's a mystery to me. In case you're interested, here's what the relevant portions of that contemporary disassembly of the FCMNINIT module in OI-34.06 look like:
 008B6A         FCMCBLKS+0A50                           FPMXQETB DS     0F
				    .
				    .
				    .
 018A52-018A53  FCMNINIT+00DE       EAF3  8B6A  008B6A           LA     R2,X'8B6A'             FPMXQETB
 018A54-018A55  FCMNINIT+00E0       E2FB  001E                   IAL    R2,X'001E'
 018A56-018A57  FCMNINIT+00E2       1BF7  00F6  018B4E           L      R3,X'00F6'             =Z''
				    .
				    .
				    . 
 018B4E-018B4F  FCMNINIT+01DA       8B6C  0001                   DC     Z

The "=E" (single-precision floating-point) and "=D" (double-precision floating-point) datatypes also require a few additional words of explanation. While literals (or constants) of these types are provided in AP-101S assembly-language source code in the usual decimal notation, they are encoded into object code by the assembler in "IBM hexadecimal floating-point" format ("IBM hex"). Regarding this conversion of decimal to IBM hex, the System/360 assembly-language manual has this to say (emphasis mine):

"The number is converted to a binary number, and scaling is performed if specified. The binary number is then rounded and assembled into the proper field, according to the specified or implied length. The resulting number will not differ from the exact value by more than one in the last place."

In other words, the conversion algorithm used by the original assembler was not necessarily exact to the apparent full precision. Consequently, object code containing floating-point constants as produced by ASM101S cannot be guaranteed to byte-for-byte match the floating-point constants generated by the original assembler. This potential for inexact conversions is exacerbated by the fact that the native floating-point precision in the Python 3 language in which ASM101S is written does not precisely match the native floating-point precision of System/360; Python 3 is more accurate than System/360 in some cases, and less accurate in others, I believe. ASM101S works around this latter problem by performing the floating-point operations for such conversions at a higher precision than normal Python floating-point operations, namely 20 significant digits rather than 16. Empirically, the conversions by ASM101S match the conversions provided by the original assembler in all known cases, but there's no guarantee for presently-unknown cases. Nor is the range of the two floating-point formats (in terms of powers-of-ten exponentiation) identical; ASM101S makes no attempt to work around this range mismatch, because there are no offending examples in existing legacy source code.
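To make the conversion concrete, here is a minimal sketch of a decimal-to-IBM-hex conversion carried out at 20 significant digits using Python's decimal module. It is not the ASM101S algorithm and certainly not the original assembler's; the function name and the round-half-up rule are assumptions, and exponent range is not checked. The format itself is the standard one: a sign bit, a 7-bit excess-64 power-of-16 exponent, and a fraction of 6 (single) or 14 (double) hexadecimal digits.

```python
from decimal import Decimal, ROUND_HALF_UP, localcontext


def decimal_to_ibm_hex(text: str, fraction_digits: int = 6) -> int:
    """Convert a decimal string to IBM hex float (6 digits: =E, 14: =D)."""
    with localcontext() as ctx:
        ctx.prec = 20  # working precision of 20 significant digits
        value = Decimal(text)
        if value == 0:
            return 0
        sign = 1 if value < 0 else 0
        magnitude = abs(value)
        exponent = 0
        # Normalize so that 1/16 <= magnitude < 1.
        while magnitude >= 1:
            magnitude /= 16
            exponent += 1
        while magnitude < Decimal(1) / 16:
            magnitude *= 16
            exponent -= 1
        scaled = magnitude * 16 ** fraction_digits
        # ASSUMPTION: round half up; the original rounding rule is unknown.
        fraction = int(scaled.to_integral_value(rounding=ROUND_HALF_UP))
        if fraction == 16 ** fraction_digits:  # rounding carried past the top digit
            fraction //= 16
            exponent += 1
    bits = 4 * fraction_digits
    return (sign << (bits + 7)) | ((exponent + 64) << bits) | fraction
```

For instance, decimal_to_ibm_hex("1") gives 0x41100000, and decimal_to_ibm_hex("3.14159265") gives 0x413243F7. A different rounding rule could change the last hexadecimal digit in borderline cases, which is exactly the kind of discrepancy discussed above.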

Aside: Expressed in different terms, ASM101S (presumably!) does not use exactly the same algorithm as the original assembler to perform the conversion from decimal strings to IBM hex. That's because I don't presently know what that original algorithm was. Thus ASM101S simply uses a conversion algorithm of my own concoction. Perhaps at some point the original algorithm may be deduced or recovered somehow. If that happens, then it can be incorporated into ASM101S, in which case none of the concerns I've been describing will obtain.

I'd also note that while we do not have the source code of the original AP-101S assembler, the System/360 assembler presumably used the same conversion algorithm, and probably even the same source code to implement the conversion algorithm. I'm told that the source code for one version of the System/360 assembler is online, though I don't provide it in the Virtual AGC library due (in what's probably an excess of paranoia) to copyright concerns. Someone sufficiently immersed in System/360 assembly language might be able to abstract the conversion algorithm from examination of that source code. I looked at the System/360 source code myself and concluded, alas, that I am not such a person. If you want to try it yourself and give me the algorithm in easily-understood pseudo-code (or better yet, in Python 3), the System/360 assembler's source code is the file AS037F1.TXT, supposedly present somewhere in the archive at this hyperlink, though I've been unable to find it there myself. Good luck!

We might be tempted to disregard some potential precision error in the 32nd or 64th bit as being far too small to matter. It matters because it affects validation of the assembler in the following way: ASM101S has the capability of performing byte-for-byte comparisons of new assemblies vs legacy assemblies (where available), and thus discrepancies in even the least-significant bit could be flagged as errors. Our (or at least my) criterion for accepting ASM101S as valid is that these automatic comparisons detect no error. Even if we are prepared to accept such discrepancies as being all right, they nevertheless defeat automated validation.

By the way, regarding the just-mentioned automated comparison of generated code, it should also be noted that the assembly listings only display the first 16 hexadecimal digits of binary data generated by DC pseudo-ops, whereas (of course!) the entire value is available to ASM101S. Consequently, even if the assembler reports a 100%-successful automated comparison, generated data not printed in the assembly listings has not been checked and may not match.

SRS-Type Instructions Versus RS-Type Instructions

AP-101S instructions are of 5 basic types, designated (by IBM) as RR, RS, SRS, RI, and SI, based on the syntax patterns of their operands and on the ways they are encoded as machine instructions. Some of these are System/360 patterns, and some are not. I won't bore you with the details, as you can read about them in the AP-101S Principles of Operation. However, there is a certain difficulty with SRS- and RS-type instructions — as well as ambiguities between two different flavors of RS-type instructions, namely "extended" (AM=0) vs "indexed" (AM=1) instructions — that could in principle cause a mismatch between object code generated by ASM101S vs the original AP-101S assembler, though hopefully not any behavioral difference at runtime other than slight timing discrepancies. This group of instructions includes, among other things, all conditional-branch instructions and their aliases.

The greatest difficulty relates to the fact that certain instruction mnemonics are used both for SRS-type instructions and RS-type instructions. Moreover, while some of the operand patterns for them are accepted for SRS instructions and not RS instructions, thus allowing the assembler to distinguish between them, some of the operand patterns nevertheless overlap. In case of overlap, there is no syntactic way for the assembler to distinguish between the SRS instruction and the RS instruction. Overlap occurs for the following syntactical patterns (where R1, D2, and B2 refer to the names of fields in the encoded machine instruction):

OPCODE	R1,D2
OPCODE	R1,D2(B2)

The vulnerable opcode mnemonics are:

A AE AH BC C CH D DE IAL L LA LE LH M ME MH N O S SE SH SHW ST STH TD TH X ZH

While the SRS-type and RS-type instructions are (almost) behaviorally identical, they are encoded differently as machine instructions, and in particular require different amounts of memory to do so. SRS-type instructions are encoded as half-words (2 bytes), while RS-type instructions are encoded as full words (4 bytes). For example, there is no syntactical way to know whether to encode the load-instruction "L 4,SWITCH" as 2 bytes or as 4 bytes. So if ASM101S were to encode an instruction as SRS while the original assembler were to encode it as RS, or vice-versa, then not only would the binary forms of those particular instructions differ, but all of the code following that instruction in the same control section would be aligned differently.

Aside: I say that the SRS-type instructions and RS-type instructions are "almost" behaviorally identical. Figure 2-2 of the AP-101S Principles of Operation tells us about one difference. When an SRS-type instruction specifies a base register (B2) equal to 3, it means to really use general-register 3, as one would expect. Whereas when an RS-type instruction specifies base register 3, it means instead to use "no" base register.

What does it mean to use "no" base register? Several pages later, we're told (with B2 expressed in binary, so that 11 means 3) that "When B2 equals 11, base addressing is not performed. In this case, the displacement is instead used directly as the effective address." In other words, the displacement is the number of halfwords, within the same CSECT, from the instruction following the RS instruction to the target location.

As I said, there is no syntactic way for the assembler to distinguish between these cases, but there is a non-syntactic way based on the size of the D2 sub-operand. If D2 is in the numerical range 0-55, then the SRS instruction could potentially be used, while if D2 is 56 or greater, the RS instruction must be used. Unfortunately, determining these displacements between two locations is quite tricky, because there might be some of these SRS/RS instructions with ambiguous sizes in between. In fact, for forward references, there is guaranteed to be at least one such intervening instruction. In other words, we don't know the sizes of the displacements until we know the sizes of the intervening instructions, and we don't know the sizes of the intervening instructions until we know the sizes of all of the displacements.
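One common way to break this kind of circular dependency is iterative "relaxation": assume every ambiguous instruction is short, compute the displacements, lengthen any instruction whose displacement does not fit, and repeat until nothing changes. The sketch below illustrates that general technique only; it is not a description of what ASM101S or the original assembler does. The program model is hypothetical: each entry is (fixed_size, target), sizes are in halfwords, target is the index of the destination entry (or None for an unambiguous instruction), and displacements are measured from the instruction following the ambiguous one.

```python
SRS_MAX_DISPLACEMENT = 55  # D2 range 0-55 permits the short (SRS) form


def choose_sizes(program):
    """Return the size in halfwords of each entry: 1 = SRS, 2 = RS."""
    # Optimistic start: every ambiguous instruction is short.
    sizes = [fixed if target is None else 1 for fixed, target in program]
    changed = True
    while changed:
        changed = False
        # Recompute the address of every entry from the current sizes.
        addresses = [0]
        for size in sizes:
            addresses.append(addresses[-1] + size)
        for i, (_, target) in enumerate(program):
            if target is None or sizes[i] == 2:
                continue
            displacement = addresses[target] - addresses[i + 1]
            if not 0 <= displacement <= SRS_MAX_DISPLACEMENT:
                sizes[i] = 2  # must use the long (RS) form
                changed = True
    return sizes
```

Because instructions only ever grow from short to long, the loop is guaranteed to terminate, though the result is not necessarily the set of choices the original assembler would have made.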

As for the ambiguity between RS-type "extended" instructions and "indexed" instructions, this is a lesser problem than the SRS-vs-RS problem, because both extended and indexed varieties of the instruction assemble to a fullword, and thus the use of the wrong variation of the instruction results only in a mismatch at that exact memory location rather than a general displacement of all the succeeding memory locations. As with the SRS-vs-RS instructions, there are cases in which the variation in syntax does allow distinguishing the two cases. Roughly speaking, the syntax of extended vs indexed RS-type instructions is

OPCODE		R1,D2(B2)
OPCODE[@][#]	R1,D2(X2,B2)

but since the @, #, and X2 are all optional syntactically, in their absence there's no way to distinguish between extended vs indexed instructions other than the numerical range allowed for the displacement D2. As well, the latter (indexed) of the two cases allows the syntax

OPCODE		R1,D2(X2)

for an even greater potential for confusion. Yay! Fortunately, allowed base registers (B2) for AP-101S are only the CPU general registers 0-3, so if any of general registers 4-7 appears in such an instruction, it's clear that index register X2 is meant rather than base register B2.

In lieu of any foolproof algorithm for resolving these ambiguities, as well as lacking knowledge of whatever algorithm the original AP-101S assembler used, ASM101S cannot be guaranteed to produce the same choices of SRS instructions versus RS instructions made by the original assembler. All I can say is that I've tried to craft ASM101S in such a way as to match when assembling existing Shuttle flight software.

For a clearer explanation, perhaps it's best to refer to the rules of the card game known as Fizzbin, which was developed at roughly the same time.

Uninitialized Memory

Although unstated in the original documentation, I believe that the original AP-101S assembler filled uninitialized memory with the halfword pattern 0xC9FB. This happens to be the first halfword of the AP-101S instruction SVC, which as the AP-101S POO explains (p. 9-16), "causes an interruption and a program status word switch". (The second halfword is the operand for SVC.) Whether the consistent use of this particular value is significant or merely a coincidence, I can't say; perhaps the intention was to use this to trap unintentional execution from uninitialized memory.

The reason I suspect that this is the case is that in surviving legacy assembly reports, 0xC9FB is inevitably at locations where there's a gap due to forcing alignment of data to particular boundaries, such as aligning fullword constants to fullword addresses. For example, consider this excerpt from the assembly report for the ACOS module:

					     .
					     .
					     .
0005D 58E0                      182 RET0     SER   F0,F0                                                    00010504
0005E DF4E           004C 0013  183          B     EXIT                                                     00010600
                                184 *                                                                       00010700
0005F C9FB
00060                           185          DS    0F                                                       00010800
00060 413243F7                  186 PI       DC    X'413243F7'    PI                                        00010900
00062 411921FB                  187 PIOV2    DC    X'411921FB'    PI/2                                      00011000
					     .
					     .
					     .

Here, the instruction B EXIT at halfword address 0x0005E is immediately followed by the pseudo-op DS 0F, whose purpose is to align to the next fullword address boundary (i.e., the next even halfword address). In the absence of realignment, though, the next halfword address after the branch instruction would have been 0x0005F (which is an odd halfword address). Therefore, the assembler inserts a halfword at 0x0005F, so that DS can appear instead at 0x00060 (which is an even halfword address). The value of that halfword is 0xC9FB.
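The padding behavior inferred above can be sketched in a few lines of Python. This is only an illustration of the inference, not code taken from ASM101S; the names `FILL_HALFWORD` and `align` are hypothetical, and addresses are counted in halfwords as in the listing.

```python
FILL_HALFWORD = 0xC9FB  # fill value inferred from legacy assembly reports


def align(location, boundary_halfwords):
    """Advance a halfword location counter to the next multiple of
    boundary_halfwords, returning (new_location, list_of_fill_halfwords)."""
    padding = []
    while location % boundary_halfwords != 0:
        padding.append(FILL_HALFWORD)
        location += 1
    return location, padding


# DS 0F after the branch at 0x0005E: the location counter is at 0x0005F.
location, padding = align(0x0005F, 2)
print(hex(location), [hex(h) for h in padding])  # 0x60 ['0xc9fb']
```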

@# Syntax for Second Operands

AP-101S instructions of type RS (such as LH, STH, etc.) accept an operand syntax of which the most-general form is:

OPCODE	R1,D2(X2,B2)

where R1 represents a general-purpose register designated as "operand #1", D2 represents a "displacement" (which may take the form of a number or a program label), X2 represents a general-purpose register (designated as the "index register"), and B2 represents yet another general-purpose register (designated as the "base register").
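As a rough illustration of how the elements of this most-general form can be pulled apart, here is a small regular-expression sketch in Python. It is a simplified stand-in only: ASM101S itself parses operands with TatSu grammar rules, and the function name `parse_rs_operand`, the returned dictionary layout, and the assumption that D2 contains no parentheses or blanks are all hypothetical. Note that it deliberately returns the parenthesized registers as a plain list, without deciding which one is X2 and which is B2.

```python
import re

# R1 , D2 [ ( register [ , register ] ) ] [ blanks comment ]
_RS_OPERAND = re.compile(
    r"^(?P<r1>[^,\s]+),(?P<d2>[^(\s]+)"
    r"(?:\((?P<regs>[^)]*)\))?"
    r"(?:\s+(?P<comment>.*))?$"
)


def parse_rs_operand(field):
    """Split an RS-type operand field into R1, D2, the parenthesized
    registers (0, 1, or 2 of them), and any trailing comment.
    Returns None if the field does not have the expected shape."""
    match = _RS_OPERAND.match(field.strip())
    if match is None:
        return None
    regs = match.group("regs")
    registers = [r.strip() for r in regs.split(",")] if regs is not None else []
    return {
        "r1": match.group("r1"),
        "d2": match.group("d2"),
        "registers": registers,
        "comment": match.group("comment") or "",
    }


print(parse_rs_operand("R1,D2(X2,B2)"))
print(parse_rs_operand("3,TABLE(5)   LOAD ENTRY"))
```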

That said, there is a bewildering number of exceptions, restrictions, and rules for interpreting these different elements. Additionally, each RS-type instruction can be assembled into two separate forms of machine instructions, namely the so-called "extended" and "indexed" forms, as well as sometimes into a third form, the so-called SRS form discussed a couple of sections ago. It's not my intention to explain all of these details (I don't understand quite a few of them myself!), and most of them are explained in the AP-101S POO anyway.

But by saying that most of the syntactical variations are explained by the POO, at least in spirit, I'm also implying that some of them are not ... and I'd like to supply those missing explanations that pertain to the operation of the assembler. In particular, most of the four syntactical elements mentioned above may be omitted under various circumstances.

Before getting to that, another thing you need to know is that many of these RS-type instructions have related instructions in which the characters "@", "#", or "@#" are suffixed to the mnemonic. For example, I mentioned above that LH and STH are RS-type instructions, but so too are LH@, LH#, LH@#, STH@, STH#, and STH@#. The AP-101S POO explains these variations as follows:

OPCODE (without @ or #) specifies "direct addressing without autoindexing".
OPCODE@ specifies "indirect addressing".
OPCODE# specifies direct addressing with autoindexing.
OPCODE@# specifies indirect addressing with autoindexing.
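The four variations can be summarized as two independent flags carried by the mnemonic suffix. The following Python sketch shows one way to decode them; the function name `decode_mnemonic` and the returned dictionary are hypothetical and are not taken from ASM101S.

```python
def decode_mnemonic(mnemonic):
    """Split a mnemonic such as 'LH@#' into its base mnemonic and the
    addressing flags implied by the '@' and '#' suffixes."""
    base = mnemonic.rstrip("@#")
    suffix = mnemonic[len(base):]
    if suffix not in ("", "@", "#", "@#"):
        raise ValueError("unrecognized suffix: " + repr(suffix))
    return {
        "base": base,
        "indirect": "@" in suffix,   # '@' selects indirect addressing
        "autoindex": "#" in suffix,  # '#' selects autoindexing
    }


for name in ("LH", "LH@", "LH#", "LH@#"):
    print(name, decode_mnemonic(name))
```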

The omissions of syntactical elements which I'm concerned about here are those of the X2 or B2 elements, resulting in operands like

OPCODE 	R1,D2(B2)
OPCODE@	R1,D2(X2)

Insofar as assembly is concerned, the question that arises is this: If you have an operand syntax like R1,D2(register), is register supposed to be an index register, or is it supposed to be a base register? My inference is that it is supposed to be a base register if there is no @ or # suffix on the mnemonic, but it is supposed to be an index register for the @/#/@# forms of the mnemonic. In the case where register is a base register and no index register is specified, the "extended" form of machine instruction is generated (and it requires no index register); whereas in the case where register is an index register and no base register is specified, the "indexed" form of machine instruction is generated and the base register defaults to general-purpose register 0.
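The inference just described can be written down as a small decision function. The Python sketch below is an assumption-laden illustration, not the actual ASM101S logic: the name `classify_single_register` and the result dictionary are hypothetical, and it additionally folds in the observation made earlier on this page that only general registers 0-3 are allowed as base registers, so that registers 4-7 are always treated as index registers.

```python
def classify_single_register(mnemonic, register):
    """Decide how the lone register in 'OPCODE R1,D2(register)' is used.

    register is a general-register number, 0 through 7.
    """
    if not 0 <= register <= 7:
        raise ValueError("general registers are numbered 0-7")
    has_suffix = mnemonic.endswith(("@", "#"))
    if has_suffix or register >= 4:
        # Index register; the base register defaults to general register 0.
        return {"form": "indexed", "x2": register, "b2": 0}
    # Base register; the extended form needs no index register.
    return {"form": "extended", "x2": None, "b2": register}


print(classify_single_register("LH", 2))   # treated as a base register
print(classify_single_register("LH@", 2))  # treated as an index register
print(classify_single_register("LH", 5))   # 4-7 cannot be a base register
```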

Fixed-Point Data

By "fixed-point" data, I'm referring to data specified in operands of DC pseudo-ops, à la

DC    F'12.345'		(fullword fixed-point)
DC    H'-6.12'		(halfword fixed-point)

Here we have a case in which the usage in AP-101S assembly language, per the AP-101S POO, is entirely at odds with the usage in System/360 BAL, per the System/360 assembly-language manual. In System/360, the non-integer portions of these specifications are discarded (although optional "exponent factors" and "scale factors" can be used to apply binary or decimal shifts to the data before doing so, thus preserving as much significance in the fractional part as may be desired).

For AP-101S, on the other hand, there are two very different cases. First, if the data does not contain a decimal point or exponent, then what is generated for it in memory is simply the 4-byte or 2-byte 2's-complement representation of the integer value.

Second, if the data does contain a decimal point or exponent, then the full value is entirely fractional: i.e., >-1.0 and <1.0. The constant is assembled to a binary value basically by multiplying by 2³¹ (in the case of F'...') or by 2¹⁵ (in the case of H'...'), discarding the fractional portion, and representing the remaining integer in 2's-complement form.

For example,

DC	H'0.625'

generates 0.625×2¹⁵ = 20480 = 0x5000.

It's TBD what should happen if there is a non-zero integer portion, but ASM101S simply caps the generated value to the boundaries of the representable range, which essentially means that any integer portion is transparently dropped.
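To make the two cases concrete, here is a minimal Python sketch of the conversion as described above. It is not the ASM101S implementation: the function name `dc_fixed` is hypothetical, "discarding the fractional portion" is assumed to mean truncation toward zero, the capping limits are assumed to be the usual 2's-complement bounds, and modifiers such as scale factors are ignored entirely.

```python
from decimal import Decimal


def dc_fixed(text, bits):
    """Convert the text inside F'...' (bits=32) or H'...' (bits=16)
    into an unsigned integer holding the 2's-complement bit pattern."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if "." in text or "E" in text.upper():
        # Fractional case: scale by 2**(bits-1) and drop the fraction.
        value = int(Decimal(text) * (1 << (bits - 1)))
    else:
        # Integer case: the value is used as-is.
        value = int(text)
    value = max(low, min(high, value))  # cap to the representable range
    return value & ((1 << bits) - 1)    # 2's-complement bit pattern


print(hex(dc_fixed("0.625", 16)))  # 0x5000
print(hex(dc_fixed("0.5", 32)))    # 0x40000000
print(hex(dc_fixed("-1", 16)))     # 0xffff
```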

"Linking" By the Assembler Itself

The AP-101S assembler performed a kind of partial linking of the object code, prior to any linking of separately-compiled modules by the AP-101 link editor. See the discussion on the HALLINK101S page.

Structure of ASM101S, for Maintainers

This section won't be of interest to anybody who merely wishes to use ASM101S. But given that I haven't necessarily been able so far to provide support in ASM101S for every feature of AP-101 assembly language that might be discovered if additional legacy code becomes available in the future, ASM101S may need to be maintained in the future. So it may be worthwhile to provide at least a few notes on how ASM101S is structured internally, in order to facilitate that possible maintenance.

The Python 3 source code for ASM101S is kept in the ASM101S/ folder of the Virtual AGC software tree, and the top-level source-code file is itself called ASM101S.py. There are also a number of additional Python files in that folder which are imported as modules into ASM101S.py or into each other. I should note that there are other Python files in that folder that I find useful, but that are not used as modules in ASM101S.

Assembly proceeds in a sequence of "passes", some of which are designated as passes in the source code, and some of which are not. Here's a brief runthrough of the passes:

"Acquisition pass": In this pass, all AP-101 assembly-language source code for the selected input files is read from files into a Python list called source; there is an entry in source for each line of code or macro definition encountered; those entries are Python dictionaries containing various information about the lines, though some information is added to those dictionaries in later passes. The acquisition pass includes reading the entire macro library, handling all file-inclusions (i.e., all COPY pseudo-ops), and resolving all assembly-language macro operations. For example, all macro invocations are expanded, all conditional assembly is no longer conditional, and so on. This means that at the end of the acquisition pass, no more symbols of the form &SOMETHING remain, nor any "sequence symbols" (i.e., labels of the form .SOMETHING). In AP-101S (or System/360) assembly language, it is a difficult technical feat to deal with lists of macro parameters (both for macro definitions and macro invocations) that are split across continuation cards; the acquisition pass handles this task using TatSu (see "PASS 0" below), joining all continuations in the "operand" field (see below), so that later passes basically ignore all continuation cards. The acquisition pass is integrated into the way ASM101S.py parses command-line options, and occurs whenever an option like --library=SOMETHING or the name of a source-code file is encountered on the command-line. It is not marked in any way as being a "pass". Some of the important fields in the source entries created in this pass for use by later passes are:

"text" — the unmodified text of the original line.
"name" — the label field of the line.
"operator" — the operator field of the line.
"operand" — the operand+comment fields of the line.
"file" — the name of the source-code file containing the line.
"lineNumber" — the line number (starting at 1) the line within its source-code file.
"n" — the index within source.
"empty" — boolean for completely-blank lines
"fullComment" — boolean for full-line comments ("*" in column 1)
"dotComment" — boolean for dot-comments (".*" in columns 1,2)
"macro" — name of the including macro definition, if any
"continues" — boolean for a non-blank continuation field (column 71)
"identification" — contents of the "ident" field (columns 72-80)
"errors" — array of assembler-generated error/warning messages for the line.
"inMacroDefinition" — boolean for the line being part of a macro definition.
"copy" — boolean for the line being present due to a COPY pseudo-op
"printable" — boolean for whether to include the line in the output assembly report.
"depth" — depth of the line within nested macro expansions. (0 means outside of any expansion.)

"PASS 0": (Note: PASS 0, and all remaining passes unless stated otherwise, are implemented within the AP-101 code generator, of which the top-level file is the Python module model101.py.) Lines of assembly-language code are nominally divided into 4 fields: label, operator, operand, comment. It is difficult to separate the comment field from the operand field in an easy, systematic way, so both are kept together in the "operand" field in the source array. Operand fields can be quite complex, to the point where it is easiest to parse them by defining Backus-Naur Form (BNF) rules for allowable operand formats, with different rules applying for different types of operator fields. PASS 0 has the task of using these BNF rules, on an operator-by-operator basis, for parsing the operand fields and removing any add-on comments. The parsed operands are stored in the source entry in a new field, called "ast". There is thus no need for any later passes to perform additional parsing on the operand fields. The custom BNF rules are provided in the Python module fieldParser.py. These custom rules are compiled and made usable by the generally-available Python module called TatSu, which must be installed for ASM101S to function.
"PASS 1": AP-101 assembly language contains an unfortunate ambiguity, in that a number of instruction mnemonics are used in common for several distinct underlying machine instructions. Some of these underlying machine instructions (of type "SRS") are a halfword in length and some (of type "RS") are a fullword in length. Thus it is often not possible to tell merely from the instruction mnemonic how many halfwords an instruction occupies, which in turn makes it impossible in a single pass to reliably determine the addresses or values of all symbols. The purpose of PASS 1 is to make a preliminary determination of all symbol addresses, using the assumption that all ambiguous instructions occupy a fullword.
Optimization pass: This pass resolves the ambiguities in instruction sizes left behind by PASS 1. I.e., in some cases it can determine that an instruction tentatively assigned a fullword length in PASS 1 can instead be replaced by a halfword instruction. This process does not necessarily catch all possible reductions; its intent is to behave identically to the original assembler rather than to produce optimally-small code. Since the algorithm used by the original assembler is unknown, there is no guarantee that the algorithm of the optimization pass is 100% correct. To a certain extent, the optimization pass also corrects affected symbol addresses as it makes these reductions; however, it should not be assumed that all symbols are correct after the pass, particularly some symbol values that are expressions containing symbols which themselves have been corrected.
"PASS 2", "PASS 3", ...: These passes have the responsibility of making the final determination of all symbol values. They are all identical algorithmically, and are iterated until no symbol value changes, though PASS 3 is always performed even if there are no symbol changes found in PASS 2. For PASS 3 and beyond, AP-101 object code is generated, or regenerated as the case may be.
Print pass: This pass produces an assembly-listing and object-code file. It is implemented in ASM101S.py.
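The control flow of the "PASS 2", "PASS 3", ... iteration described above can be sketched as follows. This is a hedged illustration of the loop structure only, not code from model101.py: the names `run_final_passes` and `toy_pass`, the `max_passes` safety limit, and the representation of the symbol table as a plain dictionary are all assumptions.

```python
def run_final_passes(one_pass, symbols, max_passes=20):
    """Repeat one_pass (symbol table in, new symbol table out) starting
    with PASS 2, until no symbol value changes. PASS 3 is always run.
    Returns (final_symbols, number_of_the_last_pass_executed)."""
    pass_number = 2
    while True:
        new_symbols = one_pass(symbols)
        changed = new_symbols != symbols
        symbols = new_symbols
        if not changed and pass_number >= 3:
            return symbols, pass_number
        if pass_number >= max_passes:
            raise RuntimeError("symbol values did not converge")
        pass_number += 1


def toy_pass(symbols):
    """Toy stand-in for a real pass: B is defined as A + 1, but is
    computed from the value A had at the start of the pass."""
    return {"A": 10, "B": symbols["A"] + 1}


print(run_final_passes(toy_pass, {"A": 0, "B": 0}))    # ({'A': 10, 'B': 11}, 4)
print(run_final_passes(toy_pass, {"A": 10, "B": 11}))  # ({'A': 10, 'B': 11}, 3)
```

The second call shows the special case mentioned above: even though nothing changes in PASS 2, PASS 3 is still executed.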

Some relevant Python modules of interest:

fieldParser.py: As was already mentioned, contains BNF rules for parsing assembly-language operands. As was not mentioned, it also provides various functions for actually using those rules and performing the parsing. In particular:

parserASM(): Given an operand field as a string and the name of a BNF rule, parses the operand. See also the TatSu documentation.
joinOperand(): Joins operand fields that are split across continuation cards.

model101tables.py: As the name indicates, contains numerical tables used by model101.py during AP-101S code generation.
expressions.py: Evaluates expressions. The principal functions provided are:

error(): Given a source entry and an error message, appends the error message to the list of error messages of the source-code line. (This function really has nothing to do with expressions per se, but it's ubiquitous throughout the assembler.)
evalArithmeticExpression(): Evaluates arithmetic expressions involving symbols (either macro-language symbols or symbols like program labels in the symbol table) and arithmetical operations (like +, -, *, /). Can account for symbols that are in different CSECT or which are EXTRN.
evalBooleanExpression()
evalCharacterExpression()

asciiToEbcdic.py: Contains tables for converting ASCII to EBCDIC and vice-versa.
ibmHex.py: Convert Python numbers to/from IBM hexadecimal floating-point format.
readListing.py: Reads a legacy assembly listing, for the purpose of extracting its assembled halfwords and allowing comparison to a new assembly from the same source code.
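As an illustration of the kind of work the last two modules do, the Python sketch below pulls assembled halfwords out of listing lines shaped like the ACOS excerpt shown earlier on this page, and then decodes one fullword as an IBM hexadecimal floating-point number. It does not reproduce the actual interfaces of readListing.py or ibmHex.py: the names `read_halfwords` and `ibm_hex_to_float` are hypothetical, and the assumed listing layout (a 5-digit hexadecimal halfword address, one blank, then the object code) is inferred from that single excerpt.

```python
import re

# 5 hex digits of address, then optionally one blank plus 4 or 8 hex digits.
_LISTING_LINE = re.compile(
    r"^(?P<addr>[0-9A-F]{5})(?: (?P<code>(?:[0-9A-F]{4})+))?(?:\s|$)"
)


def read_halfwords(listing_lines):
    """Return {halfword_address: halfword_value} for lines with object code."""
    memory = {}
    for line in listing_lines:
        match = _LISTING_LINE.match(line)
        if match is None or match.group("code") is None:
            continue
        address = int(match.group("addr"), 16)
        code = match.group("code")
        for i in range(0, len(code), 4):
            memory[address + i // 4] = int(code[i:i + 4], 16)
    return memory


def ibm_hex_to_float(word):
    """Decode a 32-bit IBM hexadecimal float: 1 sign bit, a 7-bit
    excess-64 exponent of 16, and a 24-bit fraction."""
    sign = -1.0 if word & 0x80000000 else 1.0
    exponent = ((word >> 24) & 0x7F) - 64
    fraction = (word & 0x00FFFFFF) / float(1 << 24)
    return sign * fraction * 16.0 ** exponent


listing = [
    "0005D 58E0                      182 RET0     SER   F0,F0",
    "0005E DF4E           004C 0013  183          B     EXIT",
    "                                184 *",
    "0005F C9FB",
    "00060                           185          DS    0F",
    "00060 413243F7                  186 PI       DC    X'413243F7'    PI",
]
memory = read_halfwords(listing)
pi_word = (memory[0x60] << 16) | memory[0x61]
print(hex(pi_word), ibm_hex_to_float(pi_word))
```

A dictionary keyed by halfword address makes it straightforward to compare a legacy listing against a new assembly of the same source, address by address.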

Afterthought: HLASM

Belatedly, someone pointed out a presentation from 2010 called "Assembler Language as a Higher Level Language: Conditional Assembly and Macro Techniques" by John R. Ehrman of IBM as a possible resource for deciphering conundrums about gaps in the System/360 assembly-language manual vs AP-101S. Admittedly, John's presentation concerns the so-called "High Level Assembler for z/OS, z/VM, and z/VSE", apparently still being sold for IBM mainframes under the name "HLASM". Not being steeped in IBM lore myself, I can only make assumptions as to how relevant HLASM may be to the AP-101S assembler. There's a lot of HLASM stuff that's definitely not available (or at least never used if it is available) in AP-101S source code, though it's unclear how much of it is flatly inconsistent with AP-101S. IBM itself provides a manual for HLASM that you can find online.

In brief, I'll say only that this presentation supports some reverse-engineering assertions I've made earlier on this page, while contradicting some facts about AP-101S code. Thus you may or may not find information from HLASM helpful.

This page is available under the Creative Commons No Rights Reserved License
Last modified by Ronald Burkey on 2024-11-29
