(This page and ASM101S itself are under construction.)

Introduction

ASM101S is our "modern" version of the macro assembler for the assembly language of the IBM AP-101S or AP-101B computers used as "General Purpose Computers" (GPC) in the Space Shuttle.  To the best of my knowledge, the original assembler actually used for Shuttle software development has not survived, so ASM101S is an entirely-new creation.  It is a Python 3 program that should work on any computer having the Python 3 language installed.

Regarding AP-101B vs AP-101S, the AP-101B was used in earlier Shuttle flights, while the AP-101S was used in later flights.  I have almost no documentation for the AP-101B, but believe that insofar as software is concerned, the main difference is that a handful of new instruction types were made available in the AP-101S that hadn't existed in the AP-101B.  Therefore, ASM101S itself makes no distinctions between the two languages:  If you feed in source code using any of the newer instructions, you'll get object code suitable for the AP-101S only.  If you feed in source code not having any of the newer instructions, you'll get object code suitable for both the AP-101S and AP-101B.
Aside:  That's not to say that the differences were insignificant, but merely that the differences are transparent to the assembler.  Here's a summary of some of the known differences:

Feature                                    AP-101B                    AP-101S
Power                                      780W                       560W
Weight                                     117 pounds                 64 pounds
Memory                                     104K words (416K bytes)    256K words (1024K bytes)
Memory protection (per 16-bit half-word)   1 parity bit and           6 ECC bits and
                                           1 store-protect bit        3 store-protect bits
Speed                                      420K operations/second     >1000K operations/second
Battery backup                             n/a                        Rechargeable NiCAD
Built-in test equipment                    n/a                        Temperature; charger; battery; soft error counter
MTBF                                       5K hours                   24K hours

Because there's little distinction between the different models in a software sense, I'll generally just refer to the "AP-101" rather than specifying "AP-101S" or "AP-101B".  Realize, though, that there were also models of the IBM AP-101 computer other than the AP-101B or AP-101S, for uses other than the Shuttle, and there's no particular reason to believe that ASM101S could assemble their source code without some updates.

Despite the lack of surviving documentation about the original AP-101 assembler, by examining the surviving AP-101 assembly-language source code I've concluded that the syntax of the AP-101 assembly language must have closely mimicked that of the standard macro assembler for the IBM System/360 Basic Assembly Language (BAL).  This observation has been central to the development of ASM101S.

References

The Master Sequence Controller (MSC)

AP-101 assembly language includes not only pseudo-ops and AP-101 CPU instructions as such, but also instructions for the Master Sequence Controller (MSC).  You might categorize the MSC as being a separate computer within the main computer.  The CPU and the MSC share memory, and the instructions for these two essentially separate types of computers are intermixed indiscriminately in AP-101 assembly-language source code.  The AP-101S Principles of Operation (POO) explains the MSC thusly:
"The Master Sequence Controller (MSC) is a micro programmed computer specifically tailored for I/O Management within the Space Shuttle General Purpose Computer (GPC).  As such, it has extensive and programmable capabilities for monitoring and controlling the basic I/O operations performed by upwards to 24 Bus Control Elements (BCE's) which are implemented in the baseline GPC.  These capabilities include setting up, scheduling, and initiating BCE programs, monitoring the status of BCD operations, and communicating overall completion of these operations to the CPU."
MSC instructions can be distinguished from CPU instructions in that they begin with the character "@".

See Appendix II of the AP-101S POO or the seemingly-identical Part II of the IOP POO for more information.

The Bus Control Elements (BCE)

Similarly, the POO tells us that

"The Bus Control Element (BCE) is a microprogrammed controller specifically tailored for management of I/O traffic on one of the Space Shuttle system busses.  Within each IOP [Input/Output Processor] there is one BCE for each system bus, for a total of 24 BCE's.  Each of these BCE's is capable of independent program execution, data buffering to and from memory, and communication with the MSC."
Or in other words, besides the MSC discussed in the preceding section, there are 24 additional processors within the AP-101, yet again with their own distinct instruction set, yet again sharing memory and intermixed in the assembly-language source code with CPU instructions and MSC instructions.

BCE instructions can be distinguished from CPU and MSC instructions in that they begin with the character "#".
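
As a minimal sketch of that prefix rule (this is not code taken from ASM101S, and the function name is hypothetical), the operation field of a statement could be sorted into the three instruction families like so.  Note that the rule applies only to the operation field: symbols in the label or operand fields may also begin with "#" without being BCE instructions, and anything without a prefix might equally be a pseudo-op or a macro invocation rather than a CPU instruction.

```python
def classify_operation(operation: str) -> str:
    """Classify the operation field of a statement by its leading character.

    Hypothetical helper for illustration only.  "CPU/other" covers CPU
    instructions, pseudo-ops, and macro invocations, which this simple
    prefix test cannot tell apart.
    """
    if operation.startswith("@"):
        return "MSC"
    if operation.startswith("#"):
        return "BCE"
    return "CPU/other"


if __name__ == "__main__":
    # "SR" is a CPU instruction seen later on this page; "@XXX" and "#XXX"
    # are made-up placeholders, not real MSC or BCE mnemonics.
    for op in ("SR", "@XXX", "#XXX"):
        print(op, "->", classify_operation(op))
```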

See Appendix III of the AP-101S POO or the seemingly-identical Part III of the IOP POO for more information.

ASM101S

(To be clear, ASM101S is not entirely functional.  I am simply documenting it as I proceed with development.  Therefore, some of what's described might not be available yet, though of course where I show specific examples of output from the assembler, it is functional enough to produce the results shown.)

The modern assembler is invoked as
ASM101S.py [OPTIONS] SOURCE1.asm SOURCE2.asm ...
One of the more-significant available OPTIONS is --library=LIBRARY, where LIBRARY is the path to a macro-library folder; macro libraries are discussed in a later section.
Since you are allowed to specify multiple assembly-language source-code files on the command line, it's possible in principle to assemble an entire program at once.  In practice, it would be more usual to assemble each source-code file separately, and then to use a linker to merge all of those object-code files into a single AP-101S load file for the program as a whole.

You can see a list (and explanations!) of all of the available command-line OPTIONS by using the command
ASM101S.py --help
The action of the assembler is roughly the following:
  1. (Optionally.)  Load the macro definitions from the assembly-language source-code files (*.asm) in the LIBRARY folder.  (More on this is in the section on macro-library folders below.)
  2. Load all of the SOURCEx.asm files specified on the command line, in the order that they are specified.  Macro definitions should precede any source-code files using those macros, but the ordering of macro files among themselves is not significant.
  3. Process the entire set of source code which has been loaded.
  4. Output an assembly report (on stdout) and if there were no fatal errors, an AP-101 object-code file that contains the results of the assembly process.
Recognize that at least as far as Space Shuttle flight software is concerned, the full flight software consists of a large number of assembly language modules, along with even more modules written in the high-level language HAL/S.  So as noted above in a slightly-different way, the object file emitted by the assembler is unlikely to be a full executable program — or load file in this particular jargon — but rather just one of many object files which need to be linked by an AP-101 linker utility before becoming an executable program.

Beyond the assembler itself, there is also a bash shell script that performs a kind of regression test on ASM101S.  (The script is useful in Linux or Mac OS, but usually not in Windows, since the bash command shell isn't commonly available in Windows without extra effort on your part.)  The test assembles legacy AP-101S assembly-language source-code files for which a contemporary assembly listing made by the original AP-101S assembler has survived, and performs a byte-by-byte comparison of ASM101S's assemblies vs the original assemblies.  It is run simply as
regressionASM101S.sh
At this writing, 205 source-code files are included in the regression test, comprising around 17K lines of source code.  Yes, that's a relatively small test, but I'm not responsible for the lack of surviving test material.  Unfortunately, ASM101S is not a fast program, so the test takes a while.  Moreover, no status messages are printed as long as the tests are succeeding, so you may find yourself confused that nothing appears to be happening.  If you run the test instead as
regressionASM101S.sh -v
it will at least display the names of the files it is checking.
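
The regression script itself is a bash script whose internals aren't reproduced on this page, but the idea of a byte-by-byte comparison can be illustrated with a short Python sketch.  The function name and the sample byte strings are hypothetical, chosen only for illustration; the sketch assumes both the expected and the newly-assembled object code are already available as byte strings.

```python
from typing import Optional


def first_mismatch(expected: bytes, actual: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical.

    If one sequence is merely a prefix of the other, the offset returned is
    the length of the shorter sequence.
    """
    for offset, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            return offset
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    return None


if __name__ == "__main__":
    reference = bytes.fromhex("0CE4B2E48000")   # sample bytes only
    candidate = bytes.fromhex("0CE4B2E48001")
    print(first_mismatch(reference, reference))  # None
    print(first_mismatch(reference, candidate))  # 5
```

Reporting the offset of the first difference, rather than a bare pass/fail, makes it easier to find the offending statement in the assembly listing.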

A Brief Tour of the Assembly Process

As a more-concrete example than the abstract description in the preceding section, consider assembly of the "ACOS" module from the AP-101S runtime library of the HAL/S compiler.  We'll learn more details about the runtime library and what's required for assembling it in the sections that follow.  For our purposes at the moment, I'll content myself with saying that if you follow the installation instructions (given later) you can perform this assembly with these steps:

cd "virtualagc/yaShuttle/Source Code/PASS.REL32V0/RUNASM"
ASM101S.py --library=../RUNMAC ACOS.asm

(Or at least, that's how you'd do it in Linux or Mac OS.  In Windows you'd have to use '\' in place of '/'.)

Here's what the AP-101S assembly-language source code for the ACOS module looks like prior to assembly:

 TITLE 'ACOS -- SINGLE PRECISION INVERSE SINE-COSINE FUNCTION'          00000100
*
* WHEN DATA_REMOTE IS IN EFFECT, THE LDM INSTRUCTIONS ARE NEEDED
* AROUND THE CALL TO THIS RTL ROUTINE BECAUSE R1/R3 IS USED AS
* A BASE REGISTER. (CR12620)
*
* ACOS: INVERSE SINE-COSINE(SINGLE) 00000200
* 00000300
* 1. INPUT AND OUTPUT VIA F0. 00000400
* 2. FOR 0<=X<=1/2, COMPUTE ASIN(X) BY A RATIONAL FUNCTION. 00000500
* 3. FOR 1/2<X<=1, USE ASIN(X)=PI/2-2*ASIN(SQRT((1-X)/2)). 00000600
* 4. FOR X<0, USE ASIN(X)=-ASIN(-X). 00000700
* 5. ACOS(X)=PI/2-ASIN(X). 00000800
* 6. ERROR GIVEN IF |X|>1. 00000900
* 00001000
*IBM CR #SS8244- RETURN PI FOR ARCCOS ARGUMENT < -1; 8/14/81 A.CLOSE 00001010
* REGISTER R4 IS USED AS A SWITCH- 00001100
* BIT 0 IS USED TO SIGNAL WHETHER THIS PROCEDUER IS 00001200
* CALLED AS ACOS(ON) OR ASIN(OFF). BIT 1 IS USED TO SIGNAL 00001300
* IF ARG. IS POS.(OFF) OR NEG.(ON).BIT 2 IS USED TO SIGNAL 00001400
* ABS. VALUE OF ARG.,IF GREATER THAN 1/2(ON) ELSE (OFF) 00001500
* 00001600
MACRO 00001700
WORKAREA 00001800
SAVE6 DS D TO SAVE REGISTERS F6,F7 00001900
SWITCH DS F TO SAVE R4 ACROSS INTRINSIC CALL 00002000
MEND 00002100
ACOS AMAIN ACALL=YES 00002200
*COMPUTES ARC-COSINE(X) OF SINGLE PRECISION SCALAR 00002300
INPUT F0 SCALAR SP 00002400
OUTPUT F0 SCALAR SP RADIANS 00002500
WORK R1,R2,R3,R4,R5,R6,R7,F1,F2,F3,F4,F6 00002600
SR R4,R4 SIGNAL ACOS ENTRY 00002700
OHI R4,X'8000' 00002800
B MERGE 00002900
ASIN AENTRY 00003000
* COMPUTES ARC-SINE(X) OF SINGLE PRECISION SCALAR 00003100
INPUT F0 SCALAR SP 00003200
OUTPUT F0 SCALAR SP RADIANS 00003300
WORK R2,R3,R4,F2,F3,F4,F6 00003400
SR R4,R4 SIGNAL ASIN ENTRY 00003500
* 00003600
MERGE STED F6,SAVE6 SAVE REGISTERS F6,F7 00003700
LER F6,F0 GET ARGUMENT IN F6 00003800
BNM POS AND TEST SIGN 00003900
OHI R4,X'4000' SIGNAL NEG. ARG. 00004000
LECR F0,F0 GET |X| 00004100
* 00004200
POS CE F0,HALF SKIP TO 'SMALL' IF 00004300
BNH SMALL |X|<=1/2 00004400
* 00004500
* Z=SQRT((1-|X|)/2) IF |X|^>|1/2 00004600
* 00004700
OHI R4,X'2000' SIGNAL |X|>1/2 00004800
LECR F0,F0 00004900
AE F0,ONE 1-|X| IN F0 00005000
BNP EXPRESS BRANCH IF |X|>=1 00005100
* 00005200
LER F6,F0 00005300
ME F6,HALF Z**2 IN F6 00005400
AER F0,F0 4*Z**2 IN F0 00005500
* 00005600
ST R4,SWITCH SAVE R4 ACROSS INTRINSIC CALL 00005700
ABAL SQRT GET 2Z IN F0 BY CALL TO 00005800
L R4,SWITCH RESTORE R4 00005900
B POLY SQRT, AND RETURN TO POLY 00006000
* 00006100
SMALL CE F0,UNFLO 00006200
BNH TESTS 00006300
MER F6,F6 X**2 IN F6 00006400
* 00006500
* COMPUTE ASIN(Z) HERE BY RATIONAL FUNCTION 00006600
* 00006700
POLY LE F4,C2 00006800
AER F4,F6 00006900
LE F2,D2 00007000
DER F2,F4 (C2+Z**2)/D2 00007100
AE F2,C1 00007200
AER F2,F6 C1+Z**2+(C2+Z**2)/D2 00007300
ME F6,D1 00007400
DER F6,F2 00007500
LE F3,ROUND ROUNDING NUMBER IN F3, 00007600
LER F2,F0 AS LOW HALF OF 2Z. 00007700
MER F0,F6 00007800
AEDR F0,F2 ROUNDING OCCURS HERE 00007900
* 00008000
* REVERSE REDUCTIONS TO GIVE ACTUAL ANSWER 00008100
* 00008200
TESTS TRB R4,X'A000' IF ACOS ENTRY 00008300
BNM TSTNEG OR |X|>1/2 BUT NOT BOTH, 00008400
LECR F0,F0 SUBTRACT RESULT 00008500
AED F0,PIOV2 FROM PI/2 00008600
* 00008700
TSTNEG TRB R4,X'4000' COMPLEMENT 00008800
BNO EXIT IF ARGUMENT WAS NEGATIVE. 00008900
LER F0,F0 CHECK FOR ARG=0(HARDWARE BUG) 00009000
BZ NOLECR 00009100
LECR F0,F0 00009200
NOLECR TRB R4,X'C000' AND ADD PI TO 00009300
BNO EXIT RESULT IF ACOS ENTRY 00009400
AE F0,PI AND ARGUMENT<0. 00009500
* 00009600
EXIT LED F6,SAVE6 RESTORE F6,F7 00009700
AEXIT AND EXIT 00009800
* 00009900
EXPRESS BZ TESTS SKIP POLYNOMIAL IF |X|=1 00010000
AERROR 10 |ARG|>1 00010100
LE F0,PIOV2 00010200
TRB R4,X'8000' ASIN CALL? 00010300
BZ TSTNEG YES, CHECK SIGN 00010400
TRB R4,X'4000' X=-1? 00010500
BZ RET0 NO,RETURN 0 00010501
LE F0,PI YES, RETURN PI 00010502
B EXIT 00010503
RET0 SER F0,F0 00010504
B EXIT 00010600
* 00010700
DS 0F 00010800
PI DC X'413243F7' PI 00010900
PIOV2 DC X'411921FB' PI/2 00011000
ROUND DC X'5FFFFFFF' 00011100
UNFLO DC X'3E100000' 16**(-3) 00011200
ONE DC X'41100000' 1.0 00011300
HALF DC X'40800000' 0.5 00011400
D1 DC X'C08143C7' -0.5049404 00011500
C1 DC X'C13B446A' -3.7042025 00011600
D2 DC X'C11406BF' -1.2516474 00011700
C2 DC X'C11DB034' -1.8555182 00011800
ACLOSE 00011900

Simple, right?  Well ... perhaps not so much!  In fact, it's more complex than it appears at first sight.  You'll notice that strewn throughout the source code are various "instructions" that I've highlighted in red to make them stand out.  These are actually invocations of "macros" (more on this below), each of which may be expanded into multiple lines of source code at assembly-time.  Or perhaps not:  The INPUT macro, for example, turns out in this case to "expand" to nothing at all.  In contrast, other macros may themselves invoke other macros, which may in turn invoke still other macros, and so on.  For example, the ACLOSE macro appearing at the bottom of the listing above invokes the macro ERRPARMS.  The macros themselves (in this example) provide the machinery needed to interface assembly-language subroutines to HAL/S code calling those subroutines, but macros can serve many other purposes in other source-code files.

Not that we're in a position to understand it fully without a lot of study, but here's what the source code for that very same macro ACLOSE looks like:

         MACRO                                                          00000100
         ACLOSE                                                         00000200
         GBLA  &ENTCNT                                                  00000300
         GBLB  &INPUT(20),&OUTPUT(20)                                   00000400
         GBLC  &NAMES(20)                                               00000500
         ERRPARMS                                                       00000600
&I       SETA  1                                                        00000700
.LOOP    AIF   (&INPUT(&I)).INOK                                        00000800
         MNOTE 1,'INPUT NOT SPECIFIED FOR &NAMES(&I)'                   00000900
.INOK    AIF   (&OUTPUT(&I)).OUTOK                                      00001000
         MNOTE 1,'OUTPUT NOT SPECIFIED FOR &NAMES(&I)'                  00001100
.OUTOK   ANOP                                                           00001200
&I       SETA  &I+1                                                     00001300
         AIF   (&I LE &ENTCNT).LOOP                                     00001400
         END                                                            00001500
         MEND                                                           00001600

I've highlighted this entirely in red because even though many macro definitions include actual AP-101S instructions and pseudo-ops, this particular macro definition consists entirely of statements in the macro language.  About the only thing here that's immediately understandable — and only because I mentioned it earlier! — is that ERRPARMS is the invocation of another macro.

Now that we recognize that the assembly listing of ACOS may not look too much like the original source code, the following is an excerpt from the assembly report produced when ASM101S assembles ACOS.  In the interest of saving a little space, I've removed such assembler-generated items as the symbol table.  At the same time, I've highlighted the macro expansions in red:

         ACOS -- SINGLE PRECISION INVERSE SINE-COSINE FUNCTION                                                 PAGE    2
LOC OBJECT CODE ADR1 ADR2 SOURCE STATEMENT ASM101S 0.00 10/20/24
2 *
3 * WHEN DATA_REMOTE IS IN EFFECT, THE LDM INSTRUCTIONS ARE NEEDED
4 * AROUND THE CALL TO THIS RTL ROUTINE BECAUSE R1/R3 IS USED AS
5 * A BASE REGISTER. (CR12620)
6 *
7 * ACOS: INVERSE SINE-COSINE(SINGLE) 00000200
8 * 00000300
9 * 1. INPUT AND OUTPUT VIA F0. 00000400
10 * 2. FOR 0<=X<=1/2, COMPUTE ASIN(X) BY A RATIONAL FUNCTION. 00000500
11 * 3. FOR 1/2<X<=1, USE ASIN(X)=PI/2-2*ASIN(SQRT((1-X)/2)). 00000600
12 * 4. FOR X<0, USE ASIN(X)=-ASIN(-X). 00000700
13 * 5. ACOS(X)=PI/2-ASIN(X). 00000800
14 * 6. ERROR GIVEN IF |X|>1. 00000900
15 * 00001000
16 *IBM CR #SS8244- RETURN PI FOR ARCCOS ARGUMENT < -1; 8/14/81 A.CLOSE 00001010
17 * REGISTER R4 IS USED AS A SWITCH- 00001100
18 * BIT 0 IS USED TO SIGNAL WHETHER THIS PROCEDUER IS 00001200
19 * CALLED AS ACOS(ON) OR ASIN(OFF). BIT 1 IS USED TO SIGNAL 00001300
20 * IF ARG. IS POS.(OFF) OR NEG.(ON).BIT 2 IS USED TO SIGNAL 00001400
21 * ABS. VALUE OF ARG.,IF GREATER THAN 1/2(ON) ELSE (OFF) 00001500
22 * 00001600
23 MACRO 00001700
24 WORKAREA 00001800
25 SAVE6 DS D TO SAVE REGISTERS F6,F7 00001900
26 SWITCH DS F TO SAVE R4 ACROSS INTRINSIC CALL 00002000
27 MEND 00002100
28 ACOS AMAIN ACALL=YES 00002200
29+*********************************************************************** 01-AMAIN
30+* 01-AMAIN
31+* PRIMARY ENTRY POINT 01-AMAIN
32+* 01-AMAIN
33+*********************************************************************** 01-AMAIN
00000 34+ACOS CSECT 01-AMAIN
00000 35+STACK DSECT 01-AMAIN
36+* DS 18H STANDARD STACK AREA DEFINITION 01-AMAIN
00000 37+ DS F PSW (LEFT HALF) 01-AMAIN
00002 38+ DS 2F R0,R1 01-AMAIN
00006 39+ARG2 DS F R2 01-AMAIN
00008 40+ DS F R3 01-AMAIN
0000A 41+ARG4 DS F R4 01-AMAIN
0000C 42+ARG5 DS F R5 01-AMAIN
0000E 43+ARG6 DS F R6 01-AMAIN
00010 44+ARG7 DS F R7 01-AMAIN
45+* END OF STANDARD STACK AREA 01-AMAIN
00012 46+SAVE6 DS D TO SAVE REGISTERS F6,F7 02-WORKAREA
00016 47+SWITCH DS F TO SAVE R4 ACROSS INTRINSIC CALL 02-WORKAREA
00018 48+STACKEND DS 0F END OF COMBINED STACK AREA 01-AMAIN
00000 49+ACOS CSECT 01-AMAIN
50+ USING STACK,0 ADDRESS STACK AREA 01-AMAIN
00000 E0FB 0018 0018 51+ IAL 0,STACKEND-STACK SET STACK SIZE 01-AMAIN
00002 B624 0000 0009 0000 52+ NIST 9(0),0 CLEAR ON ERROR INFO (LCL DATA PTR) 01-AMAIN

54 *COMPUTES ARC-COSINE(X) OF SINGLE PRECISION SCALAR 00002300
55 INPUT F0 00002400
56+F0 EQU 0 01-INPUT
58 OUTPUT F0 00002500
60 WORK R1,R2,R3,R4,R5,R6,R7,F1,F2,F3,F4,F6 00002600
61+R1 EQU 1 01-WORK
62+R2 EQU 2 01-WORK
63+R3 EQU 3 01-WORK
64+R4 EQU 4 01-WORK
65+R5 EQU 5 01-WORK
66+R6 EQU 6 01-WORK
67+R7 EQU 7 01-WORK
68+F1 EQU 1 01-WORK
69+F2 EQU 2 01-WORK
70+F3 EQU 3 01-WORK
71+F4 EQU 4 01-WORK
72+****** WARNING: F6 MUST BE PRESERVED ACROSS CALLS 01-WORK
73+F6 EQU 6 01-WORK

00004 0CE4 75 SR R4,R4 SIGNAL ACOS ENTRY 00002700
00005 B2E4 8000 8000 76 OHI R4,X'8000' 00002800
00007 DF14 000D 0005 77 B MERGE 00002900
78 ASIN AENTRY 00003000
79+*********************************************************************** 01-AENTRY
80+* 01-AENTRY
81+* SECONDARY ENTRY POINT 01-AENTRY

------------------------------------------------------------------------------------------------------------------------
ACOS -- SINGLE PRECISION INVERSE SINE-COSINE FUNCTION PAGE 3
LOC OBJECT CODE ADR1 ADR2 SOURCE STATEMENT ASM101S 0.00 10/20/24
82+* 01-AENTRY
83+*********************************************************************** 01-AENTRY
00008 84+ASIN DS 0H 01-AENTRY
85+ ENTRY ASIN 01-AENTRY
00008 E0FB 0018 0018 86+ IAL 0,STACKEND-STACK SET STACK SIZE 01-AENTRY
0000A B624 0000 0009 0000 87+ NIST 9(0),0 CLEAR ERROR VECTOR POINTER 01-AENTRY

89 * COMPUTES ARC-SINE(X) OF SINGLE PRECISION SCALAR 00003100
90 INPUT F0 00003200
92 OUTPUT F0 00003300
94 WORK R2,R3,R4,F2,F3,F4,F6 00003400
95+****** WARNING: F6 MUST BE PRESERVED ACROSS CALLS 01-WORK
0000C 0CE4 97 SR R4,R4 SIGNAL ASIN ENTRY 00003500
98 * 00003600
0000D 3EF8 0012 0012 99 MERGE STED F6,SAVE6 SAVE REGISTERS F6,F7 00003700
0000F 7EE0 100 LER F6,F0 GET ARGUMENT IN F6 00003800
00010 DD0C 0014 0003 101 BNM POS AND TEST SIGN 00003900
00011 B2E4 4000 4000 102 OHI R4,X'4000' SIGNAL NEG. ARG. 00004000
00013 78E8 103 LECR F0,F0 GET |X| 00004100
104 * 00004200
00014 48FF 0054 006A 0054 105 POS CE F0,HALF SKIP TO 'SMALL' IF 00004300
00016 DE3C 0026 000F 106 BNH SMALL |X|<=1/2 00004400
107 * 00004500
108 * Z=SQRT((1-|X|)/2) IF |X|^>|1/2 00004600
109 * 00004700
00017 B2E4 2000 2000 110 OHI R4,X'2000' SIGNAL |X|>1/2 00004800
00019 78E8 111 LECR F0,F0 00004900
0001A 50F7 004C 0068 004C 112 AE F0,ONE 1-|X| IN F0 00005000
0001C DEC8 004F 0032 113 BNP EXPRESS BRANCH IF |X|>=1 00005100
114 * 00005200
0001D 7EE0 115 LER F6,F0 00005300
0001E 66F7 004A 006A 004A 116 ME F6,HALF Z**2 IN F6 00005400
00020 50E0 117 AER F0,F0 4*Z**2 IN F0 00005500
118 * 00005600
00021 342C 0016 119 ST R4,SWITCH SAVE R4 ACROSS INTRINSIC CALL 00005700
120 ABAL SQRT 00005800
121+ EXTRN #QSQRT 01-ABAL
00022 E4F7 122+ DC X'E4F7' CALL INTRINSIC INDIRECTLY 01-ABAL
00023 3800 123+ DC Y(#QSQRT+X'3800') THROUGH QCON 01-ABAL

00024 1C2C 0016 124 L R4,SWITCH RESTORE R4 00005900
00025 DF10 002A 0004 125 B POLY SQRT, AND RETURN TO POLY 00006000
126 * 00006100
00026 48FF 003E 0066 003E 127 SMALL CE F0,UNFLO 00006200
00028 DE48 003B 0012 128 BNH TESTS 00006300
00029 66E6 129 MER F6,F6 X**2 IN F6 00006400
130 * 00006500
131 * COMPUTE ASIN(Z) HERE BY RATIONAL FUNCTION 00006600
132 * 00006700
0002A 7CF7 0046 0072 0046 133 POLY LE F4,C2 00006800
0002C 54E6 134 AER F4,F6 00006900
0002D 7AF7 0041 0070 0041 135 LE F2,D2 00007000
0002F 6AE4 136 DER F2,F4 (C2+Z**2)/D2 00007100
00030 52F7 003C 006E 003C 137 AE F2,C1 00007200
00032 52E6 138 AER F2,F6 C1+Z**2+(C2+Z**2)/D2 00007300
00033 66F7 0037 006C 0037 139 ME F6,D1 00007400
00035 6EE2 140 DER F6,F2 00007500
00036 7BF7 002C 0064 002C 141 LE F3,ROUND ROUNDING NUMBER IN F3, 00007600
00038 7AE0 142 LER F2,F0 AS LOW HALF OF 2Z. 00007700
00039 60E6 143 MER F0,F6 00007800
0003A 50EA 144 AEDR F0,F2 ROUNDING OCCURS HERE 00007900
145 * 00008000
146 * REVERSE REDUCTIONS TO GIVE ACTUAL ANSWER 00008100
147 * 00008200
0003B B3E4 A000 A000 148 TESTS TRB R4,X'A000' IF ACOS ENTRY 00008300
0003D DD0C 0041 0003 149 BNM TSTNEG OR |X|>1/2 BUT NOT BOTH, 00008400
0003E 78E8 150 LECR F0,F0 SUBTRACT RESULT 00008500
0003F 50FF 0021 0062 0021 151 AED F0,PIOV2 FROM PI/2 00008600
152 * 00008700
00041 B3E4 4000 4000 153 TSTNEG TRB R4,X'4000' COMPLEMENT 00008800
00043 DE20 004C 0008 154 BNO EXIT IF ARGUMENT WAS NEGATIVE. 00008900
00044 78E0 155 LER F0,F0 CHECK FOR ARG=0(HARDWARE BUG) 00009000
00045 DC04 0047 0001 156 BZ NOLECR 00009100
00046 78E8 157 LECR F0,F0 00009200
00047 B3E4 C000 C000 158 NOLECR TRB R4,X'C000' AND ADD PI TO 00009300
00049 DE08 004C 0002 159 BNO EXIT RESULT IF ACOS ENTRY 00009400
0004A 50F7 0014 0060 0014 160 AE F0,PI AND ARGUMENT<0. 00009500
161 * 00009600
------------------------------------------------------------------------------------------------------------------------
ACOS -- SINGLE PRECISION INVERSE SINE-COSINE FUNCTION PAGE 4
LOC OBJECT CODE ADR1 ADR2 SOURCE STATEMENT ASM101S 0.00 10/20/24
0004C 7EF8 0012 0012 162 EXIT LED F6,SAVE6 RESTORE F6,F7 00009700
163 AEXIT AND 00009800
164+*********RETURN TO CALLER********************************************** 01-AEXIT
0004E 165+ DS 0H 01-AEXIT
0004E 97E8 166+$RET1 SRET 7,0 RETURN TO CALLER 01-AEXIT
167+*********************************************************************** 01-AEXIT

169 * 00009900
0004F DC56 003B 0015 170 EXPRESS BZ TESTS SKIP POLYNOMIAL IF |X|=1 00010000
171 AERROR 10 00010100
172+*********ISSUE SEND ERROR SVC****************************************** 01-AERROR
00050 C9FB 0074 0074 173+ SVC AERROR1 ISSUE SEND ERROR SVC 01-AERROR
174+*********SEND ERROR SVC RETURNS CONTROL FOR STANDARD FIXUP************* 01-AERROR

00052 78F7 000E 0062 000E 175 LE F0,PIOV2 00010200
00054 B3E4 8000 8000 176 TRB R4,X'8000' ASIN CALL? 00010300
00056 DC5A 0041 0016 177 BZ TSTNEG YES, CHECK SIGN 00010400
00057 B3E4 4000 4000 178 TRB R4,X'4000' X=-1? 00010500
00059 DC0C 005D 0003 179 BZ RET0 NO,RETURN 0 00010501
0005A 78F7 0004 0060 0004 180 LE F0,PI YES, RETURN PI 00010502
0005C DF46 004C 0011 181 B EXIT 00010503
0005D 58E0 182 RET0 SER F0,F0 00010504
0005E DF4E 004C 0013 183 B EXIT 00010600
184 * 00010700
00060 185 DS 0F 00010800
00060 413243F7 186 PI DC X'413243F7' PI 00010900
00062 411921FB 187 PIOV2 DC X'411921FB' PI/2 00011000
00064 5FFFFFFF 188 ROUND DC X'5FFFFFFF' 00011100
00066 3E100000 189 UNFLO DC X'3E100000' 16**(-3) 00011200
00068 41100000 190 ONE DC X'41100000' 1.0 00011300
0006A 40800000 191 HALF DC X'40800000' 0.5 00011400
0006C C08143C7 192 D1 DC X'C08143C7' -0.5049404 00011500
0006E C13B446A 193 C1 DC X'C13B446A' -3.7042025 00011600
00070 C11406BF 194 D2 DC X'C11406BF' -1.2516474 00011700
00072 C11DB034 195 C2 DC X'C11DB034' -1.8555182 00011800
196 ACLOSE 00011900
197+ LTORG 02-ERRPARMS
198+****************ERROR PARAMETER AREA*********************************** 02-ERRPARMS
00074 199+#LACOS CSECT 02-ERRPARMS
200+**** ACOS SENDS THE FOLLOWING ERROR 02-ERRPARMS
202+**** ERROR NUMBER 10 IN GROUP 4 02-ERRPARMS
00074 0014 204+AERROR1 DC H'20' SVC CODE FOR SEND ERROR 02-ERRPARMS
00075 040A 205+ DC Y(4*256+10) 8 BIT GROUP AND NUMBER 02-ERRPARMS
206+****************END OF ERROR PARAMETER AREA**************************** 02-ERRPARMS
207+ END 01-ACLOSE

As I hinted earlier, one nice feature (for us!) of the HAL/S AP-101S runtime library is that we have contemporary assembly-listings created by the original assembler, which is of great help in verifying that ASM101S produces correct results.  ASM101S tries to mimic the assembly listings produced by that original assembler, but not obsessively so.  If you're interested — and if you have a really big display or multiple monitors! — clicking this link hopefully opens up the original assembly listing in a new window or tab (depending on your browser's configuration), so that you can visually compare the contemporary and new listings side by side.
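
As a small cross-check on the listing, the constants at the end of ACOS (PI, PIOV2, ONE, HALF, and so on) can be decoded by hand.  The sketch below assumes the System/360-style hexadecimal floating-point layout for a 32-bit value (1 sign bit, a 7-bit excess-64 exponent of base 16, and a 24-bit fraction), which is consistent with the comments accompanying those DC statements.  The function name is hypothetical and is not part of ASM101S.

```python
def ibm_hex_float(word: int) -> float:
    """Decode a 32-bit IBM hexadecimal floating-point value.

    Assumed layout: bit 0 = sign, bits 1-7 = excess-64 exponent (base 16),
    bits 8-31 = fraction, with the radix point to the left of the fraction.
    """
    sign = -1.0 if word & 0x80000000 else 1.0
    exponent = ((word >> 24) & 0x7F) - 64
    fraction = (word & 0x00FFFFFF) / float(1 << 24)
    return sign * fraction * 16.0 ** exponent


if __name__ == "__main__":
    # Values copied from the DC statements in the ACOS source code.
    constants = [
        ("PI", 0x413243F7),
        ("PIOV2", 0x411921FB),
        ("UNFLO", 0x3E100000),
        ("ONE", 0x41100000),
        ("HALF", 0x40800000),
        ("D1", 0xC08143C7),
    ]
    for name, word in constants:
        print(f"{name:6s} X'{word:08X}' = {ibm_hex_float(word)!r}")
```

For example, X'41100000' has exponent X'41' - 64 = 1 and fraction X'100000' / 2**24 = 1/16, giving 16 * 1/16 = 1.0, in agreement with the comment on the ONE constant.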

Macro-Library Folders

AP-101 assembly language is a macro language.  The Shuttle software developers made constant use of these macro capabilities, to the point where it's difficult even to find an AP-101 assembly-language file that is not dependent on macros.  Thus, we have to understand various things about this macro capability from the very beginning, rather than concentrating merely on translation of AP-101 instructions into object code and leaving the topic of macros for future consideration.  You can read about it for yourself in the IBM 360 assembler language manual, in which the discussion of macros occupies the entire 2nd half of the document.  Learning the syntax and other usage details of macros is at least as important for understanding the flight software as understanding the AP-101 instruction set, and likely more difficult.

But I won't talk here about the technical details of macro definitions or expansions.  Rather, here I merely need to talk about the organizational principles of the associated source-code files.

When assembling an assembly-language file whose code depends on macros, there are three different ways in which the definitions of those macros may be made available to the code using them:

  1. Macro definitions may explicitly appear within the assembly-language file in which they're used, in which case the scope of those macro definitions is that file alone.
  2. Macro definitions may appear in a "macro library", and the macros defined in the library are accessible to any assembly-language file using that library.  Speaking generally, the HAL/S compiler's runtime library (which is written in AP-101 assembly language) has a macro library known as RUNMAC, whereas the Shuttle flight software has a macro library known as MLIB80.  But each release of the flight software has a version of MLIB80 specific to that release.
  3. Macro definitions may appear, along with non-macro code, in AP-101 assembly-language files "included" by other AP-101 assembly-language files via an assembly-language pseudo-op called COPY.  (Note: COPY'd files supposedly cannot contain macro definitions in System/360, but can do so in AP-101.)
In the current implementation of ASM101S, macro libraries are literally just folders of assembly-language files.  Recall from the preceding section that ASM101S is invoked with one or more optional command-line arguments of the form --library=LIBRARY, where LIBRARY is just a path to a macro-library folder.  Macro definitions in any libraries specified in this manner are loaded by the assembler along with the specific source-code file(s) being assembled, thus automatically making all of the macro definitions in that library available during the assembly process.

Relative to where AP-101 assembly-language source code is stored in the source-code tree, ASM101S would typically be used with either the option --library=../RUNMAC or --library=../MLIB80, assuming the current working directory was the one storing the source-code files being assembled.
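
For illustration only, here is how a command line of that documented shape could be parsed with Python's standard argparse module.  This is a sketch and not ASM101S's actual option-handling code; in particular the destination name "sources" and the help strings are assumptions, and the real program accepts other OPTIONS not modeled here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring: ASM101S.py [--library=LIBRARY ...] SOURCE.asm ..."""
    parser = argparse.ArgumentParser(prog="ASM101S.py")
    # action="append" lets --library be given more than once.
    parser.add_argument("--library", action="append", default=[],
                        metavar="LIBRARY",
                        help="path to a macro-library folder")
    parser.add_argument("sources", nargs="+", metavar="SOURCE.asm",
                        help="assembly-language source-code files")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["--library=../RUNMAC", "ACOS.asm"])
    print(args.library, args.sources)  # prints: ['../RUNMAC'] ['ACOS.asm']
```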

There's a slight problem, though, in that for some reason, all of the assembly-language files included via COPY pseudo-ops are also located within the macro libraries, intermixed with the files intended to contain only macros.  But we do not want any of the code from these COPY'able files (even if there are some macro definitions within them) to be automatically made available during assembly.  Rather, we want their code to be made available only when they're COPY'd!  Or to put it differently, categories #2 and #3 of files containing macro definitions, as discussed above, must be mutually exclusive.

It's unclear to me why the Shuttle developers chose to house these mutually-exclusive categories of files together in the same directory (or as they thought of it, the same "Partitioned Data Set", or PDS), and equally unclear how they handled this ambiguity in their assembler.  As for ASM101S, though, it handles the ambiguity as follows:

ASM101S does not attempt to determine these distinctions for itself.  Rather, the files in the macro library (or libraries) must have been preprocessed in such a manner as to determine which of the two categories each file in the library falls into.  Each macro library is assumed to contain a file called MACROFILES.txt containing this information, and ASM101S simply uses the categorization provided by MACROFILES.txt.  The format of MACROFILES.txt is that it lists the names of all of the macro-definition files, one per line.  Full-line comments (having a semicolon in column 1) are also allowed.
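
A reader for that format might look like the following minimal sketch.  It is not the code ASM101S actually uses: the function name is hypothetical, the file names in the sample are invented, and skipping blank lines is an assumption, since the description above covers only names and full-line comments.

```python
from typing import List


def parse_macrofiles(text: str) -> List[str]:
    """Return the macro-definition file names listed in MACROFILES.txt text."""
    names = []
    for line in text.splitlines():
        if line.startswith(";"):   # full-line comment: semicolon in column 1
            continue
        name = line.strip()
        if name:                   # assumption: blank lines are ignored
            names.append(name)
    return names


if __name__ == "__main__":
    sample = "; invented sample contents\nAMAIN.asm\nACLOSE.asm\n"
    print(parse_macrofiles(sample))  # prints: ['AMAIN.asm', 'ACLOSE.asm']
```

Any file in the library folder that is not named in the list would then be treated as COPY'able only, rather than being loaded automatically.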

Aside:  A utility program (makeMACROFILES.py) is provided to create MACROFILES.txt.  Admittedly, insofar as legacy code related to Shuttle flight software is concerned, this is probably of little interest to you, the end user, since all such preprocessing is likely to have been performed prior to you seeing any of the assembly-language source-code files anyway.  But if you do happen to acquire flight software or other AP-101 software from sources other than Virtual AGC — send it to me! — then I suppose you might need to do the preprocessing yourself.

Installation of ASM101S

For Linux, Mac OS, or Windows.  If the HAL/S compiler (HALSFC) has been installed per the instructions, then ASM101S will automatically be available as well.

If for some inexplicable reason you want to have ASM101S just by itself, without the HAL/S compiler (or any of the AP-101 source-code files) provided by the normal installation, you could instead just download the file ASM101S.py.  You simply need Python 3 to run it.

Aside:  If you choose the latter installation method, I can only assume that you already have some AP-101 source-code files that you want to assemble.  You might consider sending them to me.

Where Are the AP-101 Assembly-Language Files?

AP-101 assembly-language source-code files can also be obtained somewhat indirectly by the trick of compiling HAL/S source-code files, and then extracting assembly language from the reports produced by the HAL/S compiler's code-generation pass.  The page covering the HAL/S compiler shows you some examples of how to run the HAL/S compiler, but to summarize it briefly, suppose you have a HAL/S source-code file called SOURCE.hal in the current working directory.  To compile it, you might use the command

HALSFC SOURCE.hal "" "LIST"

This operation creates a new folder containing the results of the compilation, as well as all of the intermediate files HALSFC creates during the compilation process.  This folder will be the newest one with a name of the form "*.results", such as "HALSFC Wed Aug 21 07:14:36 AM CDT 2024.results".  The file that's of interest to us in that folder will be the one called "pass2.rpt".

The pass2.rpt file cannot be directly assembled by ASM101S, because it contains a lot of stuff other than just assembly-language source code.  But a script called "extractAP101S.py" has been provided that can extract just the AP-101S assembly language from pass2.rpt into a file that can indeed be directly assembled:

extractAP101S.py <pass2.rpt >SOURCE.asm
ASM101S SOURCE.asm
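
If you'd rather not hunt for the newest "*.results" folder by eye, a short Python sketch such as the following could locate it.  The function name newest_results_folder is hypothetical, and the sketch assumes that the most recently modified folder is the one produced by the latest compilation.

```python
from pathlib import Path


def newest_results_folder(directory="."):
    """Return the most recently modified '*.results' folder, or None."""
    candidates = [p for p in Path(directory).glob("*.results") if p.is_dir()]
    if not candidates:
        return None
    # Assumption: modification time identifies the latest compilation.
    return max(candidates, key=lambda p: p.stat().st_mtime)


folder = newest_results_folder()
if folder is not None:
    # This is the report file to feed to extractAP101S.py.
    print(folder / "pass2.rpt")
```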

It happens that pass2.rpt itself is structured much like the assembly listings produced by ASM101S, in that it includes (among other things) not only the source code but also the binary form of the object code and some tables.  This similarity can be exploited as an additional validity test for ASM101S, if you're as inclined to doubt its validity as I am.

Potential Differences from Expectations

Given that the connection between the AP-101 assembly language and the System/360 assembly language is undocumented (in surviving documentation) and is based only upon my own inferences, it's not surprising that there are some discrepancies between theory and practice, or between what I've implemented in ASM101S vs what's documented for IBM 360 assembly language.  I'll explain those differences in the subsections below.

Assembly Listings

By an "assembly listing", I mean a printout from the assembler itself, typically showing how each line of source code has been transformed into binary codes, and providing useful extra information such as symbol tables and other cross references.  You've seen an example (for the ACOS module) earlier.

Unfortunately, in spite of the claims to the contrary I've made in earlier sections, there are no surviving assembly listings produced by the AP-101S original assembler that I'm aware of, or even substantial fragments of such listings.  (If you notice any, be sure to call my attention to them!)  Therefore, without any of the original assembly listings to mimic, assembly listings as produced by ASM101S are unlikely to match those of the original assembler with exactitude .... though of course I expect the same binary codes to be produced at the same addresses, since if not, then the entire exercise of creating ASM101S in the first place would be pointless.  But even if I had such original assembly listings, one wouldn't expect them to be any guide as to the wording or format of warning or error messages produced by the assembler, since any Space Shuttle flight software source code available for assembly presumably would be error-free, at least to the point that no warning or error messages are likely to appear in any assembly listings.

With that said, there is some assembly-listing-like material available, and that is what I have referred to in earlier sections.  Among the files presently publicly visible, I refer to the folder called RUNLST in our source-code repository, which naively appears to contain assembly listings generated by assembling the files in the repository's RUNASM folder.  RUNASM contains the AP-101S assembly language source code, in conjunction with the macro library folder RUNMAC, and assisted by the interface-file folder ZCONASM, for the runtime library used with AP-101S object code created by the HAL/S compiler, HAL/S-FC.

Upon close inspection, however, the contents of RUNLST cannot actually have been produced directly by the original AP-101S assembler.  And similarly for materials not presently publicly visible.  I assume, rather, that listings produced by the original assembler were stored somehow, probably in a so-called partitioned data set (PDS), and that the listings in RUNLST were produced by running some kind of report generator on those stored listings.  Here's a fragment of the listing RUNLST/ACOS that we've seen earlier:

						.
.
.
 28 ACOS AMAIN ACALL=YES 00002200
29+***********************************************************************
30+*
31+* PRIMARY ENTRY POINT
32+*
33+***********************************************************************
00000 34+ACOS CSECT 01-AMAIN
00000 35+STACK DSECT 01-AMAIN
36+* DS 18H STANDARD STACK AREA DEFINITION
00000 37+ DS F PSW (LEFT HALF) 01-AMAIN
00002 38+ DS 2F R0,R1 01-AMAIN
00006 39+ARG2 DS F R2 01-AMAIN
00008 40+ DS F R3 01-AMAIN
0000A 41+ARG4 DS F R4 01-AMAIN
0000C 42+ARG5 DS F R5 01-AMAIN
0000E 43+ARG6 DS F R6 01-AMAIN
00010 44+ARG7 DS F R7 01-AMAIN
45+* END OF STANDARD STACK AREA
00012 46+SAVE6 DS D TO SAVE REGISTERS F6,F7 02-00025
00016 47+SWITCH DS F TO SAVE R4 ACROSS INTRINSIC CALL 02-00026
00018 48+STACKEND DS 0F END OF COMBINED STACK AREA 01-AMAIN
00000 49+ACOS CSECT 01-AMAIN
0000000 50+ USING STACK,0 ADDRESS STACK AREA 01-AMAIN
00000 E0FB 0018 0018 51+ IAL 0,STACKEND-STACK SET STACK SIZE 01-AMAIN
00002 B624 0000 0009 0000 52+ NIST 9(0),0 CLEAR ON ERROR INFO (LCL DATA PTR) 01-AMAIN
54 *COMPUTES ARC-COSINE(X) OF SINGLE PRECISION SCALAR 00002300
55 INPUT F0 SCALAR SP 00002400
0000000 56+F0 EQU 0 01-INPUT
58 OUTPUT F0 SCALAR SP RADIANS 00002500
.
.
.

To anybody who is familiar with assembly language, this certainly looks like an assembly listing produced by an assembler, so why do I say that it's not?  The first clue is the line numbering:  There's a line 52 and a line 54, but no line 53.  And there are lines 56 and 58, but no line 57.  Admittedly, it's not 100% certain why that is, but having tried to track it down, it appears to me that both of those gaps correspond to uses of the SPACE pseudo-op appearing in expansions of the AMAIN and INPUT macros respectively.  According to the assembly-language manual, "The SPACE instruction is used to insert one or more blank lines in the listing."  And if so, where are the blank lines that should have been inserted?
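
The gaps in the numbering are easy to find mechanically.  Here's a minimal Python sketch which, given the statement numbers read from a listing, reports the missing ones.  The function name is hypothetical, and I've typed in the statement numbers from the fragment above by hand rather than trying to parse the listing's columns.

```python
def missing_statement_numbers(numbers):
    """Return statement numbers absent from an ascending listing sequence."""
    missing = []
    for previous, current in zip(numbers, numbers[1:]):
        # Every number strictly between two neighbors is a gap.
        missing.extend(range(previous + 1, current))
    return missing


# Statement numbers visible in the ACOS fragment: 28 through 52, then
# 54, 55, 56, and 58.
seen = list(range(28, 53)) + [54, 55, 56, 58]
print(missing_statement_numbers(seen))  # [53, 57]
```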

Another clue, not apparent from the fragment above, is in the number of lines per page of the printout.  Originally, an assembly listing would have been output to a line printer having (nominally) ~55 lines per page.  Whereas the files in RUNLST have about 80 lines per page.  Nor are there any embedded form-feed characters or other means to advance to the top of the next page before a page heading is printed.  So I would again infer that these are not the original assembly listings.  (On the other hand, I wasn't there, and I don't really know what printers were available to the developers, so arguing merely from the number of lines on the page isn't as conclusive as I might like.)

Regardless, the files of RUNLST are the best guide available as to the format of assembly listings, and hence ASM101S mimics that format to the extent feasible (i.e., to the extent not too pathetically obsessive), plus the addition of convenient indications of page breaks.

So when I refer throughout this webpage to existing/surviving/legacy/contemporary "assembly listings", keep in mind that I'm referring to legacy reports such as those in RUNLST and not literally to assembly listings produced by the original assembler.

Character Set

The AP-101 character set does not match that of the System/360 assembler.  The latter is the EBCDIC character set, or rather the variation of EBCDIC listed in Appendix A of the System/360 assembly-language manual.

On the other hand, as far as I can tell, the AP-101 assembly-language character set is not defined anywhere, and can only be inferred indirectly.

Examining character strings appearing in object files output by the HAL/S compiler HAL/S-FC, I find that quoted strings in HAL/S are in fact encoded in the ASCII-like character set of the Space Shuttle's Display Electronics Unit (DEU).  Whereas all other text in HAL/S source code, such as symbol names, is encoded in EBCDIC.  There are no corresponding surviving object files produced by assembly (rather than compilation), but my tentative operating principle is that all text in object files produced by the assembler is encoded in EBCDIC.  For more explanation, as well as a listing of the DEU character set and its encoding, see the discussion of character encoding on the HALLINK101S page.

But wait, there's more!  Any Space Shuttle flight-software source code available from Virtual AGC will have been "anonymized" by replacing personal names or initials with randomized identifiers beginning with either the ASCII caret (^) or backslash (\) characters, and thus either of these characters may appear in HAL/S or AP-101S source code even though they do not appear in the EBCDIC character set.

Finally, for technical reasons, ASM101S wants to reserve some ASCII character absent from the EBCDIC character set to represent breaks between punch-cards and their continuation cards (if any).  The ASCII brace characters { and } meet these criteria.  Therefore, ASM101S reserves them for its own internal purposes, and they should not be used in any newly-written AP-101 assembly-language source code, if such a thing ever exists.  They are not used in any extant AP-101 assembly-language source code available to me.

Instruction Aliases

In IBM 360 Basic Assembly Language (BAL), various aliases exist for the branch instructions BCR and BC.  These are described in Figure 4-1 of the assembler-language manual.  While it is tempting to say that Figure 4-1 should be accepted as-is for AP-101S assembly language, that's unfortunately impossible:  Conditional-branch instructions encode a "mask" to be applied to the CPU's condition codes, but the mask is 4 bits wide for System/360 and only 3 bits wide for AP-101S.

Something has to give!  But Figure 4-1 does serve as a starting point for reverse-engineering AP-101 aliases for conditional-branch instructions.  Here's my own list of AP-101S mnemonics for these branch instructions, grouped by condition-code mask.  Except where otherwise indicated, they're all aliased to the BC instruction; but where marked in parentheses, they're instead aliased to BCR or BVC.

  1. NOP,NOPR(BCR) — No Operation.
  2. BH,BO,BP — Branch on High, Branch Over, Branch on Plus
  3. BL,BM,BN — Branch on Low, Branch on Minus, Branch on Negative
  4. BNE,BNZ — Branch on Not Equal, Branch on Not Zero
  5. BE,BZ — Branch on Equal, Branch on Zero
  6. BNL,BNM,BHE,BNN — Branch on Not Low, Branch on Not Less Than, Branch on Higher or Equal, Branch on Not Minus
  7. BNH,BNP,BLE,BNO,BNC(BVC) — Branch on Not High, Branch on Not Plus, Branch on Less-or-Equal, Branch Not Over, Branch on No Carry.
  8. B,BR(BCR) — Unconditional Branch
Note:  While the mnemonics and condition masks in the list above are accurate (I hope!), the textual descriptions are less certain and should be taken with a grain of salt.
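
For illustration, the list above can be captured as a small lookup table in Python.  This is only a sketch, not the table ASM101S actually uses:  the names ALIAS_GROUPS, NON_BC, and resolve_alias are hypothetical, and I'm assuming that the eight groups correspond, in the order listed, to the eight possible values 0 through 7 of the 3-bit mask (so that NOP is mask 0 and the unconditional branch is mask 7).

```python
# Alias groups in the order listed above.  Assumption: the group at
# position k (counting from 0) corresponds to condition-code mask k.
ALIAS_GROUPS = [
    ["NOP", "NOPR"],
    ["BH", "BO", "BP"],
    ["BL", "BM", "BN"],
    ["BNE", "BNZ"],
    ["BE", "BZ"],
    ["BNL", "BNM", "BHE", "BNN"],
    ["BNH", "BNP", "BLE", "BNO", "BNC"],
    ["B", "BR"],
]

# Aliases marked in parentheses in the list map to BCR or BVC, not BC.
NON_BC = {"NOPR": "BCR", "BR": "BCR", "BNC": "BVC"}


def resolve_alias(mnemonic):
    """Return (base instruction, mask) for a branch alias, or None."""
    for mask, group in enumerate(ALIAS_GROUPS):
        if mnemonic in group:
            return NON_BC.get(mnemonic, "BC"), mask
    return None


print(resolve_alias("BNE"))
print(resolve_alias("BR"))
```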

Aside:  While I call this "aliasing to BC", in analogy to System/360, that's not exactly what's going on in AP-101S.  In fact, in AP-101S each of these branching instructions (except NOPR and BR) is encoded as one of three different instructions (BC, BCB, or BCF), with the particular instruction chosen being dependent on the direction and distance of the branch being attempted.  Similarly, a BCT instruction (branch on count) may instead generate the shorter machine code BCTB (branch on count backward) for short enough loops.

LHI:  Besides the branch-instruction aliases, Shuttle flight-software code uses the operator LHI, but without any AP-101 instruction or any macro definition corresponding to it.  There is such an instruction in IBM 360 assembly language.  The AP-101S POO notes in its discussion of the LA instruction that there is a particular configuration of operands for which LA will be "functionally equivalent to a LOAD HALFWORD IMMEDIATE instruction".  My guess is that the original assembler therefore accepted the mnemonic LHI but silently transformed it into the appropriate LA instruction.  ASM101S treats it in that manner as well.

SHI:  Similarly, flight software uses the non-existent SHI instruction.  The program comments at those points clearly indicate that this is a kind of subtract-immediate instruction, presumably Subtract Halfword Immediate.  Unlike the case of LHI, there is no corresponding SHI instruction for System/360.  Nevertheless, we might suppose that the case is still similar, in that this could be an alias for (perhaps) a particular configuration of operands for some other AP-101 instruction.  Fortunately, we have plenty of examples of assembly listings for code using SHI.  Consider this example:

B0E5  FFFE                   SHI    R5,2

The value 0xFFFE is a halfword with the value -2, which leads us to suspect that this is actually an addition.  There is indeed an Add Halfword Immediate instruction (AHI), and "AHI R5,-2" would indeed assemble as shown.
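
The arithmetic is easy to check.  Here's a minimal Python sketch (the function names are hypothetical) which rewrites the operands of SHI as those of AHI by negating the immediate value, and then shows the 16-bit two's-complement halfword that results.  It covers only the immediate halfword, not the encoding of the first halfword of the instruction.

```python
def halfword(value):
    """Encode a signed integer as a 16-bit two's-complement halfword."""
    if not -0x8000 <= value <= 0x7FFF:
        raise ValueError("value does not fit in a signed halfword")
    return value & 0xFFFF


def shi_to_ahi(register, immediate):
    """Rewrite 'SHI register,immediate' as the equivalent AHI operands."""
    return "AHI", register, -immediate


op, reg, imm = shi_to_ahi("R5", 2)
print(op, reg, imm, format(halfword(imm), "04X"))  # AHI R5 -2 FFFE
```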

LACR:  There is no corresponding System/360 instruction to guide our thinking.  However, there are lots of examples in AP-101 assembly listings, such as those for the CTOI.txt file of the HAL/S-FC runtime library.  LACR is seen to be a register-to-register operation.  For (say) general-registers N and M, it assembles to the bit pattern 11101nnn 11101mmm.  This is the same pattern that the LOAD ARITHMETIC COMPLEMENT (LCR) instruction assembles to.  Therefore, LACR is nothing more than a synonym for LCR.
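
As a quick illustration of that bit pattern, the following Python sketch builds the 16-bit halfword 11101nnn 11101mmm from two register numbers.  The function name encode_lacr is hypothetical, and the sketch makes no claim about which of the two registers is the source and which the destination.

```python
def encode_lacr(n, m):
    """Build the 16-bit pattern 11101nnn 11101mmm for registers n and m."""
    if not (0 <= n <= 7 and 0 <= m <= 7):
        raise ValueError("register numbers must be 0 through 7")
    # Upper byte: 11101 followed by nnn.  Lower byte: 11101 followed by mmm.
    return (0b11101 << 11) | (n << 8) | (0b11101 << 3) | m


print(format(encode_lacr(5, 3), "016b"))  # 1110110111101011
```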

PC:  Similarly, this undocumented instruction is found from available assembly listings to assemble as a synonym for MVH (move halfword).  There's no rationale obvious to me for the specific mnemonic "PC" for this operation.

Unused Pseudo-ops

Not all pseudo-ops described in the System/360 assembler manual appear in surviving AP-101 assembly-language source code.  I've chosen to believe that rather than the omissions being coincidental, those pseudo-ops are instead specific to System/360 and thus had been entirely omitted from AP-101 assembly-language.  Admittedly, that inference is probably wrong in the case of certain of the pseudo-ops.  Nevertheless, they have not been implemented in ASM101S.

The omitted pseudo-ops are:

Obviously, this list is subject to change, if legacy AP-101 assembly-language source code using any of these pseudo-ops is discovered.

Mystery Pseudo-ops

The SPOFF and SPON pseudo-ops — if they are pseudo-ops — seem typically to be used in pairs:  SPOFF is used to disable something unknown, then an instruction or two later, SPON is used to re-enable whatever it was that SPOFF disabled.  They are not pseudo-ops in IBM 360 assembly language, and hence must be specific to AP-101S.

Fortunately, we have a few contemporary assembly listings in which these pseudo-ops appear in the source code, and thus their effect can be observed somewhat.  They do not generate any binary, hence they are definitely not instructions of any kind.  Furthermore, they do not affect whether or not the source code they enclose is assembled, nor whether that source code appears in the assembly listing.

I would tentatively conclude that at least for the moment they can simply be ignored, and that's what ASM101S does with them for now.

Forbidden Pseudo-Ops in COPY'd Files

The System/360 assembler manual tells us that assembly-language files included in other assembly-language files via the COPY pseudo-op cannot contain various other pseudo-ops, two of which are MACRO and MEND.  That implies that a COPY'd file cannot contain any macro definitions.  Nevertheless, Space Shuttle flight software has file inclusions that violate this restriction.  Specifically, the files MLIB80/MACSMITH.asm and MLIB80/MACROS.asm do contain macro definitions, and yet are themselves COPY'd into other assembly-language files.  Consequently, this restriction (at least insofar as MACRO and MEND are concerned) does not apply in AP-101 assembly-language.

Macro-Definition Prototypes

The assembler manual tells us that

"The macro instruction prototype statement (hereafter called the prototype statement) specifies the mnemonic operation code and the format of all macro instructions that refer to the macro definition. It must be the second statement of every macro definition."
For example, in a macro definition such as
         MACRO
MYMACRO &ARG1,&ARG2
.
.
.
MEND

no other statements may appear between the first two lines shown here.

In contradiction to the claim in the manual, though, there are instances in flight-software code in which there are comments between these lines, as in
         MACRO
.* THIS IS A COMMENT
.* THIS IS ANOTHER COMMENT
.
.
.
.* THERE WERE A WHOLE LOT OF COMMENTS, SEE?
MYMACRO &ARG1,&ARG2
.
.
.
MEND

I guess we'd infer from this, and very reasonably, that comments are not "statements", but more importantly, that the macro prototype is not necessarily the second line in a macro definition.
Aside:  I don't know if anybody will read these words, ever, but my sixth sense tells me that some folks who do might be smugly saying to themselves right now that "of course full-line comments are not 'statements' in any language, so what's this fool on about?"  As it happens, on p. 69 of the assembler manual, we find a section actually entitled "Comments Statements", which proceeds to define the term comments statement as being precisely the thing we're discussing right now.
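
In practical terms, the rule becomes "the prototype is the first non-comment line after MACRO".  Here's a minimal Python sketch of that rule.  The function name find_prototype is hypothetical, and I'm assuming that ordinary comments (an asterisk in column 1) and blank lines should be skipped in the same way as macro comments (a period and asterisk in columns 1 and 2).

```python
def find_prototype(lines):
    """Return the index of the prototype line of the first macro definition."""
    for start, line in enumerate(lines):
        # The MACRO header has MACRO as its only (or first) field.
        if line.split()[:1] == ["MACRO"]:
            break
    else:
        return None
    for index in range(start + 1, len(lines)):
        text = lines[index]
        # Assumption: ".*" comments, "*" comments, and blank lines are skipped.
        if text.startswith(".*") or text.startswith("*") or not text.strip():
            continue
        return index
    return None


definition = [
    "         MACRO",
    ".* THIS IS A COMMENT",
    ".* THIS IS ANOTHER COMMENT",
    "MYMACRO &ARG1,&ARG2",
    "         MEND",
]
print(definition[find_prototype(definition)])  # MYMACRO &ARG1,&ARG2
```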

Silly Suffixes

Aside:  AP-101 CPU instructions fall into 5 categories, depending on the pattern of operands they accept.  These 5 categories are designated RR, RS, SRS, SI, and RI.  The differences between these relate to the number of operands and the means of addressing them, but the specifics aren't important for our discussion here.

All AP-101 CPU instructions of type RS can optionally have suffixes "@", "#", or "@#" added to their mnemonics.  For example, just as there is an SCAL instruction of type RS, there are also SCAL@, SCAL#, and SCAL@# instructions of type RS.

To be picky about it, this usage is indeed documented, but it took me so long to figure out that I thought I should take explicit notice of it here anyway.

The AP-101S POO tells us that

"... [@] [#] indicates that the use of indirect addressing and/or autoindexing is optional.  For example, [instruction mnemonic] M specifies direct addressing without autoindexing, while M# specifies direct addressing with autoindexing."

And in case it's not obvious to you what the POO means by "indirect addressing" and/or "autoindexing", there is much greater detail in the POO's explanation of the general characteristics of RS-type instructions, though you won't be any wiser about the origin or rationale of the term "autoindexing" after reading the explanation than you were beforehand.
Aside: As usual, I suppose, the implications of this are more complex than might be supposed at first glance.  Not to mention probably being more than you want to know!  Technically, if the @ suffix is present, a bit-field known as "IA" in the encoded machine instruction is set to 1 (vs 0 in the absence of the suffix).  Similarly, a bit-field known as "I" in the encoded instruction is set to 1 in the presence of the # suffix.  The "more-complex" implication is that the "I" field may be set to 1 sometimes even in the absence of the # suffix.  This happens, for example, in a branch backward to an earlier address (vs a branch forward to a later address).  In such a case, the assembler sets the "I" bit on its own, without the suffix #, because one of the special addressing modes I alluded to earlier is the case where you have an RS-type instruction
OPCODE R1,D2(X2,B2)
in which
If so, then at execution time the displacement D2 is subtracted rather than added to the updated instruction counter.  In particular, it allows backward branches from the current location.
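
To illustrate just the suffix handling, here's a minimal Python sketch that splits a mnemonic into its base name plus the values of the "IA" and "I" bits implied by the suffixes.  The function name split_suffixes is hypothetical.  The sketch reflects only the explicit suffixes @, #, and @#, and deliberately ignores the cases described in the aside above in which the assembler sets the "I" bit on its own.

```python
def split_suffixes(mnemonic):
    """Split an RS-type mnemonic into (base, IA bit, I bit)."""
    base = mnemonic
    ia_bit = 0
    i_bit = 0
    # The "#" suffix, if present, comes last (as in "SCAL@#").
    if base.endswith("#"):
        i_bit = 1
        base = base[:-1]
    if base.endswith("@"):
        ia_bit = 1
        base = base[:-1]
    return base, ia_bit, i_bit


for name in ("SCAL", "SCAL@", "SCAL#", "SCAL@#"):
    print(name, split_suffixes(name))
```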

Mystery Instructions or Macros

The following operators appear in flight software source code, and I am so far unable to determine if they are supposed to be instructions, macros, or pseudo-ops:

Syntax of Various Fields

Recall that the four fields potentially present in a line of assembly language (whether instructions, pseudo-ops, macro invocations, etc.) are the name field (beginning in column 1), the operation field, the operand field, and the comment field.  It turns out that parsing these fields is quite tricky, particularly the operand field. 

I won't bore you with the details as to why this is so, but simply say that except for the comment field, each of these fields has been given its own simple BNF-style grammar in ASM101S, and sometimes multiple separate grammars that are applied for different contexts.  This is, of course, transparent to the user of ASM101S, and is only significant to someone wishing to maintain the assembler.
Aside:  "BNF", of course, stands for Backus-Naur form.  Technically, the grammars are actually written in the modified EBNF (Extended Backus-Naur form) supported by the TatSu parser module for the Python language.  See the Python source-code file fieldParser.py for the grammars themselves.
Nevertheless, even having added this level of complexity to the parser, it's not necessarily the case that the syntax parsed by ASM101S matches that parsed by the original assembler.  For example, arithmetic expressions as specified by the System/360 assembly-language manual are constrained in various ways (e.g., they cannot begin with '+' or '-', cannot have more than 16 terms, and cannot have more than 5 levels of parentheses), but have not been endowed with the same constraints in ASM101S.  On the other hand, I haven't necessarily bothered to implement theoretically-possible syntax that isn't present in actual flight software.  Consequently, it's likely that ASM101S accepts a more-complex syntax in some contexts than did the original assembler, and vice-versa.  Of course, ASM101S can be upgraded as needed to support such missing syntax, if it turns out to be desirable, whereas the original assembler cannot.

EQU and CPU Registers

The AP-101 CPU has 8 general registers, typically referred to symbolically in assembly language as R0 through R7, as well as 8 floating-point registers, typically referred to as F0 through F7.  This is the same situation as in System/360 assembly language, except that in System/360 there are more of each kind of register.  For example, an assembly-language instruction that performs an integer addition from register R7 to register R3 would look like this in either of the two assembly languages:

AR	R3,R7
But there's a catch.  The assembly-language manual explains that
"All symbols that specify register numbers ... must be assumed to be equated elsewhere to absolute values."
In other words, the register-name symbols R3 and R7 in this example are not tokens or syntactical elements of the assembly language, and the pure syntax for the instruction example shown above should actually be this:
AR	3,7
The only reason that the former instruction would be accepted by the assembler, the manual is explaining, is that the full example should have read something like this:
R3      EQU     3
R7 EQU 7
.
.
.
  AR R3,R7
In turn, this means that in the macro libraries loaded by the assembler, we should find various EQUates similar to the ones above, for the general registers and floating-point registers.  And indeed, for the macro libraries used for the Space Shuttle primary flight software (PASS), and backup flight software (BFS), we find exactly such declarations in the PASS module MLIB80/MACSMITH or the BFS module MLIB80/EQU, along with numerous other EQUates of a similar nature:
         .
.
.
F0 EQU 0 FP 0 = FLOATING POINT REGISTER
F1 EQU 1 1
F2 EQU 2 2
F3 EQU 3 3
F4 EQU 4 4
F5 EQU 5 5
F6 EQU 6 6
F7 EQU 7 7
G0 EQU 0 SET 1 GR 0 = GENERAL REGISTER
G1 EQU 1 1
G2 EQU 2 2
G3 EQU 3 3
G4 EQU 4 4
G5 EQU 5 5
G6 EQU 6 6
G7 EQU 7 7
R0 EQU 0 SET 2 GR 0 = GENERAL REGISTER
R1 EQU 1 1
R2 EQU 2 2
R3 EQU 3 3
R4 EQU 4 4
R5 EQU 5 5
R6 EQU 6 6
R7 EQU 7 7
.
.
.

Unfortunately, that's not the full story.  Besides the flight software as such, AP-101 assembly-language files also exist in the runtime library provided by HAL/S-FC, the HAL/S compiler.  Those assembly-language files reference the CPU general registers and floating-point registers just as any of the flight-software files do, except that there are no EQUates for those registers in any of those source-code files, nor in the macro library used by those files.

It is, of course, possible that the reason these EQUates are missing is that our HAL/S-FC runtime-library source code is incomplete.  Unfortunately, there is no way to know whether that is correct or not.  Another possibility is that the System/360 assembly-language manual is incorrect, and that the assembler does by default recognize the general registers Rn and floating-point registers Fn, and possibly other symbols, without explicit EQUates.

Lacking any palatable alternatives here, ASM101S assigns default values to the various register symbols, but allows those defaults to be overridden by explicit EQUates, if such are encountered.
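
The idea of overridable defaults can be sketched in a few lines of Python.  This is an illustration only, not the actual ASM101S code:  the function names are hypothetical, and I'm assuming for the sake of the example that the defaults cover just R0 through R7 and F0 through F7.

```python
def default_symbols():
    """Default register symbols: R0-R7 and F0-F7 equated to 0 through 7."""
    symbols = {}
    for n in range(8):
        symbols[f"R{n}"] = n
        symbols[f"F{n}"] = n
    return symbols


def apply_equ(symbols, name, value):
    """An explicit EQU simply replaces any default value for the symbol."""
    symbols[name] = value


symbols = default_symbols()
print(symbols["R3"], symbols["F0"])  # defaults, with no EQU seen yet
apply_equ(symbols, "G4", 4)          # a symbol with no default
apply_equ(symbols, "R3", 3)          # an explicit EQU overriding a default
print(symbols["G4"], symbols["R3"])
```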

Type Attributes, T'

The System/360 assembly-language manual tells us that prefixing a symbolic variable (such as &A) with the notation T' returns an assembly-time string consisting of a single character that corresponds to the type of data the variable contains.  For example, if &A were a character-string variable as declared via the GBLC or LCLC pseudo-op, then the assembler's preprocessor would replace T'&A by the single character C at assembly-time.

The manual lists 27 such "types", corresponding to the "letters" A-Z and $.  (In the worldview of the assembler, 29 characters are defined as being "letters":  A through Z, #, $, and @.)  But it isn't clear at the present time how many of these types will be supported in ASM101S, since only the following seem to appear in actual flight-software source code:

It isn't entirely clear to me what # indicates.  My current very tentative interpretation is this:

The D' Attribute

AP-101S assembly-language source code uses an attribute operator D', which is not defined in the assembly-language manual.  From the way it is used, I infer that when applied to an identifier, it returns "true" (1) if the identifier has been previously defined within the source-code being assembled and "false" (0) if not.  A typical usage would be something like

         AIF   (D'MYSYM).OKAY
         EXTRN MYSYM
.OKAY    ...

Thus if the identifier is not defined, it allows the code to detect that condition and to mark the identifier as being declared externally.

AIF and AGO

The AIF and AGO pseudo-ops provide "goto" functionality (respectively conditionally or unconditionally) at assembly time (rather than at runtime).  The System/360 assembly-language manual makes it clear that these "goto" operations can operate only within the same macro depth, and further, if within a macro, only within the same macro.  For example, in the "pseudo-instruction"
AGO .MYSEQ 
the locations of the pseudo-instruction itself and of the sequence symbol .MYSEQ could be both outside of any macro, or they could be within the same macro definition.  But it could not be the case (say) that the pseudo-instruction was within a macro definition and the sequence symbol was within a macro invoked by that macro.

One important case about which the assembly-language manual says nothing, I think, is the case in which a source-code file is being imported via a COPY pseudo-op.   Is it possible for the AGO or AIF pseudo-instruction to be in a file containing a COPY pseudo-op while the target sequence symbol is in the file being COPY'd?  Or vice-versa?

ASM101S does not allow the case just mentioned.  In other words, in ASM101S, for any file being imported via COPY, any AGO/AIF pseudo-instruction and its target sequence symbol must reside within the same COPY'd file.

I do not presently know if this usage occurs within Shuttle flight software or not. 

Arithmetical Peculiarities and Evaluation of Expressions

Certain arithmetical quirks are inherent in System/360 assembly language, and I must presume that these peculiarities carry over into AP-101S assembly language as well. Therefore, ASM101S retains these peculiarities rather than eliminating them.

The peculiarities I regard as worth noting are these:

On the other hand, ASM101S does remove some of the constraints of System/360 assembly-language arithmetical restrictions, namely:

Aside:  Regarding peculiarities of my own making, as opposed to those of the language itself or the original assembler, I'm obliged to admit that I don't quite understand how to perfectly handle assembly-time evaluation of arithmetic expressions involving program labels: i.e., involving the addresses of symbols rather than the values of constants. 

To do so, ASM101S instead uses an imperfect trick, making use of the facts that the address space of the AP-101S is limited to 24 bits and that the number of allowed control sections in a program (at least in System/360) is limited to 255.  The address of a program label (prior to linking) is precisely an ordered pair of the form (control section, offset into control section), but performing arithmetical computations is easiest when these values can somehow be converted to single numbers rather than ordered pairs.  The trick is to assign each control section a unique but randomized 64-bit value whose least-significant 24 bits are all 0, and to convert addresses of symbols to sums of these 64-bit values plus 24-bit offsets into the control sections.  (I don't mean that the codes for the symbols are actually random, but rather that they are selected in a way that makes it unlikely to produce their values by common types of calculations.)  In this way, calculations like SYMBOL+OFFSET or SYMBOL1-SYMBOL2 (for symbols in the same section) produce the expected results, and indeed, produce correct results for all correct expressions.  Unfortunately it remains possible to combine symbols in an incorrect manner from two different control sections and get a result that appears to be in yet a third control section, which is incorrect.  This potential is part of the reason for using 64-bit pseudo-addresses (and distributing the unique numerical codes for the control sections throughout a 40-bit space) rather than 32-bit pseudo-addresses (and distributing the unique numerical codes in an 8-bit space):  It reduces to a very low level the probability of producing "fake" control sections in calculations.
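
Here's a minimal Python sketch of the pseudo-address idea, for illustration only.  The class and method names are hypothetical, and the sketch uses Python's pseudo-random number generator as a stand-in for whatever method ASM101S really uses to choose the codes for the control sections.

```python
import random

OFFSET_BITS = 24
OFFSET_MASK = (1 << OFFSET_BITS) - 1


class SectionCodes:
    """Assigns each control section a distinct 64-bit code, low 24 bits 0."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._codes = {}

    def code(self, section):
        if section not in self._codes:
            while True:
                # 40 pseudo-random bits placed above the 24-bit offset field.
                candidate = self._rng.getrandbits(40) << OFFSET_BITS
                if candidate and candidate not in self._codes.values():
                    break
            self._codes[section] = candidate
        return self._codes[section]

    def address(self, section, offset):
        """Combine (section, offset) into a single pseudo-address."""
        return self.code(section) + (offset & OFFSET_MASK)

    def section_of(self, pseudo_address):
        """Return the matching section, or None for a 'fake' section."""
        base = pseudo_address & ~OFFSET_MASK
        for section, code in self._codes.items():
            if code == base:
                return section
        return None


codes = SectionCodes(seed=1)
stackend = codes.address("STACK", 0x18)
arg2 = codes.address("STACK", 0x06)
other = codes.address("ACOS", 0x02)
print(hex(stackend - arg2))             # difference within one section
print(codes.section_of(arg2 + 4))       # symbol plus offset stays put
print(codes.section_of(arg2 + other))   # incorrect mix is detectable here
```

With only two control sections, the incorrect sum in the last line cannot match either section's code, so it's reported as belonging to no known section.  With many sections, an accidental match is merely improbable rather than impossible, which is the imperfection admitted above.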

According to the System/360 assembly-language manual, although EXTRN symbols can appear in expressions, they cannot be paired.  This implies, I think, that they can be handled interoperably with the description in the preceding paragraph, by using unique but randomized 64-bit values with the lower 24 bits all 0 in place of those symbols.

I thought at first that the same trick could be used to handle calculations involving other not-yet-defined symbols.  Unfortunately, such an attempt would be guaranteed to produce incorrect results in calculations like KNOWN-UNKNOWN, even if KNOWN and UNKNOWN both turned out to be members of the same control section.  Therefore, the addresses of all symbols in the current file must be ascertained in a separate pass before computations of expressions involving such symbols are performed.

Relational Expressions Involving Strings

Among the types of expressions computed by the assembler at assembly-time for use with pseudo-ops such as SETB or AIF are the boolean expressions, of which one sub-type is relational expressions involving string values.

A relational expression is used to determine that two values (either two numbers or two strings) are equal (EQ), not-equal (NE), less-than (LT), less-than-or-equal (LE), greater-than (GT), or greater-than-or-equal-to (GE) each other.  For example, the relational expression

3 LT 4
returns the value "true" (which in System/360 assembly language is numerically equivalent to 1) since 3 is less than 4.

System/360 assembly language shares the obnoxious (in my opinion!) property of string comparisons in the XPL language that a shorter string is always "less than" a longer string.  For example,
'Z' LT 'AA'
returns "true".

Unfortunately, as far as I can tell, the System/360 assembly-language manual does not explicitly state the collation sequence to be used for comparing strings of equal length, though it seems to me to be implied that it is based on the EBCDIC encoding of the characters. 

As for the AP-101 assembler, however, the conclusion is not so straightforward, and in fact is TBD.

If we use the "cross reference" (symbol table) in legacy assembly listings printed by the original assembler (an approach of dubious validity), we find relationships like this:

digit < alphabetic < '$' < '#' < '='

This is consistent with neither an EBCDIC collation nor a DEU/ASCII collation:

(EBCDIC)        '$' < '#' < '=' < alphabetic < digit
(DEU/ASCII)        '#' < '$' < digit < '=' < alphabetic

Thus in the end we really don't know what collation sequence is appropriate.  ASM101S temporarily pretends that the collation sequence is ASCII since that's the easiest to implement.
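
For what it's worth, here's a small Python sketch of the string-comparison rule described above (shorter strings first, then character-by-character collation).  The function names are hypothetical, and the use of Python's "cp037" codec as a stand-in for EBCDIC is an assumption on my part; the sketch is not taken from ASM101S itself.

```python
def collation_key(text, encoding="ascii"):
    """Sort key: length first, then the character codes in the encoding."""
    return (len(text), text.encode(encoding))

def string_lt(left, right, encoding="ascii"):
    """Evaluate 'left' LT 'right' under the shorter-is-less rule."""
    return collation_key(left, encoding) < collation_key(right, encoding)

print(string_lt("Z", "AA"))     # a shorter string is always "less than"

# Compare how the two candidate collations order single characters.
samples = ["A", "5", "$", "#", "="]
print(sorted(samples, key=lambda c: c.encode("cp037")))   # EBCDIC (assumed cp037)
print(sorted(samples, key=lambda c: c.encode("ascii")))   # ASCII
```

Neither ordering printed by the last two lines agrees with the ordering seen in the legacy cross-reference tables, which is the discrepancy described above.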

Character Expressions

Character expressions consist of text delimited by single-quotes, as for example 'HELLO', plus various additional flourishes that you can read about in the System/360 assembly-language manual but which I won't bother to rehash here.

One flourish which must be mentioned is the so-called substring notation, which can be used to extract a substring from a string, as in:
'HELLO'(start,length)
This means that the substring to be extracted begins at index start and is length characters in width.

One trivial detail which the manual doesn't seem to think worth explicitly mentioning, as far as I have been able to ascertain so far, is whether the indexing of the string characters is 0-based or whether it is 1-based.  As curious as it seems, sticklers for detail might think this information could be valuable from time to time.  If you read far enough into the manual, there are eventually a couple of examples which indirectly demonstrate that indexing of the string is 1-based.
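
In Python terms, where string indexing is 0-based, the 1-based substring notation could be sketched as follows.  The function name is hypothetical, and the sketch ignores whatever error checking the real assembler performs on out-of-range values.

```python
def substring(text, start, length):
    """Evaluate 'text'(start,length), where start is a 1-based index."""
    return text[start - 1:start - 1 + length]

print(substring("HELLO", 1, 1))   # first character
print(substring("HELLO", 2, 3))   # three characters starting at the second
```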

Declaration of "SET Symbols"

Before describing the specific AP-101S versus System/360 issue associated with the items known as "SET symbols", let me summarize some of what the System/360 assembly-language manual has to say about them.

In System/360 assembly language there is the concept of symbols relevant only to the assembler in a preprocessing pass, in distinction to symbols representing addresses in the runtime memory of the assembled program.  These symbols are distinguished in that their names are prefixed by the character '&'.  Thus MYVAR might be a variable representing a memory location, whose contents can be modified by the assembly-language program when it is run, while &MYVAR might represent an assembly-time variable, assigned a value that can be manipulated during the assembly process, but that is not known or modifiable by the assembled program.

These assembler-only variables can be classified in a number of ways, one of which is that they can be of one of three mutually-exclusive types:
  1. "Symbolic parameters" are the formal parameters found in macro definitions.  They are assigned values at the time of the invocation of the macro, but cannot otherwise be changed.
  2. "System variables" are assigned values by the assembler itself, conceivably different on each use of the variable, and cannot be changed by software.  They're distinguished by the fact that they always begin not just with "&" but with the 4-character string "&SYS".
  3. "SET symbols" can be explicitly created, assigned values, and reassigned values by software at will.

Here, we're concerned only with the last category, namely the SET symbols.

SET symbols can be categorized another way, namely by their datatypes, which cannot be changed once established.  The three types are:

  1. Integer
  2. Boolean
  3. Character string

Yet a third way that they can be characterized is by scope, as either global or local.

Prior to the first use of any SET symbol, it must be declared via one of the macro-language instructions GBLA, GBLB, GBLC, LCLA, LCLB, or LCLC.  Any of these instructions also assigns an initial value to the symbol, either 0, False (0), or '' (empty string), depending on the datatype.   For example, the instruction "LCLB &BOO" declares a local boolean SET symbol called &BOO and assigns it the default value False (numerically, 0).

After declaration, the value of a SET symbol may be changed (within its global or local scope, as appropriate) via one of the macro-language instructions SETA, SETB, or SETC.

Okay, that was the background, but here's the AP-101S specific issue:  In actual AP-101S assembly-language source code, there are SET symbols modified by SETA, SETB, or SETC (or used in other manners) without any declaration via GBLA, GBLB, GBLC, LCLA, LCLB, or LCLC whatsoever (prior or otherwise), which is a possibility denied by the System/360 assembly-language manual.

For example, consider the INPUT macro, provided as part of the AP-101S runtime library by the original source code of the HAL/S compiler HAL/S-FC.  It has four SET symbols that are used without declarations, namely &I, &LAST, &R, and &N, as can be seen in the listing below:
         MACRO                                                          00000100
INPUT &X 00000200
GBLA &ENTCNT 00000300
GBLB &INPUT(20),&LIB 00000400
AIF (N'&SYSLIST EQ 0).EMPTY 00000500
&INPUT(&ENTCNT) SETB 1 00000600
AIF ('&X' EQ 'NONE').SPACE 00000700
&I SETA 1 00000800
&LAST SETA N'&SYSLIST 00000900
.LOOP AIF (K'&SYSLIST(&I) NE 2).BADREG 00001000
&R SETC '&SYSLIST(&I)' 00001100
AIF ('&R'(1,1) NE 'F' AND '&R'(1,1) NE 'R').BADREG 00001200
AIF ('&R' EQ 'R0').BADREG 00001300
AIF (&LIB AND ('&R' EQ 'R1' OR '&R' EQ 'R3')).INVREG1 00001400
AIF (NOT &LIB AND '&R' EQ 'R4').INVREG2 00001500
AIF (D'&R).NEXT 00001600
&N SETC '&R'(2,1) 00001700
&R EQU &N 00001800
.NEXT ANOP 00001900
&I SETA &I+1 00002000
AIF (&I LE &LAST).LOOP 00002100
.SPACE SPACE 00002110
MEXIT 00002200
.BADREG MNOTE 4,' ILLEGAL REGISTER SPECIFICATION - &SYSLIST(&I)' 00002300
AGO .NEXT 00002400
.INVREG1 MNOTE 4,'&R INVALID INPUT FOR PROCEDURE ROUTINE' 00002500
AGO .NEXT 00002600
.INVREG2 MNOTE 4,'R4 INVALID INPUT FOR INTRINSIC' 00002700
AGO .NEXT 00002800
.EMPTY MNOTE 4,'OPERAND REQUIRED' 00002900
MEND 00003000
What are we to make of this?

Upon considerable reflection, my inference is that the AP-101S assembler has a built-in convenience feature, either not present or not documented in the System/360 assembler, namely this:
When a variable that has not previously been explicitly declared (by GBLx or LCLx) is the target of a SETx instruction, it is declared automatically by the assembler as if via LCLx.
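
To make the inferred rule concrete, here's a toy Python sketch of a table of non-arrayed local SET symbols in which SETx implicitly declares an undeclared target.  The class and method names are hypothetical, and this is only an illustration of the inference, not ASM101S's actual data structure.

```python
DEFAULTS = {"A": 0, "B": False, "C": ""}   # initial values by datatype letter

class LocalSetSymbols:
    """Toy table of non-arrayed local SET symbols for one macro expansion."""

    def __init__(self):
        self.table = {}                     # name -> (type letter, value)

    def declare(self, name, kind):
        """Explicit declaration, as by LCLA, LCLB, or LCLC."""
        self.table[name] = (kind, DEFAULTS[kind])

    def set(self, name, kind, value):
        """SETA/SETB/SETC; an undeclared target is declared as if via LCLx."""
        if name not in self.table:
            self.declare(name, kind)        # the inferred convenience feature
        if self.table[name][0] != kind:
            raise TypeError(f"{name} was declared with a different datatype")
        self.table[name] = (kind, value)

symbols = LocalSetSymbols()
symbols.declare("&BOO", "B")    # like "LCLB &BOO"
symbols.set("&I", "A", 1)       # like "&I SETA 1" with no prior LCLA
print(symbols.table)
```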
Aside:  If this inference is correct, it might seem naively that there's no need for the instructions LCLA, LCLB, or LCLC at all, since a SETA, SETB, or SETC could always be used instead.  Upon closer inspection that's not true, since LCLx (like GBLx) can additionally be used to declare SET symbols as arrays, which a SETx instruction with this convenience feature could not.  And even in the non-arrayed case, there are certainly instances in existing code in which LCLx is indeed used explicitly even though the described convenience feature would not require it.  For example, consider this macro from the AP-101S runtime-library source code, which unlike the problematic macro listed above corresponds exactly to the System/360 assembly-language manual's pronouncements:
         MACRO                                                          00000100
&NAME AERROR &NUM,&GROUP=4 00000200
GBLA &ERRCNT,&ERRNUMS(10),&ERRGRPS(10) 00000300
LCLA &I 00000400
AIF (&NUM GT 62).BADNUM 00000500
&I SETA &ERRCNT 00000600
.DUPLOOP AIF (&I LE 0).NEWERR 00000700
AIF (&NUM EQ &ERRNUMS(&I) AND &GROUP EQ &ERRGRPS(&I)).DUP 00000800
&I SETA &I-1 00000900
AGO .DUPLOOP 00001000
.NEWERR ANOP 00001100
&ERRCNT SETA &ERRCNT+1 00001200
&I SETA &ERRCNT 00001300
&ERRNUMS(&I) SETA &NUM 00001400
&ERRGRPS(&I) SETA &GROUP 00001500
.DUP ANOP 00001600
*********ISSUE SEND ERROR SVC****************************************** 00001700
&NAME SVC AERROR&I ISSUE SEND ERROR SVC 00001800
*********SEND ERROR SVC RETURNS CONTROL FOR STANDARD FIXUP************* 00001900
MEXIT 00002000
.BADNUM MNOTE 12,'ERROR NUMBER GREATER THAN 62' 00002100
MEND 00002200

As for the origin of such a convenience feature in the first place, I'd note that in addition to being "convenient", the complexity of some AP-101S macros could make some of those macros very difficult or impossible to implement otherwise.  According to System/360 rules, all GBLx and LCLx instructions must appear not merely before SETx instructions involving the SET symbols they declare, but indeed prior to everything else.  For example, GBLx instructions must appear immediately after the prototype line of a macro definition, with nothing intervening except comments, while LCLx instructions in turn must appear immediately after that.  Thus if a macro definition depends on the flexibility of allowing a SET symbol to be declared in alternate ways under different circumstances, such as arrayed vs non-arrayed or integer vs character, the rules of the System/360 assembler likely would not allow it because alternate declarations could appear in the prescribed location.  Whereas the rules of implicit declaration via SETx instructions basically allow non-arrayed local declarations to appear anywhere.  So the convenience feature of implicit declaration, if it truly exists, could have arisen from necessity rather than from a desire for mere convenience.  Not that "mere" convenience is to be sneered at.  But that's just speculation on my part, with the answer lost in the mists of time past.

DC and DS Pseudo-Op Formats, and "Literals"

The System/360 assembly-language manual describes a quite-complex format for the operands of the DS and DC pseudo-ops used for allocating or initializing data memory.  (The description takes about 11 pages, which is over 6% of the manual.)  However, I see no point in implementing those features of this format which are not actually used in Space Shuttle flight-software source code.  At present, I believe that the following features of the DC/DS format do not need to be supported in ASM101S:

The System/360 manual uses the term literal to refer to an operand (for instructions) in a manner different from the way I have been using it (and different from what I think is current common usage).  I have been using the term to describe strings of characters such as 1234 or 'HELLO WORLD', or perhaps X'3F7C'.  System/360 (and presumably AP-101S) assembly language doesn't consider these to be "literals".  Rather, the following would be considered "literals":  =F'1234', =C'HELLO WORLD', or =X'3F7C'.  The distinction, aside from the prefixed equals sign and other syntactic elements, is that the former are used directly by DC pseudo-ops or in some cases coded into instructions, whereas the latter are instead assembled into special areas of memory known as "literal pools" and only their addresses are coded into instructions.

The reason this is relevant here is that ASM101S doesn't support the full range of literals accepted by the System/360 assembler, and instead limits itself just to those forms of literals appearing in Space Shuttle flight software, plus any others I just feel like implementing.  (The latter being a tiny number indeed!)  Those supported forms of literals are:
=B'...'    Binary
=C'...'    Character
=D'...'    Double-precision floating point
=E'...'    Single-precision floating point
=F'...'    Fullword fixed-point
=H'...'    Halfword fixed-point
=X'...'    Hexadecimal
=Y(...)    Nearby address
=Z(...)    Remote address

Regarding the =F'...' and =H'...' datatypes, see here to understand the significant differences between how the AP-101S and System/360 assemblers treat them. 

In addition to these bare forms, the System/360 assembly-language manual also describes a "length modifier" for them, in which the initial letter specifying the datatype is optionally followed by "Ln", where n is a decimal number indicating the number of bytes of memory allocated.    Similarly, for fixed-point datatypes (i.e., =F and =H) the "scale modifier" ("Sn") is supported.  If both are present, the scale modifier must follow the length modifier. 
Aside:  The System/360 manual also describes variations on the allowed modifiers, such as "L(e)" (where e is an arithmetic expression), scale modifiers for floating-point datatypes (=E and =D), an "exponent modifier" for fixed-point and floating-point datatypes, and so on.  ASM101S doesn't support any of these variations since they don't appear in Shuttle flight software source code.
While ASM101S supports these length and scale modifiers, the length modifier in AP-101S assembly language does not appear to be used as described in System/360.  Consider this AP-101S instruction, which appears in the MSTR module of the HAL/S-FC runtime-library:
00013 27F7 0011      0026 0011   57          N     R7,=XL2'F'                                               00002200
.
.
.
00024 75+ LTORG 02-ERRPA
00024 FFFF000F 76 =X'FFFF000F'
00026 0000000F 77 =XL2'F'
00028 FFFFFFF0 78 =X'FFFFFFF0'
The length modifier "L2" in the literal =XL2'F' clearly indicates that even though the provided constant value (F) is only one hexadecimal digit, it must assemble to 2 bytes, i.e. to hexadecimal 000F.  And yet, in reality, we find that it has assembled to 4 bytes, as hex 0000000F.  In point of fact, since an =X literal by default assembles to an even number of bytes (according to the System/360 manual), the length modifier should not even have been necessary in the first place, since it would merely be explicitly expressing the default.

What are we to make of this?  Unfortunately — or perhaps fortunately, depending on your point of view! — there are not enough length modifiers appearing in surviving AP-101S assembly-listing reports for us to form much of an opinion on the matter just by examining those reports.  In point of fact, the length modifier I've given in the example above is the only length modifier I've found in surviving assembly listings.  Perhaps in AP-101S, length modifiers express the number of halfwords in the assembled value rather than the number of bytes?  That's the way ASM101S treats them at present.
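
As a sketch of the halfword interpretation just mentioned, here's how an =XLn'...' literal might be padded in Python.  The function name is hypothetical, and the default used when no length modifier is present (rounding up to a whole number of halfwords) is purely an assumption of mine.

```python
def assemble_hex_literal(digits, length=None):
    """Hex digits for =XLn'digits', treating n as a count of 16-bit halfwords."""
    if length is None:
        length = (len(digits) + 3) // 4    # assumption: round up to whole halfwords
    width = 4 * length                     # 4 hex digits per halfword
    if len(digits) > width:
        raise ValueError("constant does not fit in the specified length")
    return digits.upper().rjust(width, "0")

print(assemble_hex_literal("F", 2))        # =XL2'F'
print(assemble_hex_literal("FFFF000F"))    # =X'FFFF000F'
```

With this interpretation, =XL2'F' comes out as 0000000F, matching the literal pool in the MSTR listing excerpt above.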

As a less-unruly example, scale modifiers appear to operate on an easier-to-understand basis than length modifiers.  Thus, the literal =FS32'60E6' superficially appears to assemble to 60000000 (decimal, i.e. 0x03938700), but because of the scale modifier it is actually shifted rightward by 32 bit-positions, i.e. divided by 2^32.  So the value it assembles to is instead approximately 0.013969839.  But that still doesn't mean that everything is exactly as in System/360.  In AP-101S, the fixed-point literals have the interesting property (see the link given above about =F and =H datatype differences from System/360) that if they appear to be integers, then they assemble into 2's-complement integers, while if they appear to be fractional values with absolute value less than 1.0 they instead assemble into an alternate format maximizing the significance in that numerical range.  That's what happens in this example.  The literal assembles to 0x01C9C380, which amusingly, is just 60000000/2.
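
The arithmetic of this example can be checked with a short Python sketch.  The function name is hypothetical, and the assumption that the "alternate format" for fractional values is simply the value multiplied by 2^31 is my own reading of this single example rather than anything documented.

```python
from fractions import Fraction

def fullword_fixed(text, scale=0):
    """Hypothetical encoding of =FSn'text' as a 32-bit value."""
    value = Fraction(text) / (1 << scale)   # shift right by `scale` bit positions
    if value.denominator == 1:
        encoded = int(value)                # integer: plain 2's complement
    elif abs(value) < 1:
        encoded = round(value * (1 << 31))  # assumption: fraction scaled by 2^31
    else:
        raise NotImplementedError("mixed integer/fraction values are not covered here")
    return encoded & 0xFFFFFFFF

print(f"{fullword_fixed('60E6', 32):08X}")  # the =FS32'60E6' example
print(f"{fullword_fixed('1234'):08X}")      # an ordinary integer
```

Exact rational arithmetic (Fraction) is used here just so that the sketch doesn't introduce rounding questions of its own.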

The =Z(...) format in AP-101S is unrelated to the =Z'...' ("zoned decimal format") of System/360, which is not supported by ASM101S and is best forgotten entirely.  The =Z(...) of AP-101S has three parameters, thusly:
=Z(ARG1,ARG2,ARG3)
Naturally, as you may expect, =Z(...) is entirely undocumented.  It is used in precisely one place, namely in the FCMNINIT module of each available version of Shuttle flight software.  In flight software OI-30.17, that usage looks like so in the assembly listing produced by the original assembler:
                                 434          EXTRN FPMXQETB,FPMXQELE,FCMALLFS                               026120BQ
.
.
.
000DE EAF3 0000 0000 438 LA R2,FPMXQETB DESTINATION - START OF XQE TABLE 026140BQ
000E0 E2FB 001E 001E 439 IAL R2,FPMXQECT*2 30 HWS TO INIT - (15 ENTRIES) 026145BQ
000E2 1BF7 00EE 01D2 00EE 440 L R3,=Z(,FPMXQETB+2,0) SOURCE ZCON ADDRESS 026150BQ
.
.
.
001D2 734 LTORG 050300BG
001D2 00020000 735 =Z(,FPMXQETB+2,0)

Since the symbol FPMXQETB is declared as an EXTRN, its address and other characteristics are unknown at assembly time, though in fact FPMXQETB is the start of something called the "XQE table", aligned at a fullword address, a fact presumably known to the human programmer if not to the assembler program.

I think that =Z(...) is used to form "ZCON" addresses.  The ZCON compiler option for the HAL/S compiler (HAL/S-FC), in the words of the "HAL/S-FC User's Manual", has the effect of "[causing] calls to out-of-line routines (external references) to be performed via long indirect address constants".  It appears to me that the term "ZCON" probably stands for "Z constant", as a contrast to "YCON" for constants formed via =Y(...), which are just 16-bit displacements.   The distinction is that LOCAL data (in the HAL/S sense) can be accessed by efficient YCON addressing, whereas REMOTE data (again in the HAL/S sense) is accessed via less-efficient ZCON addressing.

It further appears to me (see Figure 2-14 of the AP-101S POO) that a 16-bit base address fills the top halfword of such an indirect address constant, while an 11-bit displacement (padded with 0's to 16 bits) fills the bottom halfword.  My best guess is that =Z(ARG1,ARG2,ARG3) is perhaps better represented as =Z(TBD,BASE,DISPLACEMENT).

Thus in the assembly-listing excerpts shown above, FPMXQETB+2 is the BASE address, though the external symbol is simply assembled as having address 0, since its address is unknown to the assembler (later to be fixed up by the linker), causing FPMXQETB+2 to be assembled as just 2.  Similarly, the DISPLACEMENT is 0.
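
Under that guess about the layout, forming the constant is just a matter of packing two halfwords, as in the following Python sketch.  The function name and the treatment of the first (TBD) argument as simply ignored are assumptions of mine.

```python
def zcon(base, displacement):
    """Hypothetical packing of =Z(,BASE,DISPLACEMENT) into a 32-bit constant:
    16-bit base in the top halfword, 11-bit displacement in the bottom one."""
    if not 0 <= displacement < (1 << 11):
        raise ValueError("displacement does not fit in 11 bits")
    return ((base & 0xFFFF) << 16) | displacement

FPMXQETB = 0    # an EXTRN symbol assembles as address 0 until linking
print(f"{zcon(FPMXQETB + 2, 0):08X}")
```

This reproduces the 00020000 seen in the literal pool of the assembly listing.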
Aside:  I don't claim that all of the mysteries of =Z(...) have been solved by the meandering inferences above.  They do explain, more or less, how the assembler should turn =Z(,FPMXQETB+2,0) into 0002 0000.  But what happens after the object code leaves the assembler and is processed by the linker?  By process of elimination, the assembly-listing excerpts above came from flight software version OI-30.17, because those are the only contemporary assembly listings available to me.  But the associated source code is identical in flight software version OI-34.06, for which I have no contemporary assembly listing, but for which I do have a contemporary disassembly of the linked object code, thus giving us some clues about how the linker treats these =Z(...) literals.  What we find in the linked OI-34.06 is that FPMXQETB ends up at address 008B6A, and that the =Z(,FPMXQETB+2,0) constant itself becomes 8B6C 0001.  The upper halfword is precisely what we would have expected!  But what about the lower halfword?  Why has the linker turned 0000 into 0001?  That's a mystery to me.  In case you're interested, here's what the relevant portions of that contemporary disassembly of the FCMNINIT module in OI-34.06 look like:
 008B6A         FCMCBLKS+0A50                           FPMXQETB DS     0F
.
.
.
018A52-018A53 FCMNINIT+00DE EAF3 8B6A 008B6A LA R2,X'8B6A' FPMXQETB
018A54-018A55 FCMNINIT+00E0 E2FB 001E IAL R2,X'001E'
018A56-018A57 FCMNINIT+00E2 1BF7 00F6 018B4E L R3,X'00F6' =Z''
.
.
.
 018B4E-018B4F FCMNINIT+01DA 8B6C 0001 DC Z

The "=E" (single-precision floating-point) and "=D" (double-precision floating-point) datatypes also require a few additional words of explanation.  While literals (or constants) of these types are provided in AP-101S assembly-language source-code in the usual decimal notation, they are encoded into object code by the assembler in "IBM hexadecimal floating-point" format ("IBM hex").  Regarding this conversion of decimal to IBM Hex, the System/360 assembly-language manual has this to say on the subject (emphasis mine):
"The number is converted to a binary number, and scaling is performed if specified. The binary number is then rounded and assembled into the proper field, according to the specified or implied length.  The resulting number will not differ from the exact value by more than one in the last place."
In other words, the conversion algorithm used by the original assembler was not necessarily exact to the apparent full precision.  Consequently, object code containing floating-point constants as produced by ASM101S cannot be guaranteed to byte-for-byte match the floating-point constants generated by the original assembler.  This potential for inexact conversions is exacerbated by the fact that the native floating-point precision in the Python 3 language in which ASM101S is written does not precisely match the native floating-point precision of System/360; Python 3 is more accurate than System/360 in some cases, and less accurate in others, I believe.  ASM101S works around this latter problem by performing the floating-point operations for such conversions at a higher precision than normal Python floating-point operations, namely 20 significant digits rather than 16.  Empirically, the conversions by ASM101S match the conversions provided by the original assembler in all known cases, but there's no guarantee for presently-unknown cases.  Nor is the range of the two floating-point formats (in terms of powers-of-ten exponentiation) identical; ASM101S makes no attempt to work around this range mismatch, because there are no offending examples in existing legacy source code.
Aside:  Expressed in different terms, ASM101S (presumably!) does not use exactly the same algorithm as the original assembler to perform the conversion from decimal strings to IBM hex.  That's because I don't presently know what that original algorithm was.  Thus ASM101S simply uses a conversion algorithm of my own concoction.  Perhaps at some point the original algorithm may be deduced or recovered somehow.  If that happens, then it can be incorporated into ASM101S, in which case none of the concerns I've been describing will obtain.
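
To illustrate the kind of conversion being discussed, here's a Python sketch that converts a decimal string to IBM hex using exact rational arithmetic and round-to-nearest.  To be clear, this is neither the original assembler's algorithm (which is unknown to me) nor the code actually used by ASM101S; the function name and the rounding rule are assumptions for illustration.

```python
from fractions import Fraction

def to_ibm_hex(text, digits=6):
    """Encode a decimal string as IBM hexadecimal floating point, with
    `digits` hex digits of fraction (6 for =E, 14 for =D)."""
    value = Fraction(text)
    if value == 0:
        return 0
    sign = 1 if value < 0 else 0
    value = abs(value)
    exponent = 0
    while value >= 1:                   # normalize so that 1/16 <= value < 1
        value /= 16
        exponent += 1
    while value < Fraction(1, 16):
        value *= 16
        exponent -= 1
    scale = 16 ** digits
    fraction = round(value * scale)     # assumption: round to nearest
    if fraction == scale:               # rounding carried out of the fraction
        fraction //= 16
        exponent += 1
    # Layout: sign bit, 7-bit excess-64 exponent, then the hex fraction.
    return (sign << (4 * digits + 7)) | ((exponent + 64) << (4 * digits)) | fraction

print(f"{to_ibm_hex('3.141592653589793'):08X}")
print(f"{to_ibm_hex('1.5707963267948966'):08X}")
```

These two values can be compared against the constants 413243F7 (PI) and 411921FB (PI/2) that appear as hexadecimal DC constants in the runtime library's ACOS module.  The sketch makes no attempt to check that the exponent fits in its 7-bit field.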

I'd also note that while we do not have the source code of the original AP-101S assembler, the System/360 assembler presumably used the same conversion algorithm, and probably even the same source code to implement the conversion algorithm.  I'm told that the source code for one version of the System/360 assembler is online, though I don't provide it in the Virtual AGC library due (in what's probably an excess of paranoia) to copyright concerns.  Someone sufficiently immersed in System/360 assembly language might be able to abstract the conversion algorithm from examination of that source code.  I looked at the System/360 source code myself and concluded, alas, that I am not such a person.  If you want to try it yourself and give me the algorithm in easily-understood pseudo-code (or better yet, in Python 3), the System/360 assembler's source code is the file AS037F1.TXT, supposedly present somewhere in the archive at this hyperlink, though I've been unable to find it there myself.  Good luck!
We might be tempted to disregard some potential precision error in the 32nd or 64th bit as being far too small to matter.  The reason it matters is because it affects validation of the assembler in the following way:  ASM101S has the capability of performing byte-for-byte comparisons of new assemblies vs legacy assemblies (where available), and thus discrepancies in even the least-significant bit could be flagged as errors.  Our (or at least my) criterion for accepting ASM101S as valid is that these automatic comparisons detect no error.   Even if we are prepared to accept such discrepancies as being all right, they nevertheless defeat automated validation.

By the way, regarding the just-mentioned automated comparison of generated code, it should also be noted that the assembly listings only display the first 16 hexadecimal digits of binary data generated by DC pseudo-ops, whereas (of course!) the entire value is available to ASM101S.  Consequently, even if ASM101S reports a 100%-successful automated comparison, generated data not printed in the assembly listings has not been checked and may not match.

SRS-Type Instructions Versus RS-Type Instructions

AP-101S instructions are of 5 basic types, designated (by IBM) as RR, RS, SRS, RI, and SI, based on the syntax patterns of their operands and on the ways they are encoded as machine instructions.  Some of these are System/360 patterns, and some are not.  I won't bore you with the details, as you can read about them in the AP-101S Principles of Operation.  However, there is a certain difficulty with SRS- and RS-type instructions — as well as ambiguities between two different flavors of RS-type instructions, namely "extended" (AM=0) vs "indexed" (AM=1) instructions — that could in principle cause a mismatch between object code generated by ASM101S vs the original AP-101S assembler, though hopefully not any behavioral difference at runtime other than slight timing discrepancies.  This group of instructions includes, among other things, all conditional-branch instructions and their aliases.

The greatest difficulty relates to the fact that certain instruction mnemonics are used both for SRS-type instructions and RS-type instructions.  Moreover, while some of the operand patterns for them are accepted for SRS instructions and not RS instructions, thus allowing the assembler to distinguish between them, some of the operand patterns nevertheless overlap.  In case of overlap, there is no syntactic way for the assembler to distinguish between the SRS instruction and the RS instruction.  Overlap occurs for the following syntactical patterns (where R1, D2, and B2 refer to the names of fields in the encoded machine instruction):

OPCODE	R1,D2
OPCODE R1,D2(B2)

The vulnerable opcode mnemonics are:

A AE AH BC C CH D DE IAL L LA LE LH M ME MH N O S SE SH SHW ST STH TD TH X ZH
While the SRS-type and RS-type instructions are (almost) behaviorally identical, they are encoded differently as machine instructions, and in particular require different amounts of memory to do so.  SRS-type instructions are encoded as half-words (2 bytes), while RS-type instructions are encoded as full words (4 bytes).  For example, there is no syntactical way to know whether to encode the load-instruction "L 4,SWITCH" as 2 bytes or as 4 bytes.  So if ASM101S were to encode an instruction as SRS while the original assembler were to encode it as RS, or vice-versa, then not only would the binary forms of those particular instructions differ, but all of the code following that instruction in the same control section would be aligned differently.
Aside:  I say that the SRS-type instructions and RS-type instructions are "almost" behaviorally identical.  Figure 2-2 of the AP-101S Principles of Operation tells us about one difference.  When an SRS-type instruction specifies a base register (B2) equal to 3, it means to really use general-register 3, as one would expect.  Whereas when an RS-type instruction specifies base register 3, it means instead to use "no" base register.

What does it mean to use "no" base register?  Several pages later, we're told that "When B2 equals 11, base addressing is not performed. In this case, the displacement is instead used directly as the effective address."  In other words, the displacement is the number of halfwords, within the same CSECT, from the instruction following the RS instruction to the target location.

As I said, there is no syntactic way for the assembler to distinguish between these cases, but there is a non-syntactic way based on the size of the D2 sub-operand.  If D2 is in the numerical range 0-55, then the SRS instruction could potentially be used, while if D2 is 56 or greater, the RS instruction must be used.  Unfortunately, determining these displacements between two locations is quite tricky, because there might be some of these SRS/RS instructions with ambiguous sizes in between.  In fact, for forward references, there is guaranteed to be at least one such intervening instruction.  In other words, we don't know the sizes of the displacements until we know the sizes of the intervening instructions, and we don't know the sizes of the intervening instructions until we know the sizes of all of the displacements.
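
One common way out of this kind of chicken-and-egg problem is iterative "relaxation":  assume every ambiguous instruction is short, then repeatedly lengthen any whose displacement turns out not to fit, until nothing changes.  The Python sketch below shows that general technique on a deliberately simplified model (forward references only, every other instruction 1 halfword long).  It is not necessarily what ASM101S or the original assembler does, and all names in it are hypothetical.

```python
def choose_sizes(instructions, max_short=55):
    """instructions[i] is None for a fixed 1-halfword instruction, or the
    index of the target instruction for an ambiguous SRS/RS instruction.
    Returns the size of each instruction in halfwords (1=SRS, 2=RS)."""
    sizes = [1] * len(instructions)     # optimistic start: everything short
    changed = True
    while changed:                      # sizes only grow, so this terminates
        changed = False
        addresses = [0]                 # addresses[i] = location of instruction i
        for size in sizes:
            addresses.append(addresses[-1] + size)
        for i, target in enumerate(instructions):
            if target is None or sizes[i] == 2:
                continue
            # Displacement from the following instruction to the target.
            displacement = addresses[target] - addresses[i + 1]
            if not 0 <= displacement <= max_short:
                sizes[i] = 2            # doesn't fit in SRS form; use RS
                changed = True
    return sizes

# Instruction 0 just fits as SRS until instruction 1 is forced to grow.
program = [56, 58] + [None] * 56
print(choose_sizes(program)[:2])
```

In the example, lengthening the second instruction pushes the first one's target out of range, so both end up in the long form; that cascade is exactly the interdependence described above.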

As for the ambiguity between RS-type "extended" instructions and "indexed" instructions, this is a lesser problem than the SRS-vs-RS problem, because both extended and indexed varieties of the instruction assemble to a fullword, and thus the use of the wrong variation of the instruction results only in a mismatch at that exact memory location rather than a general displacement of all the succeeding memory locations.  As with the SRS-vs-RS instructions, there are cases in which the variation in syntax does allow distinguishing the two cases.  Roughly speaking, the syntax of extended vs indexed RS-type instructions is

OPCODE		R1,D2(B2)
OPCODE[@][#] R1,D2(X2,B2)

but since the @, #, and X2 are all optional syntactically, in their absence there's no way to distinguish between extended vs indexed instructions other than the numerical range allowed for the displacement D2.  As well, the latter (indexed) of the two cases allows the syntax

OPCODE		R1,D2(X2)

for an even greater potential for confusion.  Yay!  Fortunately, allowed base registers (B2) for AP-101S are only the CPU general registers 0-3, so if any of general registers 4-7 appears in such an instruction, it's clear that index register X2 is meant rather than base register B2.

Lacking any foolproof algorithm for resolving these ambiguities, as well as any knowledge of whatever algorithm the original AP-101S assembler used, ASM101S cannot be guaranteed to reproduce the choices of SRS instructions versus RS instructions made by the original assembler.  All I can say is that I've tried to craft ASM101S in such a way as to match when assembling existing Shuttle flight software.

For a clearer explanation, perhaps it's best to refer to the rules of the card game known as Fizzbin which was developed at roughly the same time.

Uninitialized Memory

Although unstated in the original documentation, I believe that the original AP-101S assembler filled uninitialized memory with the halfword pattern 0xC9FB.  This happens to be the first halfword of the AP-101S instruction SVC, which as the AP-101S POO explains (p. 9-16), "causes an interruption and a program status word switch".  (The second halfword is the operand for SVC.)  Whether the consistent use of this particular value is significant or merely a coincidence, I can't say; perhaps the intention was to use this to trap unintentional execution from uninitialized memory.

The reason I suspect that this is the case is that in surviving legacy assembly reports, 0xC9FB is inevitably at locations where there's a gap due to forcing alignment of data to particular boundaries, such as aligning fullword constants to fullword addresses.  For example, consider this excerpt from the assembly report for the ACOS module:

					     .
.
.
0005D 58E0 182 RET0 SER F0,F0 00010504
0005E DF4E 004C 0013 183 B EXIT 00010600
184 * 00010700
0005F C9FB
00060 185 DS 0F 00010800
00060 413243F7 186 PI DC X'413243F7' PI 00010900
00062 411921FB 187 PIOV2 DC X'411921FB' PI/2 00011000
.
.
.

Here, the instruction B EXIT at halfword address 0x0005E is immediately followed by the pseudo-op DS, whose purpose is to align to the next fullword address boundary (i.e., the next even halfword).  In the absence of realignment, though, the next halfword address after the branch instruction would have been 0x0005F (which is an odd halfword address).  Therefore, the assembler inserts a halfword at 0x0005F, so that DS can appear instead at 0x00060 (which is now an even halfword address).  The value of that halfword is 0xC9FB.
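
The padding behavior in this example could be sketched in Python as follows, modeling memory as a list of halfwords.  The function name is hypothetical, and the sketch covers only fullword alignment.

```python
FILL = 0xC9FB   # halfword pattern seen in the gaps of legacy listings

def align_fullword(memory):
    """Pad a list of halfwords so that the next halfword address is even;
    returns that next address."""
    if len(memory) % 2:
        memory.append(FILL)
    return len(memory)

memory = [0x0000] * 0x5F    # placeholder for code occupying 0x00000-0x0005E
address = align_fullword(memory)
print(f"{address:05X} {memory[0x5F]:04X}")
```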

@# Syntax for Second Operands

AP-101S instructions of type RS (such as LH, STH, etc.) accept an operand syntax of which the most-general form is:

OPCODE	R1,D2(X2,B2)

where R1 represents a general-purpose register designated as "operand #1", D2 represents a "displacement" (which may take the form of a number or a program label), X2 represents a general-purpose register (designated as the "index register"), and B2 represents yet another general-purpose register (designated as the "base register").

That said, there is a bewildering number of exceptions, restrictions, and rules for interpreting these different elements.  Additionally, each RS-type instruction can be assembled into two separate forms of machine instructions, namely the so-called "extended" and "indexed" forms, as well as sometimes into a third form, the so-called SRS form discussed a couple of sections ago.  It's not my intention to explain all of these details (I don't understand quite a few of them myself!), and most of them are explained in the AP-101S POO anyway.

But by saying that most of the syntactical variations are explained by the POO, at least in spirit, I'm also implying that some of them are not ... and I'd like to supply those missing explanations that pertain to the operation of the assembler.  In particular, most of the four syntactical elements mentioned above may be omitted under various circumstances.

Before getting to that, another thing you need to know is that many of these RS-type instructions have related instructions in which the characters "@", "#", or "@#" are suffixed to the mnemonic.  For example, I mentioned above that LH and STH are RS-type instructions, but so too are LH@, LH#, LH@#, STH@, STH#, and STH@#.  The AP-101S POO explains these variations as follows:

The omissions of syntactical elements which I'm concerned about here are those of the X2 or B2 elements, resulting in operands like

OPCODE 	R1,D2(B2)
OPCODE@ R1,D2(X2)

Insofar as assembly is concerned, the question that arises is this:  If you have an operand syntax like R1,D2(register), is register supposed to be an index register, or is it supposed to be a base register?  My inference is that it is supposed to be a base register if there is no @ or # suffix on the mnemonic, but it is supposed to be an index register for the @/#/@# forms of the mnemonic.  In the case where register is a base register and no index register is specified, the "extended" form of machine instruction is generated (and it requires no index register); whereas in the case where register is an index register and no base register is specified, the "indexed" form of machine instruction is generated and the base register defaults to general-purpose register 0.
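As a rough illustration of the inference just described, here is a small Python sketch.  The function name classify_single_register and the dictionary it returns are hypothetical and are not taken from ASM101S; the sketch covers only the single-register operand form R1,D2(register) and ignores the SRS form and every other restriction.

```python
# Hypothetical sketch of the inferred rule, not ASM101S's actual code.
def classify_single_register(mnemonic, register):
    """Interpret the lone register in an operand like R1,D2(register)."""
    if mnemonic.endswith(("@", "#")):  # covers the @, #, and @# suffixes
        # Suffixed mnemonic: the register is an index register, the
        # "indexed" form is generated, and the base register defaults to 0.
        return {"form": "indexed", "X2": register, "B2": 0}
    # Plain mnemonic: the register is a base register and the "extended"
    # form is generated, which needs no index register.
    return {"form": "extended", "X2": None, "B2": register}

print(classify_single_register("LH", 3))
print(classify_single_register("STH@#", 5))
```

So LH R1,D2(3) would treat register 3 as the base register, whereas STH@# R1,D2(5) would treat register 5 as the index register with base register 0.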

Fixed-Point Data

By "fixed-point" data, I'm referring to data specified in operands of DC pseudo-ops, à la

DC	F'12.345'	(fullword fixed-point)
DC	H'-6.12'	(halfword fixed-point)

Here we have a case in which the usage in AP-101S assembly language, per the AP-101S POO, is entirely at odds with the usage in System/360 BAL, per the System/360 assembly-language manual.  In System/360, the non-integer portions of these specifications are discarded (although optional "exponent factors" and "scale factors" can be used to apply binary or decimal shifts to the data before doing so, thus preserving as much significance in the fractional part as may be desired).

For AP-101S, on the other hand, there are two very different cases.  First, if the data does not contain a decimal point or exponent, then what is generated for it in memory is simply the 4-byte or 2-byte 2's-complement representation of the integer value.

Second, if the data does contain a decimal point or exponent, then the full value is entirely fractional:  i.e., >-1.0 and <1.0.  The constant is assembled to a binary value basically by multiplying by 2^31 (in the case of F'...') or by 2^15 (in the case of H'...'), discarding the fractional portion, and representing the remaining integer in 2's-complement form.

For example,

DC	H'0.625'

generates 0.625×2^15 = 20480 = 0x5000.

It's TBD what should happen if there is a non-zero integer portion, but ASM101S simply caps the generated value to the boundaries of the representable range, which essentially means that any integer portion is transparently dropped.
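The conversion described in this section can be sketched as follows.  This is a minimal illustration rather than the code ASM101S actually uses:  the function name dc_fixed_point is hypothetical, the exponent syntax is assumed to be whatever Python's Decimal accepts (e.g. 1E-1), "discarding the fractional portion" is assumed to mean truncation toward zero, and applying the cap to plain integers as well is also an assumption.

```python
from decimal import Decimal

# Hypothetical sketch, not ASM101S's actual code.
def dc_fixed_point(text, bits):
    """Value of DC F'text' (bits=32) or DC H'text' (bits=16), returned
    as an unsigned integer holding the 2's-complement bit pattern."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    if "." in text or "E" in text.upper():
        # Fractional case: scale by 2^(bits-1), then discard the fraction.
        # Assumption: int() truncation toward zero is the intended rounding.
        value = int(Decimal(text) * (1 << (bits - 1)))
    else:
        # Integer case: the value is used as-is.
        value = int(text)
    # Cap to the representable range (assumed to apply in both cases).
    value = max(lowest, min(highest, value))
    return value & ((1 << bits) - 1)

print(hex(dc_fixed_point("0.625", 16)))  # prints 0x5000
print(hex(dc_fixed_point("-6.12", 16)))  # capped at -2^15: prints 0x8000
```

Decimal is used rather than float so that the scaling is exact for decimal input.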

"Linking" By the Assembler Itself

The AP-101S assembler performed a kind of partial linking of the object code, prior to any linking of separately-compiled modules by the AP-101 link editor.  See the discussion on the HALLINK101S page.

Structure of ASM101S, for Maintainers

This section won't be of interest to anybody who merely wishes to use ASM101S.  But given that I may not have been able to provide support in ASM101S for every feature of AP-101 assembly language that might be discovered if additional legacy code becomes available, ASM101S may need to be maintained in the future.  So it may be worthwhile to provide at least a few notes on how ASM101S is structured internally, in order to facilitate that possible maintenance.

The Python 3 source code for ASM101S is kept in the ASM101S/ folder of the Virtual AGC software tree, and the top-level source-code file is itself called ASM101S.py.  There are also a number of additional Python files in that folder which are imported as modules into ASM101S.py or into each other.  I should note that there are other Python files in that folder that I find useful, but that are not used as modules in ASM101S.

Assembly proceeds in a sequence of "passes", some of which are designated as passes in the source code, and some of which are not.  Here's a brief runthrough of the passes:

Some relevant Python modules of interest:

Afterthought:  HLASM

Belatedly, someone pointed out a presentation from 2010 called "Assembler Language as a Higher Level Language: Conditional Assembly and Macro Techniques" by John R. Ehrman of IBM as a possible resource for deciphering conundrums about gaps in the System/360 assembly-language manual vs AP-101S.  Admittedly, John's presentation concerns the so-called "High Level Assembler for z/OS, z/VM, and z/VSE", apparently still being sold for IBM mainframes under the name "HLASM".  Not being steeped in IBM lore myself, I can only make assumptions as to how relevant HLASM may be to the AP-101S assembler.  There's a lot of HLASM stuff that's definitely not available (or at least never used if it is available) in AP-101S source code, though it's unclear how much of it is flatly inconsistent with AP-101S.  IBM itself provides a manual for HLASM that you can find online.

In brief, I'll say only that this presentation supports some reverse-engineering assertions I've made earlier on this page, while contradicting some facts about AP-101S code.  Thus you may or may not find information from HLASM helpful.



This page is available under the Creative Commons No Rights Reserved License
Last modified by Ronald Burkey on 2024-11-29

Virtual AGC is hosted by ibiblio.org