(HALLINK101S does not exist at present.
This page collects my preliminary thoughts
 about it, but particularly about object files.)

Table of Contents

Introduction

HALLINK101S is our "modern" version of the link editor used to combine IBM AP-101 object files created by the HAL/S language compiler and AP-101S assembler into "load files" which can be directly executed on an IBM AP-101 computer or emulator thereof.  To the best of my knowledge, the original linker actually used for Shuttle software development has not survived, so HALLINK101S is an entirely-new creation.  It is a Python 3 program that should work on any computer having the Python 3 language installed.

Regarding the IBM AP-101B vs AP-101S computers, the AP-101B was used in earlier Shuttle flights, while the AP-101S was used in later flights.  HALLINK101S is completely agnostic on the question of AP-101B vs AP-101S.

I am not aware of surviving documentation describing the format of AP-101S object files specifically, nor of any surviving AP-101 object-code files from the Shuttle era.  In any case, I think that such object code would have been produced not as "files" in a mass-storage system, but rather as decks of computer punch-cards punched directly by the original HAL/S compiler (HAL/S-FC) or AP-101 assembler.  So it is possible that the decks of punch cards could still exist, hidden away somewhere, even if not as online files.  In any case, I have no access to them, and have heard nary a hint of their existence, and I could be mistaken in my inferences about them anyway.

The original linker is referred to in the contemporary documentation simply as "the AP-101 link editor", and the contemporary documentation goes on to say that "until the final version of the linkage editor is available to support all features of the HAL/S-FC compiler, an interim program, known as HALLINK-FC, will be supplied to perform the link edit functions".  As usual, satisfactory documentation for neither "the AP-101 link editor" nor for HALLINK-FC (nor for its System/360-target predecessor HALLINK) is available to us.  Indeed, I do not even know if the "the AP-101 link editor" was ever completed satisfactorily to supersede HALLINK-FC. My intuition, based on nothing concrete, is that it was not, and that HALLINK-FC was used until the end of the Shuttle project.  But since we have access to neither of those linkers, any intuition of mine about them is without value.

On the other hand, we do have a modern port (from the XPL programming language into the Python programming language) of the original HAL/S compiler.  That being the case, we can use the compiler to convert HAL/S source code into AP-101S object files, and in principle we might be able to do whatever reverse engineering is required on those object files to clarify issues regarding the object-file format.  In attempting that kind of reverse engineering, it quickly becomes apparent that the AP-101S object-file format is based upon the format of early object files for the IBM System/360, if not strictly identical to it.  In other words, we can use the early System/360 format as our guide, and occasionally depart from it as needed on an ad hoc basis.  Some relevant references are listed in the next section.
Aside:  It appears to me that in IBM parlance, what I call an "object file" would be a referred to as an "object deck", presumably because it was originally directly punched onto a stack of computer punch-cards.  I'm not telling you this just to be pedantic — though obviously, that's a thing I do — but rather to let you know that if you google it yourself, you're likely to have much better luck searching for "decks" instead of "files".

As usual, not all of the System/360 features are needed for actual legacy AP-101S software, so only those features of the object-file format relevant to that surviving software are likely to be implemented in HALLINK101S; contrariwise, some non-System/360 AP-101S-specific features may appear in the object files.  For example, "records" in an object file are described as being of 6 types, known as ESD, TXT, RLD, SYM, XSD, and END, but the XSD type wasn't introduced until the 1990's in order to deal with the needs of C compilers, and hence XSD records don't appear at all in HAL/S or AP-101S object files.  But I won't trouble myself or bore you with an exhaustive list of such differences here. 

To facilitate whatever reverse engineering is needed, I've provided a Python 3 program that can parse the object files produced by the HAL/S compiler (or by ASM101S) and report on all of the data found within them.  This will be covered in more detail later.

References

Installation of HALLINK101S

TBD

Use of HALLINK101S

TBD

Reverse-Engineering Assistance

To facilitate reverse engineering of the AP-101 object-file format, I've provided a Python 3 module (readObject101S.py), which also can be used as a stand-alone program, to parse the object files produced by the HAL/S compiler (or by ASM101S).  For example, consider this HAL/S program 140-STATISTICS from the book Programming in HAL/S:
  STATISTICS:
PROCEDURE(DATA) ASSIGN(LO_VAL, HI_VAL, MEAN);
DECLARE DATA ARRAY(*) SCALAR;
DECLARE SCALAR,
LO_VAL, HI_VAL, MEAN;
LO_VAL = MIN(DATA);
HI_VAL = MAX(DATA);
MEAN = SUM(DATA) / SIZE(DATA);
CLOSE STATISTICS;

Upon compiling this file with the modern HAL/S compiler (HALSFC),

HALSFC 140-STATISTICS.hal "" "LIST,VARSYM,DECK"

we get not only a code-generation report (pass2.rpt), which in abridged form looks like the following,

PAGE 2
SYMBOL TYPE ID ADDR LEN(HEX) LEN(DEC) BLOCK NAME


#CSTATIS SD 0001 000000 00001D 29 STATISTICS
#ZSTATIS SD 0002 000000 000002 2
#DSTATIS SD 0003 000000 000002 2
#QEMIN ER 0004
#QEMAX ER 0005
#QESUM ER 0006
------------------------------------------------------------------------------------------------------------------------------------------------------
PAGE 3
LOC CODE EFFAD LABEL INSN OPERANDS SYMBOLIC OPERAND


0000000 ST#1 EQU *
00000 #ZSTATIS CSECT ESDID= 0002
00000 00000E00 DC A'00000E00' STATISTICS
00000 #CSTATIS CSECT ESDID= 0001
0000000 STATISTI EQU * STATISTICS
00000 E9F3 0000 LHI R1,0() TIME: 0.25; #DSTATIS
00002 B914 0005 STH R1,5(R0) TIME: 0.5
00003 E0FB 0018 IAL R0,24() TIME: 0.5
00005 EB01 0000 LA R3,0(R1) TIME: 0.25
00006 BB24 0009 STH R3,9(R0) TIME: 0.5
0000007 ST#2 EQU *
0000007 ST#3 EQU *
0000007 ST#4 EQU *
00007 9A30 000C LH R2,12(R0) TIME: 0.25; DATA
00008 1D1C 000E L R5,14(R0) TIME: 0.25; DATA+2
00009 E4F7 3800 BAL@# R4,0(R1,R3) TIME: 10.0 (SEE POO); #QEMIN
0000B 9B40 0010 LH R3,16(R0) TIME: 0.25; LO_VAL
0000C 3803 0000 STE F0,0(R3) TIME: 0.5
000000D ST#5 EQU *
0000D 9A30 000C LH R2,12(R0) TIME: 0.25; DATA
0000E 1D1C 000E L R5,14(R0) TIME: 0.25; DATA+2
0000F E4F7 3800 BAL@# R4,0(R1,R3) TIME: 10.0 (SEE POO); #QEMAX
00011 9B48 0012 LH R3,18(R0) TIME: 0.25; HI_VAL
00012 3803 0000 STE F0,0(R3) TIME: 0.5
0000013 ST#6 EQU *
00013 9A30 000C LH R2,12(R0) TIME: 0.25; DATA
00014 1D1C 000E L R5,14(R0) TIME: 0.25; DATA+2
00015 E4F7 3800 BAL@# R4,0(R1,R3) TIME: 10.0 (SEE POO); #QESUM
00017 1E1C 000E L R6,14(R0) TIME: 0.25; DATA+2
00018 3AEE CVFL F2,R6 TIME: 1.75
00019 68E2 DER F0,F2 TIME: 7.25
0001A 9B50 0014 LH R3,20(R0) TIME: 0.25; MEAN
0001B 3803 0000 STE F0,0(R3) TIME: 0.5
000001C ST#7 EQU *
000001C LBL#2 EQU *
0001C 97E8 SRET 7,R0 TIME: 17.5
00002 #DSTATIS CSECT ESDID= 0003
00002 000000 ORG *-2
00000 000002 ORG *+2
END

but also an object file (cards.bin).  Parsing that object file using the reverse-engineering program,

readObject101S.py cards.bin

we obtain something similar to the following report:

0000: 	type=SYM ident="I**20001" size=002E
0050: type=SYM ident="I**20002" size=002E
00A0: type=SYM ident="I**20003" size=0022
00F0: type=ESD ident="I**20004" size=0030 esdid=0001
symbol1: name="#CSTATIS" type=SD address=000000 length=003A AMODE24 RMODE24 RW
symbol2: name="#ZSTATIS" type=SD address=000000 length=0004 AMODE24 RMODE24 RW
symbol3: name="#DSTATIS" type=SD address=000000 length=0004 AMODE24 RMODE24 RW
0140: type=ESD ident="I**20005" size=0030 esdid=0004
symbol1: name="#QEMIN " type=ER
symbol2: name="#QEMAX " type=ER
symbol3: name="#QESUM " type=ER
0190: type=TXT ident="I**20006" offset=000000 size=0004 esdid=0002
data: 00 00 0E 00
01E0: type=TXT ident="I**20007" offset=000000 size=0004 esdid=0003
data: 00 00 00 16
0230: type=TXT ident="I**20008" offset=000000 size=0038 esdid=0001
data: E9 F3 00 00 B9 14 E0 FB 00 18 EB 01 BB 24 9A 30
1D 1C E4 F7 38 00 9B 40 38 03 9A 30 1D 1C E4 F7
38 00 9B 48 38 03 9A 30 1D 1C E4 F7 38 00 1E 1C
3A EE 68 E2 9B 50 38 03
0280: type=TXT ident="I**20009" offset=000038 size=0002 esdid=0001
data: 97 E8
02D0: type=RLD ident="I**20010" size=0028
relocation=0003 position=0001 flags=(0,0,A,1,0,0) address=000002
relocation=0004 position=0001 flags=(0,0,A,1,0,0) address=000014
relocation=0005 position=0001 flags=(0,0,A,1,0,0) address=000020
relocation=0006 position=0001 flags=(0,0,A,1,0,0) address=00002C
relocation=0001 position=0002 flags=(0,0,V,1,0,0) address=000000
0320: type=END ident="I**20011" idrType="2"
translator="HAL/SREL3 V0 24331"
processor="RSB-XCOM-I000924239"
--------------------------------------------------------------------------------
SYM-Record Summary:
CONTROL offset=000000 name="#CSTATIS"
DUMMY offset=000000 name="STACK"
DATA offset=000030 name="STACKEND" datatype=H
DUMMY offset=000000 name="HALS/FC"
DUMMY offset=000000 name="HALS/END"
CONTROL offset=000000 name="#CSTATIS"
INSTRUCTION offset=000002 name="D24331"
INSTRUCTION offset=000002 name="T2883796"
CONTROL offset=000000 name="#ZSTATIS"
DATA offset=000000 datatype=Z
CONTROL offset=000000 name="#DSTATIS"
DATA offset=000000 datatype=Z
Of course, the documentation needs to be consulted even to understand the parsed information in the report in any detail.  Nevertheless, even on a superficial reading, this report tells us that the object file contains 11 "records", of type SYM (3), ESD (2), TXT (4), RLD (1), and END (1), and reveals how various features easily visible in the HAL/S source code are represented in the object file.  You may notice that the report doesn't really provide much information about individual SYM (symbol-table) records.  That's because all SYM records need to be conjoined end-to-end, and it is only the conjoined record that contains parsable information; whereas the individual SYM records are not separately parsable.  Parsing the conjoined record is where the "SYM-Record Summary" at the end of the report comes from.

In the reverse-engineering report we see items obviously related in a general way to the HAL/S source code, though the relationship is not immediately clear in detail, such as the symbols #CSTATIS, #ZSTATIS, #DSTATIS, #QEMIN, #QEMAX, and #QESUM.  Some of the details are filled in by the "Basic HAL/S Programming" course's section "HAL/S CSECTS".  That course explains that each HAL/S "compilation unit" is automatically assigned a 6-character "generic name" by removing all underscores and truncating to 6 characters.  (It's up to the programmer to insure that all of the generic names are unique!)  Since in the example we're using the block being compiled is named STATISTICS, the generic name is just "STATIS".  The compiler then generates names for the various control sections it likes to create by prefixing various 2-character strings to the generic name.  You can see the hopefully-full list of prefixes at the hyperlink just given, but the ones relevant to our particular example are:
Aside:  A COMSUB is a HAL/S compilation unit containing data or code (such our STATISTICS PROCEDURE) accessible by other compilation units.  ZCON code and REMOTE data are "far away" in the AP-101S address space from the code accessing them, and hence are accessed by different AP-101S instruction types from "nearby" data or code. The functions EMIN, EMAX, and ESUM are AP-101S assembly-language functions from the HAL/S runtime library.

Alas, the object file is also seen to have a number of items not directly observable in the original HAL/S code but rather presumably transparently generated by the HAL/S compiler itself; examples are the symbols STACK, STACKEND, HALS/FC, HALS/END, D24324, and so on.

Some of these items (symbols STACK, STACKEND) appear in the assembly-language source code for HAL/S runtime-library files like ACOS, and thus apparently relate to common features HAL/S uses internally.  Specifically, I believe the report may be telling us that the DSECT called STACK is 0x30 (STACKEND) halfwords in length. 

Aside: Recall that because of the nature of the HAL/S language, the HAL/S compiler can determine at compile time the maximum stack required by any compilation unit, although the assembler cannot do so for assembly-language  files.

Some other non-obvious symbols (D24324, T4041882) seem by my reckoning to relate to embedded datestamps and timestamps:
Still other symbols (such as HALS/FC and HALS/END) remain to be interpreted.  Of course, I can invent explanations for anything: Perhaps the presence of symbol HALS/FC is simply the compiler's way of telling us that the compilation was performed by the HAL/S-FC compiler (as opposed to the HAL/S-360 compiler).  But then, perhaps not.  Really, I haven't a clue!  If we're lucky, time will tell.  I suppose that if reverse engineering obsolete object-file formats were easy, those formats would be included as free bonus items in boxes of breakfast cereal.

See the source code for readObject101S.py for further documentation, if any.

Unexpected Hurdles

Here are some things I find that differ between AP-101 object files and System/360 object files, or at least which are unexpected from reading the System/360 documentation.

Partial Linking By the Assembler Itself

The expected:  In System/360 assembly language, code and data are partitioned into a series of named "control sections", generically referred to as CSECTs.  It should be the case that from the assembler's point of view, "addresses" begin at 0 and increment upward in each section, and that the linker program subsequently reassigns different starting addresses to the sections generated by the assembler, while adjusting all of the addresses referenced within the various sections accordingly.

The unexpected:  However, it is clear from the surviving legacy AP-101S assembly listings that there are departures from these principles.  Consider the following excerpt of the legacy assembly listing for the SQRT function of the HAL/S-FC runtime library:
                                 18 SQRT     AMAIN INTSIC=YES                                               00001300
19+***********************************************************************
20+*
21+* PRIMARY ENTRY POINT
22+*
23+***********************************************************************
00000 24+SQRT CSECT 01-AMAIN
26 * COMPUTES SQUARE ROOT IN SINGLE PRECISION 00001400
27 INPUT F0 SCALAR SP 00001500
0000000 28+F0 EQU 0 01-INPUT
30 OUTPUT F0 SCALAR SP 00001600
32 WORK R1,R5,R6,R7,F1,F2,F3 00001700
0000001 33+R1 EQU 1 01-WORK
0000005 34+R5 EQU 5 01-WORK
0000006 35+R6 EQU 6 01-WORK
0000007 36+R7 EQU 7 01-WORK
0000001 37+F1 EQU 1 01-WORK
0000002 38+F2 EQU 2 01-WORK
0000003 39+F3 EQU 3 01-WORK
00000 E9F3 0032 0032 41 LA R1,A 00001800
0000032 42 USING A,R1 00001900
00002 7AE0 43 START LER F2,F0 00002000
00003 DE88 0026 0022 44 BNP ERROR ARGUMENT NEGATIVE OR ZERO 00002100
45 * 00002200
00004 27E8 46 LFXR R7,F0 TRANSFER TO GENERAL REGISTER 00002300
00005 76E6 47 XR R6,R6 00002400
00006 FE1C 0007 48 SLDL R6,7 Q & MANTISSA IN R7 00002500
00007 F660 0018 49 SLL R6,24 CHAR OF ANSWER - (Q+32) IN R6 00002600
00008 1FE7 50 LR R7,R7 CHECK FOR Q=1 00002700
00009 DD0C 000D 0003 51 BCF 5,GORP 00002800
52 * 00002900
0000A B0E6 0100 0100 53 AHI R6,X'0100' ADD 1 TO CHAR FOR Q=1 00003000
0000C E909 0002 54 LA R1,2(R1) FULLWORD INDEX 00003100
55 * 00003200
0000D 1DE6 56 GORP LR R5,R6 CHARACTERISTIC OF ANSWER IN R5 00003300
0000E F705 0001 57 SRA R7,1 FIXED M/2 + 'QQ' IN TOP BITS 00003400
0000F 0711 003A 58 A R7,C ADD C/2, AND ELIMINATE 'QQ' 00003500
00010 1E09 0036 59 L R6,B -B/2 OR -B/8 AT BIT 7(MANTISSA POS.) 00003600
00011 4EE7 60 DR R6,R7 (R6) = (4**(-Q))(-B/(C+M)) AT BIT 7 00003700
00012 0601 0032 61 A R6,A A OR A/4 AT BIT 7 + CHAR=32 00003800
00013 06E5 62 AR R6,R5 RESTORE CHARACTERISTIC OF ANSWER 00003900
00014 29EE 63 LFLR F1,R6 00004000
64 * 00004100
65 DROP R1 00004200
66 * 00004300
67 * TWO PASSES OF THE NEWTON-RAPHSON ITERATION 00004400
68 * 00004500
00015 68E1 69 DER F0,F1 00004600
00016 50E1 70 AER F0,F1 00004700
00017 7BF7 0013 002C 0013 71 LE F3,FHALF 00004800
00019 60E3 72 MER F0,F3 00004900
0001A 6AE0 73 DER F2,F0 00005000
74 * 00005100
0001B B6E6 FF00 FF00 75 NHI R6,X'FF00' PUT CHARACTERISTIC OF 00005200
0001D 06F7 000F 002E 000F 76 A R6,ROUND ANSWER IN ROUND 00005300
0001F 29EE 77 LFLR F1,R6 DIGIT, AND ADD 00005400
00020 50E1 78 AER F0,F1 TO INTERMEDIATE RESULT 00005500
79 * 00005600
00021 58E2 80 SER F0,F2 00005700
00022 60E3 81 MER F0,F3 00005800
SQRT -- SINGLE PRECISION SQUARE ROOT FUNCTION PAGE 3
LOC OBJECT CODE ADR1 ADR2 SOURCE STATEMENT AP101S 3.0 09.39 07/22/05
00023 50E2 82 AER F0,F2 ANSWER IN F0 00005900
83 * 00006000
84 EXIT AEXIT AND RETURN 00006100
85+*********RETURN TO CALLER**********************************************
00024 86+EXIT DS 0H 01-AEXIT
00024 9914 0005 87+ LH 1,5(0) RESTORE PROGRAM DATA BASE 01-AEXIT
00025 C7EC 88+$RET1 BCRE 7,4 RETURN TO CALLER 01-AEXIT
89+***********************************************************************
91 * 00006200
00026 DC0E 0024 0003 92 ERROR BCB 4,EXIT EXIT IF ARG=0 00006300
93 AERROR 5 ARGUMENT<0 00006400
94+*********ISSUE SEND ERROR SVC******************************************
00027 C9FB 0030 0030 95+ SVC AERROR1 ISSUE SEND ERROR SVC 01-AERRO
96+*********SEND ERROR SVC RETURNS CONTROL FOR STANDARD FIXUP*************
00029 78E8 97 LECR F0,F0 FIXUP: GET |ARG| 00006500
0002A DFA6 0002 0029 98 B START AND TRY AGAIN 00006600
99 * 00006700
0002B C9FB
0002C 40800000 100 FHALF DC E'0.5' 00006800
0002E 00000001 101 ROUND DC X'00000001' 00006900
102 * 00007000
103 ADATA 00007100
104+*********DATA CSECT****************************************************
00030 105+ LTORG 02-ERRPA
106+****************ERROR PARAMETER AREA***********************************
00030 107+#LSQRT CSECT 02-ERRPA
108+*** SQRT SENDS THE FOLLOWING ERROR 02-ERRPA
110+*** ERROR NUMBER 5 IN GROUP 4 02-ERRPA
00030 0014 112+AERROR1 DC H'20' SVC CODE FOR SEND ERROR 02-ERRPA
00031 0405 113+ DC Y(4*256+5) 8 BIT GROUP AND NUMBER 02-ERRPA
114+****************END OF ERROR PARAMETER AREA****************************
00032 115+#LSQRT CSECT 01-ADATA
00032 21AE7D00 116 A DC X'21AE7D00' 1.6815948=A + X'20' 00007200
00034 206B9F40 117 DC X'206B9F40' 0.4203987=A/4 + X'20' 00007300
00036 FF5B02F1 118 B DC X'FF5B02F1' -1.2889728=B 00007400
00038 FFD6C0BD 119 DC X'FFD6C0BD' -0.3222432=B/4 00007500
0003A 35CFC610 120 C DC X'35CFC610' 0.8408065=C/2 00007600
0003C 75CFC610 121 DC X'75CFC610' 0.8408065=C/2 + X'40' 00007700
122 ACLOSE 00007800
123+ END 01-ACLOS

As I've highlighted in red, there are two CSECTs in this excerpt, and they are called "SQRT" and "#LSQRT".  (The #LSQRT section has two separate CSECT statements, but don't be distracted by that, as it's not significant in respect to the points I want to make.  Also, don't be distracted by the lines I've highlighted in green, which I'll discuss in a moment.)

Unlike the expectation we (or at least I) would have, The #LSQRT section's addresses do not restart at 0, and instead the assembler continues to assign it addresses which continue incremented from the end of the SQRT section.  It is as if the assembler has already performed the step of partially linking these two sections of the code.  Why is this?

One might argue that while whoever coded the original assembler thought (for some unknown reason) that it was a good idea to present the assembly report in this fashion, it really doesn't make any difference:  As long as the object file produced by the assembler still represents these as independent sections, the linker can still reassign the addresses however it likes, with the result that after the final linking it may no longer be the case that the #LSQRT section immediately follows the SQRT section in memory.

Alas, that rationalization is incorrect.  Consider now the lines highlighted in green, and in particular this instruction:

LA	R1,A

For the assembler to assemble an instruction like this, in which a "base register" is not explicitly stated as an operand itself, it needs to somehow determine for itself the appropriate base register.  There are only two ways this can happen:

  1. If the operand (A in this case) is in that same code section as the instruction itself, in which case the appropriate base register is general register 3.
  2. Or else the operand is in an address block which has been made known the assembler via a previous USING pseudo-op.  The USING pseudo-op informs the assembler as to what assumption it can make about the contents of the various general registers.

In this example, though, neither of these conditions obtains.  As for condition #1, the instruction appears in the SQRT code section, but its operand (A) appears in a different section, namely #LSQRT.  As for condition #2, the only USING pseudo-op follows the instruction rather than precedes it, so there is no way for the assembler to make any assumptions about register contents.

How, then, has the assembler been able to assemble this LA instruction?  Well, perhaps there are possibilities that haven't occurred to me, but it appears to me that there's only one explanation:  The assembler really has already linked the SQRT and #LSQRT code sections, so that they are in a fixed relationship to each other, and in particular there is really no separate #LSQRT section as such in the object file output by the assembler.  I.e., #LSQRT has become just a label for an address within the SQRT section, and the actual linker program cannot change this at any later time.

Non-System/360 Datatype 0x84

To understand the comments below, it may help to refer back to the earlier example of a reverse-engineering report.

In so-called SYM records within an object file, each symbol representing "data" has an associated "datatype", consisting of a single-byte number.  For example, the datatype 0x00 represents character data (C), 0x04 represents hexadecimal data (X), 0x18 represents single-precision floating-point (E), and so on.  However, there's one datatype in AP-101S object files, 0x84, that doesn't appear in the available system/360 object-file documentation, leading us to speculate that it represents some AP-101S-only datatype not available on System/360. 

As it happens, there's discussion on our page about the modern AP-101 assembler that concerns the available datatypes.  That discussion points out a single AP-101S-specific datatype, Z, that's speculated to be unrelated to the System/360 datatype also designated as Z ("zoned decimal", 0x34).  For AP-101S, it was speculated that the Z datatype instead was something called a ZCON, which as kind of long-range address constant for items not present in nearby memory and hence addressable by only a restricted class of instructions. 

We are thus tempted to equate datatype 0x84 in SYM records with AP-101S ZCONs.  At the moment this is pure speculation that perhaps doesn't fit perfectly, but the reverse-engineering report nevertheless treats it as factual and reacts to these datatypes as follows:

Encoding of Character Data in Object Decks

In System/360 object files, all character data is encoded in EBCDIC.  This is not true for AP-101 object files. Some text is encoded instead in the character set of the Space Shuttle's Display Electronics Unit (DEU).

The DEU character set is depicted in the table below.  It is an ASCII-like character set, in the sense that almost everywhere the printable characters or control codes overlap with printable ASCII characters or control codes, the numerical encoding matches ASCII as well.

Reverse engineering reveals at least some of the cases in which DEU encoding is used in place of EBCDIC in the object files:

Classification of Textual Data
Encoding
Textual data used only at compile/assembly/link time, but not accessible to the executable AP-101 program itself at runtime.  Examples of this kind of data include (but are not limited to):
  • Names of symbols appearing in the source code or transparently generated by the compiler or assembler.
  • Names of object-file record types ("SYM", "ESD", "TXT", "REL", "END").
  • "Ident" fields (columns 72-80) of assembly-language source-code cards.
EBCDIC
Character constants appearing in HAL/S source code, such as DECLARE CHARACTER CONSTANT statements, DECLARE CHARACTER INITIAL constants, or character literals. DEU
Character data appearing in AP-101 assembly-language source code, such as DC pseudo-ops or so-called "literals" like "=C'...'".
TBD (currently, assumed EBCDIC)

Consider this HAL/S program adapted from the book Programming in HAL/S, with a few of its character-string literals highlighted in various colors:
  ROOTS: PROGRAM;
DECLARE SCALAR, A, B, C, D, ROOT1, ROOT2;
DO WHILE TRUE;
WRITE(6) ;
WRITE(6) 'Enter A, B, C (0,0,0 to quit):';
READ(5) A, B, C;
IF A = 0 AND B = 0 AND C = 0 THEN
DO;
WRITE(6) 'Quitting ...';
EXIT;
END;
D = B**2 - 4 A C;
IF D >= 0 THEN
DO;
D = D**0.5;
ROOT1 = (-B + D) / (2 A);
ROOT2 = (-B - D) / (2 A);
WRITE(6) 'Real roots of', A, 'X**2 +', B, 'X +', C,
'are:', ROOT1, ROOT2;
WRITE(6) 'Check:',
A ROOT1**2 + B ROOT1 + C,
A ROOT2**2 + B ROOT2 + C;
END;
ELSE
DO;
TEMPORARY RE, IM, RE2, IM2;
D = (-D)**0.5;
RE = -B / (2 A);
IM = D / (2 A);
WRITE(6) 'Complex roots of', A, 'X**2 +', B, 'X +',
C, 'are:', RE, '+/-', IM, 'i';
RE2 = RE**2 - IM**2; /* Square of RE2 +/- IM2 i. */
IM2 = 2 RE IM;
WRITE(6) 'Check:', A RE2 + B RE + C, '+/-',
A IM2 + B IM, 'i';
END;
END;
CLOSE ROOTS;
In comparison, here's an excerpt from the reverse-engineering report on the object-code file, with just some of the TXT records, and some portions of the TXT records that seemingly correspond to the literal strings highlighted in colors matching the HAL/S listing above:
	.
.
.
05A0: type=TXT ident="I**20019" offset=000024 size=0038 esdid=0003
data: 00 01 69 00 00 03 2B 2F 2D 00 00 10 43 6F 6D 70
6C 65 78 20 72 6F 6F 74 73 20 6F 66
00 06 43 68
65 63 6B 3A
00 04 61 72 65 3A 00 03 58 20 2B 00
00 06 58 2A 2A 32 20 2B
05F0: type=TXT ident="I**20020" offset=00005C size=0036 esdid=0003
data: 00 0D 52 65 61 6C 20 72 6F 6F 74 73 20 6F 66 00
00 0C 51 75 69 74 74 69 6E 67 20 2E 2E 2E 00 1E
45 6E 74 65 72 20 41 2C 20 42 2C 20 43 20 28 30
2C 30 2C 30 20 74

0640: type=TXT ident="I**20021" offset=000092 size=0008 esdid=0003
data: 6F 20 71 75 69 74 29 3A
.
.
.
You can verify for yourself, if you like, that the colored portions are encoded as DEU/ASCII, and hopefully take my word for it (since I wrote it!) that the reverse-engineering report generator uses EBCDIC-to-ASCII conversion to print out messages such as ident="I**20021".
Aside:  There's no easy way, as far as I know, using just information within the object decks themselves, to easily locate individual unnamed string literals within the TXT records of the object decks.  The way I did it above, was to assume that ASCII coding was used, and to look for the ASCII strings ... which I admit was a bit of a cheat, even though it doesn't invalidate the conclusions in any way.  In the next section, we'll learn enough to infer that the these strings should have been found in a CSECT called #DROOTS.  And in portions of the reverse-engineering report that I've omitted above, we would have seen that there is in fact a CSECT called #DROOTS and that it is assigned the ESDID 0003, which is the case in the excerpt above.  So indeed, everything is as it should be in this the best of all possible worlds.

PROGRAM Compilation Units

Object files generated from HAL/S source code containing a PROGRAM (i.e., the top-level compilation unit of a program, rather than a mere component of a program such as the PROCEDURE) have object files differing from the format described in the System/360 documentation and roughly covered in examples of reverse engineering above.

Such an object file begins with a block of records in the almost-System/360 format we've already described, but differs after an END record is encountered.  After an END record, we enter undocumented terra incognita, in which we can only rely upon whatever reverse engineering we can apply.

Consider this sample HAL/S source code for a PROGRAM based on one found in the book Programming in HAL/S:

  SIMPLE: PROGRAM;
DECLARE PI CONSTANT (3.14159266);
DECLARE R SCALAR;
WRITE(6) 'Input R to compute PI R**2, or -1 to quit.';
DO WHILE TRUE;
READ(5) R;
IF R < 0 THEN EXIT;
WRITE(6) 'R =', R, 'and PI R**2 =', PI R**2;
END;
CLOSE SIMPLE;

After compiling this file and parsing the object file, we find extraneous material following the END record.  If we pretend that this extraneous material consists of EBCDIC characters — which is only partially true —, it could be translated into ASCII essentially as follows:

" STACK $0SIMPLE"
"cSYM a aaaSTART I**20018"
"cESD a abSTART aaaa aa $0SIMPLEcaaa I**20019"
"cTXT aaa a abU3aa I**20020"
"cRLD a acabaaac I**20021"
"cEND aaa ab 2HAL/SREL3 V0 24329RSB-XCOM-I000924239 I**20022"

Aside:  Strictly speaking, this isn't an ASCII translation of the extraneous data, because it has a few flourishes of my own to make it more readable.  Specifically, the enclosing quotes are mine, the non-printable control characters encoded numerically as 0x00 (NUL), 0x01 (SOH), and 0x02 (STX) are represented by the lower-case characters "a", "b", and "c", respectively, and there aren't actually any newlines.  But we can get the gist of it!

We need to digress for a moment before trying to make some sense of this extraneous nonsense.  In the earlier discussion of the object-file format, I glossed over certain simple facts that I assumed you could glean from the System/360 documentation if you were actually interested.  But those facts become relevant to the discussion now, so let state them:

So superficially at least, the extraneous material at the end of the object file starts with an aberrant record that is neither 80 bytes in length nor begins with 0x02.  But after that aberrant record, the remaining extraneous material seems to follow the rules as if they comprised a completely separate object file.

Taking all of that into account, here's the kind of reverse-engineering report we actually see.  The extraneous material is highlighted in green, and I've invented a new record type (HDR, for "header") for what I've so far been calling the "aberrant" record:

0000: 	type=SYM ident="I**20001" size=002E
0050: type=SYM ident="I**20002" size=002E
00A0: type=SYM ident="I**20003" size=0029
00F0: type=ESD ident="I**20004" size=0030 esdid=0001
symbol1: name="$0SIMPLE" type=SD address=000000 length=0060 AMODE24 RMODE24 RW
symbol2: name="#ESIMPLE" type=SD address=000000 length=000C AMODE24 RMODE24 RW
symbol3: name="#DSIMPLE" type=SD address=000000 length=0052 AMODE24 RMODE24 RW
0140: type=ESD ident="I**20005" size=0030 esdid=0004
symbol1: name="@0SIMPLE" type=ER
symbol2: name="#QIOINIT" type=ER
symbol3: name="#QCOUT " type=ER
0190: type=ESD ident="I**20006" size=0020 esdid=0007
symbol1: name="#QEIN " type=ER
symbol2: name="#QEOUT " type=ER
01E0: type=TXT ident="I**20007" offset=000000 size=000C esdid=0002
data: 00 00 00 00 00 00 07 00 00 00 00 05
0230: type=TXT ident="I**20008" offset=000008 size=0004 esdid=0003
data: 00 01 00 12
0280: type=TXT ident="I**20009" offset=000000 size=0038 esdid=0001
data: E8 F3 00 00 E9 F3 00 00 B9 14 E0 FB 00 14 EB 11
BB 24 BE E8 BD E5 D0 FF 38 00 EA 4D D0 FF 38 00
BE E7 75 E5 D0 FF 38 00 EA 0B D0 FF 38 00 78 07
DD 04 DF 50 BE E8 BD E5
02D0: type=TXT ident="I**20010" offset=000038 size=0028 esdid=0001
data: D0 FF 38 00 EA 41 D0 FF 38 00 78 07 D0 FF 38 00
EA 21 D0 FF 38 00 7F 07 67 E7 67 05 78 E7 D0 FF
38 00 DF 7A C9 F9 00 00
0320: type=TXT ident="I**20011" offset=000000 size=0002 esdid=0003
data: 00 15
0370: type=TXT ident="I**20012" offset=000004 size=0004 esdid=0003
data: 41 32 43 F6
03C0: type=TXT ident="I**20013" offset=000010 size=0036 esdid=0003
data: 00 0D 61 6E 64 20 50 49 20 52 2A 2A 32 20 3D 00
00 03 52 20 3D 00 00 2A 49 6E 70 75 74 20 52 20
74 6F 20 63 6F 6D 70 75 74 65 20 50 49 20 52 2A
2A 32 2C 20 6F 72
0410: type=TXT ident="I**20014" offset=000046 size=000C esdid=0003
data: 20 2D 31 20 74 6F 20 71 75 69 74 2E
0460: type=RLD ident="I**20015" size=0038
relocation=0004 position=0001 flags=(0,0,A,1,0,0) address=000002
relocation=0003 position=0001 flags=(0,0,A,1,0,0) address=000006
relocation=0005 position=0001 flags=(0,0,A,1,0,0) address=000018
relocation=0006 position=0001 flags=(0,0,A,1,0,0) address=00001E
relocation=0005 position=0001 flags=(0,0,A,1,0,0) address=000026
relocation=0007 position=0001 flags=(0,0,A,1,0,0) address=00002C
relocation=0005 position=0001 flags=(0,0,A,1,0,0) address=00003A
04B0: type=RLD ident="I**20016" size=0030
relocation=0006 position=0001 flags=(0,0,A,1,0,0) address=000040
relocation=0008 position=0001 flags=(0,0,A,1,0,0) address=000046
relocation=0006 position=0001 flags=(0,0,A,1,0,0) address=00004C
relocation=0008 position=0001 flags=(0,0,A,1,0,0) address=000058
relocation=0001 position=0002 flags=(0,0,V,1,0,0) address=000004
relocation=0003 position=0002 flags=(0,1,A,1,0,0) address=000004
0500: type=END ident="I**20017" idrType="2"
translator="HAL/SREL3 V0 24331"
processor="RSB-XCOM-I000924239"
0550: type=HDR ident=" " length=15
text=" STACK $0SIMPLE"
055F: type=SYM ident="I**20018" size=0009
05AF: type=ESD ident="I**20019" size=0020 esdid=0001
symbol1: name="START " type=SD address=000000 length=0004 AMODE24 RMODE24 RW
symbol2: name="$0SIMPLE" type=ER
05FF: type=TXT ident="I**20020" offset=000000 size=0004 esdid=0001
data: E4 F3 00 00
064F: type=RLD ident="I**20021" size=0008
relocation=0002 position=0001 flags=(0,0,A,1,0,0) address=000002
069F: type=END ident="I**20022" entryAddress=000000 esdid=0001 idrType="2"
translator="HAL/SREL3 V0 24331"
processor="RSB-XCOM-I000924239"

--------------------------------------------------------------------------------
SYM-Record Summary:
CONTROL offset=000000 name="$0SIMPLE"
DUMMY offset=000000 name="STACK"
DATA offset=000028 name="STACKEND" datatype=H
DUMMY offset=000000 name="HALS/FC"
DUMMY offset=000000 name="HALS/END"
CONTROL offset=000000 name="$0SIMPLE"
INSTRUCTION offset=000002 name="D24331"
INSTRUCTION offset=000002 name="T2904859"
CONTROL offset=000000 name="#ESIMPLE"
DATA offset=000000 datatype=Z
CONTROL offset=000000 name="#DSIMPLE"
DATA offset=000000 datatype=Z
DATA offset=00000C name="R" datatype=E
CONTROL offset=000000 name="START"



This page is available under the Creative Commons No Rights Reserved License
Last modified by Ronald Burkey on 2024-11-29

Virtual AGC is
              hosted by ibiblio.org