Virtual AGC Colossus Page

Source Code and Binary

You can obtain all of the software mentioned above as being in our collection by installing Virtual AGC on your computer. The files are contained within a subdirectory named after the software version (such as "Colossus249" or "Artemis072"). The more important files supplied are these:

Filename.agc	Source code for major subdivisions of the Colossus program.
MAIN.agc	Organizer which treats all of the other assembly-language files (*.agc) as include-files, to form the complete program.
Filename.binsource	Human-readable form of the binary executable as an octal listing.
Filename.bin	Binary executable created from binsource (octal listing) file.

In other words, to create a Colossus binary executable rather than using the one provided with yaAGC (such as Colossus249.bin), one simply needs to assemble the file MAIN.agc. Typically, if all files remain organized the way they are in the yaAGC distribution tarball, the sequence of steps for doing so (from a command-line prompt) would be something like this:

cd Colossus249
../yaYUL/yaYUL MAIN.agc >Colossus249.lst

The binary so produced, MAIN.agc.bin, should be byte-for-byte identical to the binary (Colossus249.bin) provided with the yaAGC distribution. Therefore, the following Linux command should reveal no differences between the two

diff MAIN.agc.bin Colossus249.bin

(Replace "diff" with "fc /b" in Windows.)

Technically speaking....

A point which may not be completely appreciated is that Colossus249.bin was not created from the assembly-language source files. Therefore, the byte-for-byte equivalence mentioned above actually has some significance. In fact, both the assembly-language source code and Colossus249.bin (or Colossus249.binsource) come from separate readings of the original Colossus assembly listing scan, so their equivalence provides an important check on validity. (See below.) The file Colossus249.bin was created from the human-readable/editable ASCII file Colossus249.binsource by means of the program Oct2Bin, with the following steps:

cd Colossus249
../Luminary131/Oct2Bin <Colossus249.binsource
mv Oct2Bin.bin Colossus249.bin

Admittedly, few people are likely to perform any processing of this kind unless contributing a new version of the Colossus code to the Virtual AGC project.

Validation

I generally put a lot of effort into insuring that the executable binaries being run in the simulated AGC are byte-for-byte (or in the case of the AGC, word-for-word) identical to the original binaries used in the spacecraft. Most people won't be interested in this validation effort at all—just as we're not usually interested in how food gets from plants and animals onto the plates in front of us as long as we get our dinners on time; we're happy to trust others to do the work for us—but for those few of you who wonder if the simulated AGC really is running the correct code, the following sections provide explanations. The different program versions are presented in the order in which they were worked on.

Validity of the Colossus 249 Code (Apollo 9)

The following description of validation was written prior to having the Colossus249 software source code in machine-readable form, and therefore prior to being able to the core-rope image against the source code as described above. Not only that, but there were seven errors in the core-rope not detected by the procedure described below!

So the description is now obsolete. But I like it, and it's my website, so it stays. (Actually, it continues to have value, since we don't yet have all versions of the AGC source, and since it describes how to proceed in the case of recovering poor-quality assembly listings.)

A Tour of the Source Materials

Let's look at a pages from the source materials (actually, a greatly improved version) I've used in reconstructing Colossus 1A (build 249). In appearance, what we have is a scan of a photocopy of an assembly listing of the program. Significant portions of the assembly listing have been corrupted (or even destroyed) after scanning, seemingly by processing with the optical-character-recognition capabilities of Adobe Acrobat. (The source materials used for Luminary 1C, build 131, are generally similar, except that the original photocopy seems to be of better quality, the scans seem to have been performed with much more care, and no destructive post-processing has been performed. On the other hand, the original Luminary printout seems to have been on computer paper with green and white color bars, and the remnants of the green bars make portions of the listing very difficult to read with certainty. Most of what I have to say will apply to Luminary as well as to Colossus.) With some luck, I'll be able to find (or will be given) better source materials at some point in the future, but I've had no luck so far in obtaining such materials and therefore have had to make do with what's available.

An assembly listing is a type of report created when a program's source code is processed ("assembled"), in this case by the assembler program called YUL. (Actually, by GAP, the successor to YUL, but when I say "YUL" I really mean "either YUL or GAP".) Such an assembly listing contains not only a listing of the source code, but also other helpful information such as the octal values which the source code assembles to. The octal codes are what is needed to run the program, while the source code is what is needed to modify the program. The general layout of the Colossus (or Luminary) assembly listing has a series of sections, of which some of the more interesting are:

A listing of the source code, side by side with the octal codes created from the source code, side by side with the addresses at which the octal values are placed. Click here to see a typical page (without OCR). The source code is on the right-hand side of the page, whereas the octal values, addresses, and other miscellaneous notations are on the left-hand side of the page. This section occupies about 1500 pages of the assembly listing.
A symbol table, giving the addresses at which all of the symbols created by the program are defined.
An octal listing, which is the same as the octal values shown earlier side by side with the source code, but now rearranged so that the values are arranged by memory address. Click here to see a typical page (without OCR). This section occupies about 150 pages of the source listing.
An octal-to-page cross reference, in which you can find the page of the assembly listing at which the octal value for any given address is defined.

Proofing in the Usual Case

From the description above, you will have noted that source code is listed once within the assembly listing, but that octal values created from the source code are listed twice, and are therefore redundant. Furthermore, knowing the source code, one can recreate the octal values from it. (The reverse is not as easy.) Finally, the octal listing is much shorter than the source-code listing. Therefore, it shouldn't be surprising that it is much easier to create a file containing a valid octal listing of the program than it is to create a file containing a valid source-code listing.

It is also much easier to know that an octal listing is valid than it is to know that a source-code listing is valid. This is true because the YUL assembler has added "checksums" to the octal values, which can be used to verify (or at least give a high degree of confidence) that the octal values are correct. The way this works is that the so-called "fixed memory" of the AGC has been divided into 36 banks, numbered 0-43 in octal notation, each of which has a 15-bit checksum created by adding up all of the 15-bit words within the bank. (Actually, the rule for computing the checksum is more complicated, but that's the basic idea.) Therefore, if we compute the checksum of a bank and see that it is the same as the checksum provided by YUL, we can be fairly confident that the contents of the bank are correct. (I should mention that I've provided a utility called Oct2Bin which is helpful in this connection. Octal codes can be placed in a simple ASCII file, and then processed with Oct2Bin to create a file containing the actual octal/binary codes and to verify all bank checksums.)

Another helpful touch added by YUL is that each octal value is accompanied by a parity bit, contrived to make the number of 1-bits within the octal value (including the parity bit) an odd number. This is useful in detecting single digit errors of some kinds. (The digits 0, 3, 5, and 6 have even parity, while the digits 1, 2, 4, and 7 have odd parity. Therefore, examination of the parity bit could detect substitution of a '2' for a '3', for example, but not substitution of a '5' for a '6'.)

In almost all cases, the following simple procedure suffices to produce a valid octal listing:

Manually proof-read a memory bank of the octal listing against the original assembly listing, using the redundancy of the octal values or manually assembling source code as necessary to account for corrupted areas of the assembly listing.
Compute and verify the checksum.
On failure, go back to step 1. On success do the next memory bank.

Given a valid octal listing, determining the validity of a source-code listing is rather trivial: simply run the source code through the yaYUL assembler, and verify that the same octal listing is produced.

This procedure in fact suffices for the entire Luminary octal listing, and for all memory banks of Colossus except banks 35 and 36 (octal). If you want to learn more about proofing Colossus memory banks 35 and 36, read onward. Otherwise, quit now!

Postscript: After writing all of the material below, I was fortunate enough to receive some much-improved page scans of relevant portions of bank 36 from Mr. Gary Neff. Therefore, bank 36 is now capable of being proofed by the simple method explained above. However, it was previously proofed using the much more complex methods explained below, and therefore I haven't bothered to change the explanation that follows.

Proofing in Extraordinary Cases

The simple proofing method outlined above fails when the scan of the assembly listing for some memory bank is so corrupted that the bank checksum or some value(s) within the memory bank are completely unknown. Sadly, this condition occurs within Colossus 1C (build 249) banks 35 and 36. Click here to see the culprit page of the octal listing. This corrupted page from the octal listing actually wads together the end of bank 35 and the beginning of bank 36 in such a way that there is no chance of figuring out any but a small part of them. In this case, one has no choice but to use the redundant listing of the octal values that appears side by side with the source code.

Unfortunately, though, the redundant octal values in the source code listing fail us here, though in different ways for bank 35 and bank 36. For bank 35, we encounter the problem that the bank checksum appears only within the octal listing and is not redundant. Therefore, we can theoretically know all of the octal values in the memory bank—if our proofing is 100% accurate—but cannot use the bank checksum to give us the degree of confidence we would like. For bank 36, the checksum is intact but some of the octal values cannot be directly determined because of corruption within the source-code listing. (In other words, both the octal listing and the source-code listing are corrupted in the corresponding place.) As far as proofing is concerned, however, the conclusion is the same: We need some method of proofing which gives us a high level of confidence in the result, in the absence of a checksum.

Here is the alternate method of proofing used in this cases, which seems to offer a fairly high level of confidence:

Note first that each memory bank actually occupies 4 pages within the octal listing. For this method, we proof each memory page separately from the others.
Manually proof-read the memory page, using either the octal listing or the source-code listing as desired or appropriate. While doing so, record the number of digits which have been corrected.
Repeat step 2 until zero corrections have been made in two successive proofings.
To the extent feasible, one of the 0-error proofings should use the octal listing, and the other should use the source-code listing.

The theory behind this procedure is that in any given proofing a certain percentage of errors is corrected, and that by monitoring the number of errors that have been corrected we can get a good sense of the number of errors remaining. To take a "typical" case, suppose that on the first proofing 100 errors are detected and corrected, on the second proofing 1 error is detected and corrected, and on the third and fourth proofings 0 errors are detected. It would be reasonable to suppose that each proofing pass detects 99% of errors, so the expected number of errors remaining after the 4th pass is 100*0.01*0.01*0.01*0.01, or about a millionth of an error. Of course, this reasoning assumes that our ability to detect and correct different errors has some quality of statistical independence, and that errors are not remaining undetected through some systematic problem. We try to overcome that objection by making sure that we use both the octal listing and the source-code listing in the proofing process, but how successful we have been is always debatable.

As far as proofing is concerned, this technique suffices for Colossus banks 35 and 36, but does not actually give us a 100% knowledge of the octal values within the banks. For bank 35, we can simply construct a checksum and be done with it. But for bank 36, we need to do a little more work. If you want to know more about that, read onward!

When All Else Fails—Digital Archaeology

In a case like bank 36, the technique above may give us confidence that all octal values that are legible in the page images are keyed accurately, but the proofing technique can't give us any insight into the octal values that are simply missing from both the source-code listing and the octal listing. One trick we can try is to compare the Colossus and Luminary source-code, which actually have a high degree of overlap in general. Unfortunately, this is not a case in which there is any such overlap, so the comparison is of no help.

Click here (p. 846) and here (p. 847) to see the pages of the source-code listing that are relevant to the problems in bank 36, namely:

Address 36,2634: This source-code line is simply missing. We have no clue as to what may have been in it, except that it is seemingly the first part of a double-precision constant whose second part is the word 00000 octal.
Address 36,2734: Appears to be interpretive code reading "something1 something2", but only a couple of pixel rows are present, and so neither the octal code nor the assembly-language code can be read.
Address 36,2742: Highly corrupted, but may have the octal code 00021 and may have interpretive source code reading "16D".
Address 36,2747: Pretty corrupted, but seems likely to be the interpretive source code "CALL" with octal code 77624.

As far as points 3 and 4 are concerned, I think it's pretty easy to convince ourselves that the proposed interpretations are probably correct. If you want a challenge, you may like to try solving problems 1 and 2 for yourself, before continuing to read my proposed solution below.

Point 2 is somewhat trickier than problems 3 and 4, because the corruption of the page-image is so much greater. But with some effort, we can still figure it out. Within the offending page of the source-code listing, notice the notations "REF" and "LAST". These notations appear only on source-code lines referencing variables. "REF" gives the number of times the variable has been referred to so far in the assembly listing, which "LAST" gives the page number on which the preceeding reference occurred. The problematic source line clearly contains REF and LAST notations (even though we cannot read them), and so it is clear that something2 is the name of a variable. You should be able to convince yourself fairly easily that the variable in question is RCON. In other words, something2="RCON". (In case it's still not obvious how you would convince yourself of this, note that "REF 6" to RCON is below the missing source line, while "REF 4" is on the preceding page, but that "REF 5" is nowhere to be found.) Next, because this section of code is AGC interpreter language (rather than assembly language), we can narrow down the choices for something1; theoretically, the only possibilities are STORE, STODL, or STOVL. Simply in terms of appearances, it seems likely that something1=STODL. Unfortunately, the interpreter documentation (Users Guide to the Block II AGC/LGC Interpreter, by Charles A. Muntz) was last updated in early 1965, and contains an incorrect numerical opcode for STODL. Not to be deterred, though, if we look at the other source-code lines near the missing line, we see intact lines like "STODL RPRE," (the comma after "RPRE", by the way, is actually part of the variable name), from which we can deduce that the octal code for STODL is 16000. Noting that RCON has the address 1635, therefore, "STODL RCON" must assemble to the octal code 16000+1635+1=17636. But—and why, I'd like to know, does there always seem to be another "but"?—the missing line of code is preceded by a line of source code reading simply "STADR". The STADR opcode has the odd effect of complementing the code that follows it. Therefore—finally!—the missing octal code for address 36,2734 is probably 60141. This closely matches the remaining pixel fragments (and expected parity bit) of the missing line, and so we can feel good about it.

Having now determined all of the octal values within the bank except at address 36,2634, and noting the checksum for bank 36 is known, we can consider the possibility of recreating the octal value at 36,2634 simply by putting a value there that correctly reproduces the bank's checksum. This is actually trickier than you may suppose, since the checksum is unique only up to a sign. (In other words, we don't know whether to insert a value that produces the positive checksum or one that produces the negative checksum.) Often, either the positive or the negative checksum will be impossible to produce, thus removing the uncertainty. Where it is possible to produce both the positive and the negative checksum, we note that 90% of the memory banks have positive checksums, so we have a much better chance of being correct if we choose the positive checksum. All of this turns out to be an anti-climax, however: Prior to any of the reasoning listed above, our inclination was simply to put 00000 at address 36,2634. Remarkably, this produces the correct bank checksum without any additional work on our part! Perhaps this is the first time in history that a gut reaction has been confirmed by checksum.

Happy Epilogue!

After the all of the intricate reasoning above was worked out—and the description above written—Mr. Gary Neff was able to send me some terrific replacement scans of the problematic assembly-listing pages from bank 36, namely pp. 846-847. If you examine locations 36,2634, 36, 2734, 36,2742, and 36,2747 on the new scan, you'll see that they indeed contain exactly the values deduced above! I still think the exploration I went through before receiving the new scans is a valuable illustration of technique, not to mention being soothing to my ego, so I'll leave it in place for future generations to enjoy. :-)

Validity of the Comanche 055 Code (Apollo 11)

The Comanche 055 page images became available after the Colossus 249 and Luminary 131page images had already been converted to source-code files, and prior to any other missions becoming available. The conversion technique was very abbreviated compared to that of Colossus 249, as follows:

A small corps of volunteers—thanks (in alphabetical order) Fabrizio Bernardini, Hartmuth Gutshe, Onno Hommes, Jim Lawton, and Sergio Navarro!—took the existing Colossus 249 source-code files, or on occasion the Luminary 131 source-code files, and laboriously compared them line-by-line to the Comanche 055 assembly-listing page images, porting the differences they found.
The resulting Comanche 055 source code was assembled using yaYUL to produce a binary executable, which was horribly wrong at this point (with dozens of conversion errors), but was used to create a "binsource" file (which is an octal listing of the entire program).
The binsource file was laboriously proofed against the octal listing in the page images (this time by me), and corrected. As a double-check, the bank checksums are all correct.
Finally, at every point where the binary created from the source code by yaYUL differed from the binsource file, the source code was compared to the page images and corrected. After all corrections were made, the binary created by yaYUL was identical to the binary created from the binsource file.

The binary thus produced by yaYUL is supplied in the source tree and used for regression testing.

Validity of Artemis 072 Code (Apollo 15-17)

The Artemis 072 executable (Artemis072.bin) was created by manually entering binary data from scans of an Artemis 072 assembly listing. The data was proofed and reproofed until the bank checksums, as described above, were correct, and the binary was made available to the public on 2006-01-10. On 2010-02-20 (more than 4 years later), source code was finally available and assembled. A single error which did not affect the memory-bank checksum was found and corrected in the binary. So we are now confident that what is being provided is both complete and correct.

This is an interesting object lesson. My previous notes in this section, prior to the availability of source code, made it clear that I wouldn't be surprised if additional pairs or even triplets of errors were found in the binary ... but the notion that an unpaired error might still exist which did not affect the checksum never occurred to me. The problem is a weakness in the checksumming algorithm used in the AGC.

If the AGC had used a "normal" checksumming algorithm (from the modern point of view) with 2's-complement arithmetic, any single error in data would have caused the checksum to change, and the error to be detected.

But in the 1's-complement arithmetic of the AGC, simple checksumming does not have the same desirable property. For example, since there is both a +0 and a -0, which produce the same result in an addition, a memory error in which +0 is replaced by -0 (or vice versa) produces the same checksum and would be undetectable. But algorithmically the problem is more complicated than that, since a simple checksum algorithm was not used. The actual algorithm used on a data-word by data-word basis was this:

The next data word is added to the running checksum.
If there is positive overflow, the result is reduced by 040000 (considered as a positive number, thus preserving the sign of the result) and then incremented.
if there is negative overflow, the result is incremented by 040000 (again, considered as a positive number, thus preserving the sign of the result), and then decremented.

For example, if the running checksum was 030000 (a positive number), and the next data word was 020000 (another positive number), then adding them would give 050000 (arithmetic overflow!), which would be decremented (due to the overflow) by 040000 to give 010000, and then incremented by 1 to give a running checksum of 010001.

If you are amused by such stuff, I invite you to consider the ways in which such an algorithm might fail to detect error. In the case of the Artemis 72 binary, the now-corrected error was a single bit (the most significant bit) of a single memory location.

Validity of Colossus 237 (Apollo 8)

The Colossus 237 program listing presented a special challenge, because the only physical copy we've had access to is truncated. It contains all of the source code, as well as the binary words associated with the source code on a line-by-line basis. But it is missing the octal listing that condenses all of the binaries into an easily accessible form. Because the octal listing is missing, the "bugger words" that are used to make the memory-bank checksums compute correctly are also missing. Thus, important redundancies and checksums that normally would be used to provide confidence of correctness are missing.

Therefore, the following unique validation procedure was applied instead:

A "binsource" file was created by laboriously stepping through the page images of the program listing, line-by-line, and observing the listed binary words. The "bugger words" are not available in this fashion, and are therefore left at zero.
Since Colossus 237 and Colossus 249 are separated in time by only a few months, the binsource files of Colossus 237 and Colossus 249 — i.e., the binary files we create manually, independently of the source listings, as part of the data-entry process — were compared using automated tools ('diff' or similar). Where differences existed, the page images of the Colossus 237 program listing were referenced to verify (to the extent possible) that the differences were real rather than being transcription errors.
Where complete memory banks were found to be identical by the automated comparison, the bugger words were copied from Colossus 249 to Colossus 237.
Source-code files were created from the page images of the program listing.
The binary of the assembled source-code files was compared in an automated fashion against the binary created from the binsource file, and either the binsource file or the source code (or both) were corrected when any discrepancies were found. At this stage of the process, validation process, it is very typical for "superbank" errors to show up; these are errors not in the source code per se, but are due to discrepancies between yaYUL's assumptions about the superbank memory settings in comparison to the assumptions of the original YUL or GAP assemblers; thus, those superbank problems were corrected as well.
For any banks remaining without bugger words in the binsource file, the bugger words created by yaYUL were added.

Thus while we can't have as much confidence in the validity of the Colossus 237 transcription, in comparison to almost all of the other Colossus and Luminary transcriptions, the code has nevertheless been double-checked both manually (visually) and in an automated fashion as well.

Validity of Solarium 055 (Apollo 6)

Solarium 055 presented a big challenge because it is the first Block I code processed, and no Block I support was previously provided by the yaYUL assembler. However, support for Block 1 was in fact added to yaYUL, and after that validation of Solarium 055 differed little from any other AGC program. Unfortunately there is no way as of yet to run Solarium in yaAGC, so we have not had the additional pleasure (and assurance!) of witnessing it run.

Contents

What is "Colossus"?

It's Always Sunny in Outer Space

Available and Particularly-Desired Command Module Software Versions

AGC Self-Check

Orbital Integration

Other Tests