Monday, June 17, 2019

Erasable (core memory RAM) of Apollo Guidance Computer repaired and working properly

New flaw in Current Switch module

After having replaced the two failed diodes in this module, we began testing it with pulses to ensure that it worked properly at the currents and timing needed to drive core memory. This module consists of a number of large ferrite cores with four sets of windings on each. One winding is tied to all the cores and is used to switch them all back to the 'off' magnetic orientation. A second winding is used to select a core by switching it to the 'on' orientation.

Certain cores are selected by flipping them on. This flip of magnetic state induces a pulse in the third winding, which causes a transistor to conduct to drive current through an X or Y addressing line in the B12 core memory module. Later, when all the cores are reset by the first winding, only the ones that had been selected will flip back. This induces a pulse in the fourth winding, turning on a transistor to drive current in the opposite direction through the X or Y addressing line.

This scheme is clever because the cores themselves retain the selected address. Selecting a core induces the pulse that reads out a word of the B12 module, and the flipped core also remembers which address lines were selected. The AGC logic does not have to retain the address in a register until later in the memory cycle when the word of erasable memory is rewritten - the selection cores themselves hold that information.
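The select/reset behavior described above can be sketched as a toy model. This is only an illustration of the logic, not the module's electrical behavior; the class name, method names and pulse labels are made up for the example.

```python
class SwitchCore:
    """Toy model of one current-switch core (hypothetical names)."""

    def __init__(self):
        self.on = False  # magnetic orientation: False = 'off', True = 'on'

    def select(self):
        """Second winding: flip the core on.

        Returns the pulse seen on the third winding, or None if the
        core was already on and therefore did not flip.
        """
        if self.on:
            return None
        self.on = True
        return "forward"

    def reset(self):
        """First (common) winding: drive the core back to off.

        Only a core that had been selected actually flips, so only it
        produces a pulse on the fourth winding.
        """
        if not self.on:
            return None
        self.on = False
        return "reverse"


# Four cores sharing one reset winding; select only core 2.
cores = [SwitchCore() for _ in range(4)]
read_pulses = [cores[2].select()]
# The common reset reaches every core, but only the selected one responds.
rewrite_pulses = [core.reset() for core in cores]
```

The list `rewrite_pulses` has an entry only at index 2, which is the sense in which the cores "remember" the address between the read and the rewrite.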

On one of the cores, we had a 4 ohm short between some module I/O pins that shouldn't be connected at all. After excavating the potting material around the suspicious area, we quickly excluded simple-to-repair causes such as shorting wires, shorted interconnect, etc. We found that the first and fourth windings were somehow shorting inside the core - these are the reset winding and the winding that produces a pulse when a previously selected core is reset.

Shorted windings in B11 Current Switch Module circuit

Careful examination showed us that it would be impossible to extract the core/windings from inside the cordwood module where they were epoxied in place. At least, impossible to remove it without severe damage to that core and its windings.

We have a suitable replacement core, but winding it with four windings of 50, 32, 32 and 20 turns would be extremely challenging given the small diameter of the core. Fortunately, Marc and Mike figured out a clever hack to give us equivalent functionality. It began by completely disconnecting the fourth winding, the one that switches on a transistor when a selected core is reset. That effectively cured the short since the shorted winding was no longer connected to anything.

Then, we installed a transistor of opposite polarity tied to the third winding, the one that is normally used only while the core is selected. When the selection pulse flips the core, the third winding sees a positive direction pulse, causing the attached NPN transistor to conduct. When the core is later reset, the pulse is in the negative direction, so the transistor doesn't turn on.

Yet, if we hook that third winding to a PNP transistor as well as to its NPN transistor, the pulses from the windings will cause one transistor to turn on with a positive going pulse, and the other transistor to turn on with the negative going pulse. This produces the same behavior as the original circuit, but requires only three windings instead of four. That is fortunate because we don't have four useful windings.
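The idea of steering one winding's pulses to two transistors by polarity can be sketched as follows. The function name and the 0.6 V turn-on threshold are assumptions for illustration, not values measured on the module.

```python
def transistors_conducting(pulse_volts, threshold=0.6):
    """Report which transistor a third-winding pulse turns on.

    pulse_volts: signed pulse amplitude (positive when the core is
                 selected, negative when it is reset).
    threshold:   assumed turn-on voltage, for illustration only.
    """
    return {
        "npn": pulse_volts > threshold,   # conducts on the positive-going pulse
        "pnp": pulse_volts < -threshold,  # conducts on the negative-going pulse
    }


select_result = transistors_conducting(+1.0)  # core flipped on
reset_result = transistors_conducting(-1.0)   # previously selected core flipped back
idle_result = transistors_conducting(0.0)     # unselected core, no flip, no pulse
```

With this arrangement the single third winding stands in for both the third and fourth windings of the original circuit.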

We inserted a small PNP transistor and did some rewiring, which gave us a module that passed all tests with flying colors. We were then ready to install the erasable memory driver modules, this current switch module, and our erasable memory (core stack) into the AGC.

PNP transistor wired to third winding of circuit

Archiving prior contents of the B12 erasable memory

We still have the flaw in the core memory stack - the inhibit wire for one of the data bits (bit 16) is an open circuit. In the read portion of a memory cycle, the selected X and Y address lines flip all cores of that word to zero. Any of the bits in that word which had a 1 stored in it will flip, causing a pulse to be detected by the sense amplifiers. This is how core memory does a read, by destructively flipping the word to zero.

During the reset pulse from the current switch module core circuits, the selected X and Y address lines, those whose current switch cores had been set on, are driven in the opposite direction, which would flip all the bits of the word to a 1 state. However, we don't want them all to be 1, as some should be 0. That is the purpose of the inhibit wire - a signal on the inhibit wire will block the reset from flipping that particular bit to 1.

Reading data involves erasing it first, then rewriting the original value (or putting a new value in if this is a write operation). With a bad inhibit wire, bit 16 will be rewritten (or written) to 1 regardless of our desired value. Thus the loss of the inhibit wire renders that bit useless throughout all 2K words of erasable memory.

If the inhibit wire damage happened after the computer was last powered down, then it hadn't yet forced a 1 into every bit 16 in the core stack. We had the opportunity to read the contents of the core memory correctly, since the read portion of a memory access doesn't use the inhibit wire. However, this is a one shot opportunity, as the process of reading any word will jam in the 1 in bit 16 during rewrite.
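A small simulation makes the one shot nature of the read concrete. It is a sketch under simplifying assumptions: each word is modeled as a 16-bit integer with bit n stored at position n-1, and the function and constant names are invented for the example.

```python
BROKEN_INHIBIT_BITS = {16}  # bit numbers whose inhibit wire is open (assumed model)


def bit_mask(bit):
    """Mask for a 1-based bit number."""
    return 1 << (bit - 1)


def memory_cycle(stack, address, new_value=None):
    """One read-then-rewrite cycle on a toy core stack (list of ints).

    Returns the value sensed during the read portion of the cycle.
    """
    sensed = stack[address]  # destructive read: the sense amps see the 1s
    stack[address] = 0       # every core of the word is now 0
    wanted = sensed if new_value is None else new_value

    written = 0
    for bit in range(1, 17):
        # Inhibit current is applied where a 0 is wanted, but it has
        # no effect on a bit whose inhibit wire is open.
        inhibited = not (wanted & bit_mask(bit)) and bit not in BROKEN_INHIBIT_BITS
        if not inhibited:
            written |= bit_mask(bit)  # reset pulse flips this core to 1
    stack[address] = written
    return sensed


stack = [0x0005]                       # bit 16 is 0 before the first access
first_read = memory_cycle(stack, 0)    # original contents, still intact
second_read = memory_cycle(stack, 0)   # bit 16 was forced to 1 by the rewrite
```

The first read returns the stored word unchanged, while the second returns it with bit 16 set, so each location can be archived correctly only once.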

We powered up with memory access blocked through the test connector, put the machine in single instruction mode, and tested by reading a few memory addresses. We did find that bit 16 sometimes had the value 0, sometimes 1, verifying that the damage to the module was not present when it was being accessed 40-50 years ago.

We wrote down the values we saw in the test locations, then used Mike's test monitor to run through all of memory and retrieve the values. Folding in the few manual reads gave us a complete file of the prior contents of our memory module B12.

Mike studied the contents and could confirm that this was running a version of Aurora software, similar but not identical to the release for which there is an archived image. He could decode the contents of the DSKY display. It showed that the last command entered was a coarse alignment of the IMU (gyroscopes and accelerometers) to angles of 0, 0, and 90. The Gimbal Lock warning light was on, which may have been the event that caused the operator to perform a coarse alignment. Finally, the code stores a latitude when aligning the gyros - it was the location of the Johnson Space Center in Houston.

Rewiring to swap parity and data bits in B12 core memory module

Now that the prior contents are safely archived, we can implement our modification to make the core erasable memory functional again. Our great fortune is that the erasable memory module contains an error detection method called parity, to capture the cases where some radiation event might have flipped a data bit randomly.

This works by adding a sixteenth bit to each word - data bits are 1 to 14 plus 16, with bit 15 representing parity. The rule for parity is that the number of 1 bits in the full word, parity bit included, has to be odd. If the data bits have an even count of 1s, the parity bit is set to 1, making the count for the entire word odd. If the data bits themselves have an odd count of 1s, the parity bit is set to 0.

At the end of a read from a word of core memory, the processor counts the 1 bits, determines if parity should have been 1 or 0 and then compares it to the parity bit read from memory. Mismatches raise an erasable memory parity alarm.
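The odd parity rule and the check at the end of a read can be written out in a few lines. This is a minimal sketch that takes the fifteen data bits as a plain list; the function names are hypothetical and say nothing about how the AGC hardware implements the check.

```python
def odd_parity_bit(data_bits):
    """Parity bit that makes the total count of 1s in the word odd."""
    ones = sum(data_bits)
    return 0 if ones % 2 == 1 else 1


def parity_alarm(data_bits, stored_parity):
    """True when the parity read from memory disagrees with the data bits."""
    return odd_parity_bit(data_bits) != stored_parity


word = [1, 0, 1] + [0] * 12          # fifteen data bits, two of them 1
stored = odd_parity_bit(word)        # even count of 1s, so parity is 1

corrupted = word.copy()
corrupted[5] ^= 1                    # a single bit flipped in storage
alarm_clean = parity_alarm(word, stored)
alarm_corrupted = parity_alarm(corrupted, stored)
```

Any single flipped bit changes the count of 1s from odd to even, which is why one parity bit is enough to detect it.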

The erasable memory is quite reliable, especially since our AGC is not out in space subject to radiation events. We therefore don't really need parity checking. That gives us a working bit 15 which we can substitute for the broken data bit 16. As long as we can disable the parity checking, blocking the alarm, we will have a working memory.

Some backplane wiring was introduced to swap bits 15 and 16, as well as block the signal that performs a parity check. With the wirewrap changes completed, we closed up the AGC and tested. Success! The erasable memory is fully functional and we can run software fully out of the repaired erasable memory and the core rope simulator boxes providing the fixed (core rope) read only memory, just as the machine

1 comment:

  1. This is such a huge milestone on this project! Well done. I continue to follow the project with interest.
