1401 System
Our team arrived at our hotel late on June 6th, worked on the 7th, 8th and 9th, with travel home on the 10th. There were tours, picnics, interviews and other events that took time, but we did get a decent amount of time working on their equipment.
The 1401 system had been previously powered up by the local team, but it was not able to do arithmetic correctly. When we arrived we started to work on that problem. Other problems arose that had to be dealt with, such as when we lost the ability to store the A bit in any position in memory.
The A bit problem manifested itself as a C bit (check bit, i.e. parity) error, which we began tracing through the C bit logic until we realized that the machine was also not holding the A bit, whose absence made the C bit value incorrect.
We found a total of three cards that were malfunctioning, replaced them and had data storing properly again. We went back to work on the addition failure. The machine could correctly add 1 + 2, for example, but not 2 + 2.
We quickly realized that we had a 'hot' 1 bit, where any arithmetic result would have the 1 bit turned on regardless of its proper value. Thus, 1 + 2 produced 3, but 2 + 2 produced a 5 since the 1 bit was erroneously set.
We were tracing this from the adder logic itself out to the memory. The way that arithmetic works in a 1401 is that the result character of an addition (or other arithmetic operation) is stored in memory without going into the B or A register. Thus, along with the wrong value, if the 1 bit was not intended to be on, the parity would also fail. The 2 + 2 case stored a 5 (the 1 and 4 bits) without the C bit. A correct 4 has a single bit set and needs no C bit to keep parity odd, but with the extra 1 bit the stored character had even parity, which flagged an error.
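The parity effect described above can be sketched in a few lines of Python. This is only an illustration, not a model of the 1401 circuitry: it assumes just the 8-4-2-1 bits and the C bit matter (zone bits and word mark taken as 0), and the function names are made up for the example.

```python
# Sketch: odd parity over the numeric bits of a stored character.
# Assumption: only the 8-4-2-1 bits and the C bit are modelled; the B and A
# zone bits and the word mark are treated as 0. Names are illustrative.

def check_bit(value):
    """Return the C bit that makes the total number of 1 bits odd."""
    ones = bin(value).count("1")
    return 0 if ones % 2 == 1 else 1

def parity_ok(value, c):
    """True when the character (numeric bits plus C bit) has odd parity."""
    return (bin(value).count("1") + c) % 2 == 1

def store_with_hot_1(value):
    """C bit is computed for the correct value, then a fault forces the 1 bit on."""
    c = check_bit(value)
    return value | 0b0001, c

for a, b in [(1, 2), (2, 2)]:
    stored, c = store_with_hot_1(a + b)
    print(f"{a} + {b} -> stored {stored}, C={c}, parity ok: {parity_ok(stored, c)}")
```

For 1 + 2 the 1 bit is already supposed to be on, so the fault is invisible; for 2 + 2 the stored value becomes 5 and the parity check fails.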
The 1401 uses wired-OR logic, where multiple gates have their outputs shorted together to form an OR of the conditions of the contributing gates. This means when you have the extra 1 bit set, it could come from any of several gates that are shorted together.
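As a rough illustration of why a wired-OR node is awkward to debug, here is a minimal Python sketch. The gate names are hypothetical placeholders, not signals taken from the ALDs, and polarity is ignored.

```python
# Sketch: a wired-OR node is active if ANY contributing gate drives it.
# The gate names below are hypothetical, chosen only for illustration.

def wired_or(outputs):
    """outputs maps gate name -> bool; the shared node is the OR of them all."""
    return any(outputs.values())

def drivers(outputs):
    """Names of the gates that, by their inputs, should be driving the node."""
    return sorted(name for name, on in outputs.items() if on)

expected = {"arith_result": False, "manual_entry": False, "other_source": False}
print(wired_or(expected), drivers(expected))
```

If the scope shows the node active while every contributor should be inactive by its inputs, the list of legitimate drivers is empty and the fault has to be in a card or in the wiring rather than in the logic conditions.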
We did lots of oscilloscope work probing the state of various signals in the path from the adder to where it stores in memory. For quite a while, we saw that no set of inputs should produce a 1 output yet it was there.
To do the scoping, we set up a short loop to set up fields for an addition, perform it and loop perpetually. We had the most success triggering the scope by a signal that is activated when the adder is ready to store its result in memory.
The 1401 system encodes numbers as binary coded decimal (BCD) characters, but the arithmetic hardware itself uses a system called qui-binary by IBM. Thus, the input digits are converted from BCD to qui-binary, arithmetic occurs and the output digit is converted back to BCD.
Qui-binary has a five value and a two value section, the quinary (base 5) and binary (base 2) portions. Thus, we had to find the circuitry that assembled the BCD bits from the quinary and binary states. We looked at the first gate generating the 1 bit and found that the adder was giving the proper value. 2 + 2 had only the 4 bit set, not the 1 bit.
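For readers unfamiliar with qui-binary, here is a small Python sketch of the idea. It assumes the quinary part carries the weights 0, 2, 4, 6, 8 and the binary part 0 or 1, which is my understanding of the scheme; it models the arithmetic only, not the actual adder circuits.

```python
# Sketch: qui-binary digit arithmetic.
# Assumption: digit = quinary part (0, 2, 4, 6 or 8) + binary part (0 or 1).

def to_qui_binary(d):
    """Split a decimal digit into (quinary, binary) parts."""
    if not 0 <= d <= 9:
        raise ValueError("expected a single decimal digit")
    return d - d % 2, d % 2

def from_qui_binary(q, b):
    """Recombine the two parts into a decimal digit."""
    return q + b

def add_digits(a, b, carry_in=0):
    """Add two digits in qui-binary form; return (digit, carry_out)."""
    qa, ba = to_qui_binary(a)
    qb, bb = to_qui_binary(b)
    bsum = ba + bb + carry_in          # 0..3
    b_out = bsum % 2
    q_carry = 2 * (bsum // 2)          # binary overflow adds 2 to the quinary part
    qsum = qa + qb + q_carry           # even, 0..18
    carry_out = 1 if qsum >= 10 else 0
    return from_qui_binary(qsum % 10, b_out), carry_out

print(add_digits(2, 2))
print(add_digits(9, 9, 1))
```

In this model 2 + 2 has a binary part of 0 on both operands, so a correct adder never asserts the odd part of the result, consistent with the adder output we observed being correct.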
The 1 bit value then transitioned through a small number of gates until it reached a double negative AND gate whose two sections were ORed together and also wire ORed to several other gate outputs. This wire OR output is the drive for whether a 1 or 0 is written in the 1 bit during the current memory cycle.
The top of our double AND gate had the 1 bit value from the adder and the overall signal to write an arithmetic result to memory. The bottom had the value of the 1 toggle switch on the console and the overall toggle switch to manually enter data into memory. Thus, this double gate drives a 1 either because of manual entry or arithmetic results.
The inputs to the manual entry section don't change unless the toggle switches are moved. The inputs to the arithmetic result section were 1 for the 1 bit value and a pulse to store. Since this is a negative AND gate, it only passes a result if both inputs are negative. It therefore should NOT write a 1 into memory.
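The intended behavior of the two-section gate can be written as a simple boolean function. This sketch uses plain true/false values and ignores the negative-logic voltage levels of the real card; the parameter names are mine, not the ALD signal names.

```python
# Sketch: logical function of the two-section gate driving the 1 bit write.
# Polarity is abstracted away; True simply means "condition asserted".

def write_1_bit(adder_1_bit, store_arith_result, console_1_switch, manual_entry):
    top = adder_1_bit and store_arith_result      # arithmetic result path
    bottom = console_1_switch and manual_entry    # manual entry path
    return top or bottom

# 2 + 2: adder 1 bit is off, store pulse present, no manual entry.
print(write_1_bit(False, True, False, False))
```

With the adder's 1 bit off, the store pulse alone should never be enough to write a 1.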
The wired OR output of this and the other gates showed a positive pulse, writing a 1, at exactly the timing and shape of the enabling pulse for arithmetic result storing. The inputs did not meet the conditions of the AND, yet the output pulsed.
We swapped the card but saw no change. We examined the inputs to all the other gates wired into this output, but none had conditions that would fire. We swapped each of the other cards just in case, again with no change. We looked at the wiring on the backplane near the card, and tested the signals on the card itself, with an extender, to see if there was a socket problem.
After half an hour of increasingly fanciful hypotheses and tests, looking for some analog issue or hidden path to drive the erroneous 1 output, the problem went away. It was the end of a workday and inexplicably the addition was no longer producing a hot 1 bit in the result.
We could tell instantly because my looping program encounters the parity error when the hot 1 overrides the intended 0 value for that bit. This shows up as a red light in the storage block on the console panel. When that stopped lighting we checked the stored field and found that 2 + 2 was now 4, not 5.
We came back the next morning, and extended my program to add multidigit fields, rather than a single digit for each operand. The red light flashed again while the program looped. A look at the result field showed that our problem had simply changed from a hot 1 bit to a dead 1 bit - always a value of 0.
Thus, 2 + 2 properly produced 4 but 1 + 2 produced only 2, not 3, because the 1 bit was permanently set to 0. The scope went back on and we began tracing signals again. At this point, I noticed that the input to our double AND gate, arithmetic results section, was at ground potential. Since this is a T level logic signal, the only valid values are -6V and +6V.
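The two symptoms, a hot 1 bit and then a dead 1 bit, are each hidden by roughly half of the possible result digits. A minimal Python sketch, restricted to result digits 1 through 9 and using made-up fault labels, shows which results expose each fault:

```python
# Sketch: which result digits reveal a stuck 1 bit.
# "hot" forces the 1 bit on, "dead" forces it off; labels are illustrative.

def apply_fault(value, fault):
    if fault == "hot":
        return value | 1
    if fault == "dead":
        return value & ~1
    return value

def failing_results(fault):
    """Result digits (1..9) that come out wrong under the given fault."""
    return [d for d in range(1, 10) if apply_fault(d, fault) != d]

print("hot :", failing_results("hot"))
print("dead:", failing_results("dead"))
```

A hot bit only corrupts even results and a dead bit only corrupts odd ones, which is why 2 + 2 coming out right the previous evening did not prove the path was healthy.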
I looked at the ALD page and saw that our input to the double AND comes from another logic compartment. The signal moved over our backplane to a paddle card that would route the signal to the other compartment. I checked continuity with a meter to the paddle card.
Since continuity was good on the original compartment (01A3) we moved to the arithmetic unit compartment (01B3) and verified continuity over the cabling between compartments. In fact, we traced it all the way to the output pin of the card that produces the arithmetic 1 bit value.
The output of the card was at ground (an invalid level) but the input to the gate was valid and correct, either a 1 or a 0 depending on the arithmetic result. We swapped that card with a spare and resolved the problem. Apparently this card had been producing the hot 1 bit through some weird failure mode and then suddenly got worse, yielding the permanent 0 value for bit 1.
We proceeded to check out many variants of arithmetic, for example different length fields, carries, and subtraction. Once this showed that arithmetic was good, we went on to check other instructions. Among the instructions tested successfully were:
- Move
- Compare
- Branch
- Branch when Equal
- Add
- Subtract
- Set Word Mark
- Clear Word Mark
- Move zone
- Move digit
- Zero and Add
- Read a card
As far as we can tell without running the complete and comprehensive diagnostic tape, the 1401 is fully operational.
1311 Disk Drive
The 1440 system came with a 1311 disk drive that so far was only able to spin the platters. The arm could be manually pushed out over the disk surface but the heads never loaded (lowered to fly on the surface). Iggy worked on this, beginning with a careful inspection and full cleaning of the disk heads and disk pack.
He discovered a misadjusted microswitch, several missing logic cards and a few other things over the course of the three days. After one day, the drive would sequence up to the point that it moved the arms all the way to the inner cylinder, but was not jumping back to the outer cylinder and loading.
By the time we left, the drive completed its sequence, loaded the heads and was fully operational as far as we could tell with the limited testing we completed.
729 Tape Drive
Iggy pulled out one of the tape drives to work on. He found a failed microswitch that kept the vacuum pump from operating, a few other problems and then had the motor that lowers the head onto the tape fail to spin. He determined that the motor itself works but the relay to control it is not operating properly. Since he didn't have documentation for the drive he couldn't finish getting it working.
1402 Card Reader/Punch
The local team were concerned because they had found fragments of rubber belts in the bottom of the machine, but had no spares to install. Frank examined it carefully and found that the only two belts which were missing were both for the punch side. One is critical, as it drives contact breakers in time with the feeding process, but the other is only needed to move the stacker rollers for punch output. As long as one can accept that all punched cards will fall in one stacker, it isn't needed.
We were able to trigger a read reliably by issuing the appropriate 1401 instruction (op code 1) although the data may not be scanned in properly due to a premature reader stop. One cause of this is that the alignment pins to hold down the first reader block were sticking, thus not holding the brushes fully in place.
Frank was able to rebuild the alignment pin mechanism. The brushes in the 1402 are kind of scraggly, so we will send some spare brushes to this museum after we return home. Another problem was that doing a non-process runout (NPRO) operation didn't reliably trigger the read clutch, which we attribute to a problem with the relay logic that drives the 1402.
The machine has many relays which sequence through operations such as reading, NPRO, punching and handle conditions like the hopper emptying. The contacts tend to oxidize over time if not used. We couldn't look at the suspect relays because we didn't have the documentation to tell us which relays were involved. We will send the museum a relay tester that has helped us find and fix bad relays for our 1402s.