Saturday, June 18, 2022

Continuing the load of the CPU Test diagnostics into the 1130 core memory and ran them, not successfully


I made some improvements to the Memory Load tool which now loads each 1K words in just under 6 minutes. I expect that a full memory load (8K words) would take 47 minutes to complete. 


To my delight the CPU Test diagnostic had a footprint of only 4K words. This makes sense because IBM did sell a 4K low end version of the machine. Thus the load process was faster than I had anticipated. 


The documentation, as well as the behavior on the IBM 1130 Simulator, is that the program would run for a couple of minutes and then stop with a wait instruction 3003 indicating successful completion of all tests. 


When I began the test, it ran but continued to run long past the two minute point where it should have stopped. A bit later, it stopped with a Parity Stop, meaning that we had a parity error in core. It was the same symptoms I had seen before, the high bit (0) turned on when the parity value indicates that it should have been a zero. 

Red Parity Stop lamp on left side is lit

Since I had the listing for the code that was running at the time, I could see that it was loading a value of 0005 from a memory location but the value in the Accumulator was 8005 because of the high bit flip. I immediately ran a Storage Display where the hardware cycles around through all memory locations reading the contents of each word - with no parity error indicated. 

Executing Store long format, fetching word 2 of the instruction

This suggests to me that some process is flipping bit 0 to a 1 on a read but not actually flipping the core. It could be an out of adjustment sense amplifier or it could be some errant logic elsewhere that is ORed to set the flipflop for bit 0. 

Further, the code that is executing is the code that would be invoked if I had requested looping on an error condition, but I had set all the CES switches to zero thus asking for a single pass. In order to get to that code, something had gone awry in the execution of the diagnostic, but I don't know where or even when it happened. 

I may have to patch in some stops into the diagnostic so that I can find where it reaches. If I know that it has successfully tested some percentage of the instructions, I can at least consider them to be fully operational. Further, I could do some binary search to home in on where the divergence begins and get a clue about the defect causing it. 

I may also have to troubleshoot the bit flip parity problem, which does not occur with continual Storage Display access but does with some loops. I will build some loops and set them running to see if I can force the failure. It may allow me to record enough information when the parity error is detected to find the culprit 

No comments:

Post a Comment