Thursday, June 29, 2017

Building out 1130 bulb assemblies, begin construction of 1402 relay tester

IBM 1130 LIGHT PANEL UPGRADE

Today I soldered together more lamp assemblies, eight at a time with breaks to minimize frustration and achy backs. By the end of the day the main board had all its 96 light positions installed. Now that the big board is done, there are a mere 60 to build for installation across the two small boards.

RELAY TESTER BUILD FOR TECHWORKS!

I received all the remaining parts today and began layout of the components inside the project box. I marked up the holes to mount the two relay sockets as well. I located my 20V and 5V power bricks which will drive the coil and contact current.

The design uses 1W resistors to limit the current through each set of contacts to about 150 mA, then measures the voltage drop across the contacts to determine their quality. Small Arduino interface relays fire the 20V pick and hold coils on the relays under test.
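
As a rough illustration of the measurement described above, the sketch below converts a measured contact voltage drop into a contact resistance at the roughly 150 mA test current. The function names and the 0.5 ohm pass threshold are assumptions made for illustration; they are not taken from Stan's design or from the Arduino code.

```python
# Hypothetical sketch: judge relay contacts by their voltage drop at a known current.
TEST_CURRENT_A = 0.150   # about 150 mA per contact set, from the text
MAX_GOOD_OHMS = 0.5      # assumed pass/fail threshold, not from the original design


def contact_resistance(drop_volts, current_amps=TEST_CURRENT_A):
    """Return contact resistance in ohms from the measured voltage drop."""
    return drop_volts / current_amps


def contact_is_good(drop_volts, limit_ohms=MAX_GOOD_OHMS):
    """A contact passes when its resistance is at or below the limit."""
    return contact_resistance(drop_volts) <= limit_ohms


if __name__ == "__main__":
    for drop in (0.015, 0.060, 0.300):
        ohms = contact_resistance(drop)
        verdict = "good" if contact_is_good(drop) else "needs cleaning"
        print(f"{drop * 1000:.0f} mV drop -> {ohms:.2f} ohm, {verdict}")
```

A real tester would pick its threshold from experience with known good relays rather than from a fixed constant like this one.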

I will solder the 1W resistors directly onto the relay sockets, to save the space required for a resistor mounting perfboard as used in the original machine. Thus I just need to mount an Arduino, a small relay board, the two relay sockets, and four LEDs. The two power bricks and most of the USB cable will be outside the project box.

Repaired 1401 system at CHM, worked on 1130 lights and began relay tester for TechWorks!

1401 RESTORATION AT CHM

Today I replaced all the power transistors in the 30V, 7A power supply for the 1406 box that hosts the additional 12K core locations. When they were all in place, I tested my two rebuilt voltage regulator cards. Both worked, allowing me to set the voltage and have it hold steady regardless of load or input voltage.

I moved on to the Over Voltage Protection card, which crowbars across the supply if the voltage exceeds a trigger level just a few volts above the 30-31 V normal level. The card has an SCR which shorts the output through a 4 ohm power resistor when triggered. The resulting 12A or more should trip the circuit breaker within microseconds.

Our card was quite scorched, as were the power resistors, since the circuit breaker had failed to trip for about ten seconds, allowing almost 500W of power to heat the resistors until they glowed. The resistor is a 10W type, able to handle a few microseconds of current but not the prolonged operation during this failure.

We had a failed voltage regulator card, which let the output surge up to 40V, triggering the crowbar which did its job. The resulting 10A flow should have tripped the breaker rapidly but didn't. I found a circuit breaker on a spare power supply of a different voltage type, swapped it in and we put the repaired power supply back in the 1406 box.
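
The crowbar numbers can be sanity checked with Ohm's law. This is a minimal sketch that assumes the whole supply voltage appears across the 4 ohm resistor, ignoring the SCR drop and any sag in the supply; the function names are made up for the example.

```python
# Sketch of the crowbar arithmetic using the resistor values quoted in the text.
RESISTOR_OHMS = 4.0       # crowbar power resistor
RESISTOR_RATING_W = 10.0  # continuous rating of that resistor


def crowbar_current(volts, ohms=RESISTOR_OHMS):
    """Current through the crowbar resistor, I = V / R."""
    return volts / ohms


def crowbar_power(volts, ohms=RESISTOR_OHMS):
    """Power dissipated in the crowbar resistor, P = V^2 / R."""
    return volts * volts / ohms


if __name__ == "__main__":
    for volts in (30.0, 40.0):
        amps = crowbar_current(volts)
        watts = crowbar_power(volts)
        overload = watts / RESISTOR_RATING_W
        print(f"{volts:.0f} V -> {amps:.1f} A, {watts:.0f} W ({overload:.1f}x rating)")
```

At 40V this gives the 10A flow mentioned above and several hundred watts in a part rated for 10W, which is why ten seconds without a breaker trip was enough to scorch the card.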

The machine came up, we trimmed the voltage to spec and it performed perfectly during the public demonstration today. Pleased to get the Connecticut 1401 system back in operation after a few weeks of downtime.

IBM 1130 LIGHT PANEL UPGRADE

I am up to 24 soldered lamps on their holders, only 72 more to go for the big board. I do batches of 8 at a time, looping the bare wire ends of the lamps around the .100 spaced header pins, soldering, arranging the bulb and wires, testing on my prototype SCR circuit and then installing each into the board socket.

BUILD RELAY TESTER TOOL FOR TECHWORKS! MUSEUM IN BINGHAMTON

The 1402 Card Reader/Card Punch that is the heart of a 1401 system uses dozens of relays to sequence itself through operations such as reading, punching, run-out and initial card feed. Stan Paddock invented an Arduino based tester that drives a relay, measuring the voltage across the contacts as it operates.

We used it to shake the relay while contact cleaner was applied, then test all four or six SPDT contact sets on the relay simultaneously. It helped us identify relays that needed repair, perhaps replacement contacts or just some bending and burnishing. Without the tester it was very hard to keep the 1402s running.

To help TechWorks! get their 1402 operating, I decided to build them another relay tester based on Stan's design. It uses an Arduino, dual relay board, resistors, two DC power supplies and a pair of sockets. I have all the parts on hand, but have not yet picked out the right enclosure for the build.

Once I have the enclosure and create the mounts for the relay sockets and some indicator LEDs, I can wire up and test the tool before shipping it to Binghamton.

Wednesday, June 28, 2017

Building up self-testing generator inside disk tool, working on 1130 lights

ALTO DISK TOOL

This morning I wired up the sector number input signals to the spare pins where I was generating sham sector numbers tied to the sham SectorMarks I produce. The remaining logic design task is to produce a steady stream of clock pulses on ReadClock and periodic one bits on ReadData, such that the ReadSector logic can complete (albeit with checksum validity errors on every field).

I introduced the code, a variant of the logic that builds the clock and data pulses for writing. Clocks are steady, 100 ns on every 600 ns. Data bits alternate a 1 every 96 clock times (6 words apart) with 0 for every other clock time. The wiring is set to loop these signals around to the ReadClock and ReadData inputs to the board.
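
To make the shape of this sham stream concrete, here is a small software model of it. It is only a sketch: the real generator is FPGA logic, the names are invented, and the position of the 1 bit within each 96 cell period (last cell) and the most-significant-bit-first packing are assumptions.

```python
# Software model of the sham read stream: one clock pulse per 600 ns bit cell
# and a 1 data bit once every 96 bit cells (six 16-bit words).
BIT_CELL_NS = 600      # bit cell length from the text
CLOCK_PULSE_NS = 100   # clock pulse width from the text
ONE_BIT_PERIOD = 96    # bit cells between 1 bits
WORD_BITS = 16


def sham_data_bits(n_cells, period=ONE_BIT_PERIOD):
    """Data bit for each cell: 1 on the last cell of every period (assumed), else 0."""
    return [1 if cell % period == period - 1 else 0 for cell in range(n_cells)]


def pack_words(bits, word_bits=WORD_BITS):
    """Pack bits into words, treating the first bit received as the most significant."""
    words = []
    for start in range(0, len(bits) - word_bits + 1, word_bits):
        value = 0
        for bit in bits[start:start + word_bits]:
            value = (value << 1) | bit
        words.append(value)
    return words


if __name__ == "__main__":
    words = pack_words(sham_data_bits(2 * ONE_BIT_PERIOD))
    print([format(word, "04x") for word in words])
```

With these assumptions each group of six words comes out as five words of zero followed by one word of x0001.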

First I tested the WriteEntireCartridge to verify that my sector number generator will allow the mechanism to drive up through all sectors writing from memory. Next, I tested a ReadSector to verify whether the stream of bits coming in would cause it to complete a read, albeit with checksum errors since the incoming bitstream is nonsense.

Results of the test were initially poor. I forgot that the inputs from the Diablo are inverted logic, thus SectorMark is always at 1 except for the brief interval when it turns on with a value of 0. The sector number lines work the same way, and I had forgotten to invert those too. The clock and data pulses also needed to be inverted.

With the inversion change done and another 30 minutes of toolchain time, I reran the tests. This time, things looked much better. The WriteEntireCartridge function advanced properly through all cylinders, heads and sectors writing different patterns that I believe correspond directly to the image file I downloaded into the FPGA.

The ReadSector completed with checksum errors, as expected, and a ReadEntireCartridge transaction ran up through all 202 cylinders, reading 12 sectors from each of two heads. I have not yet verified that the data being read was placed properly in FPGA memory and is available for upload, but I should be able to look at an uploaded file and see the pattern of 16 words of zero, one word with x0001 repeated over and over throughout memory.

After lunch, I will think a bit about ways I might inject a more realistic bit pattern including causing a checksum match. This might get complex but it is worth a half hour of thinking. I came up with an array of words, 366 long, which represents all the words to be output during a sector.

Triggered by the gotsector signal that informs a WriteSector or ReadSector transaction that it has matched the desired sector number and is ready to read or write, this steps through 16 bit positions in a word each time I am ready to send the next sham bit cell of 600ns, resetting and bumping the word pointer at the end of the word, then shutting down when the word pointer reaches 365 and the last bit 15 is output.
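
The stepping described above can be modeled in a few lines. This is a sketch only, not the FPGA code: the function name is invented, and it assumes bit position 0 is the most significant bit and is sent first, so that bit 15 is the last one out of each word.

```python
# Model of the sham bit sequencer: walk 16 bit positions per word, then bump
# the word pointer, stopping after the last bit of the last word.
def sham_bit_stream(words, word_bits=16):
    """Yield (word_index, bit_position, bit_value) for every bit cell in order."""
    for word_index, word in enumerate(words):
        for bit_pos in range(word_bits):
            # Assumption: position 0 is the most significant bit, sent first.
            bit = (word >> (word_bits - 1 - bit_pos)) & 1
            yield word_index, bit_pos, bit


if __name__ == "__main__":
    sector = [0] * 366          # array length from the text
    sector[34] = 0x0001         # an example sync word
    cells = list(sham_bit_stream(sector))
    print(len(cells), cells[-1])
```

The final tuple has word pointer 365 and bit position 15, matching the shutdown condition described above.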

The remaining task is to preload the 366 words of this array with the proper values. For example, 34 words of zero, a word of x0001 as the sync bit, two words of header data, a checksum value, and ten words of zero as the interrecord gap. I will try to do this with a long string of initialization constants if I can sort out the syntax and stand to type in all those binary word values.
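
Rather than typing every constant by hand, the first record of the array could be generated. The sketch below builds only the example layout given above (preamble, sync word, two header words, checksum, gap); the function name and its defaults are hypothetical, and the checksum is simply passed in rather than computed.

```python
# Build the header record portion of the sham sector array from the layout in the text.
def build_header_record(header_words, checksum, preamble_words=34, gap_words=10):
    """Return preamble zeros, sync word, header words, checksum, then gap zeros."""
    if len(header_words) != 2:
        raise ValueError("the header record carries exactly two data words")
    return (
        [0] * preamble_words      # zeros leading up to the sync bit
        + [0x0001]                # sync word
        + list(header_words)      # two words of header data
        + [checksum]              # checksum value for the record
        + [0] * gap_words         # interrecord gap
    )


if __name__ == "__main__":
    record = build_header_record([0, 0], 0x0151)
    print(len(record), [format(word, "04x") for word in record[34:38]])
```

The same helper pattern could be repeated for the label and data records, whose exact preamble and gap lengths are not stated here.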

Meanwhile, I checked over my watchdog timer function, which flags if the WriteSector transaction ever experiences a new SectorMark before it is done writing out the three records of the sector. It is working properly, writing every sector of data and stopping the write transaction before the next SM arrives.

With the first version of my sham data emitting logic, I did a ReadSector and looked at the checksum validity bits when it completed. All were bad; furthermore, when I looked at the data, the first two words were inverted. The cause was the polarity of the signal itself, which defines whether a bit is a 1 or a 0.

Even with this, the result is checksum errors on all three records and the data in memory is nothing like it should be. Time to look over the logic for ReadSector and ReadField and ensure that it is properly storing the extracted data word. Besides that, I put the scope on the stream being sent in, just in case my sham data logic was malfunctioning.

I can see the proper data bit sequence right on the scope, delivered from my sham data generator to the ReadData line of the disk tool input. The logic is not reading properly yet, but I think I saw a flaw in my sector triggering and did a run to fix that up.

Since the Alto stores data in reverse order (last word of a record, then each lower word until one word comes at the end), setting up my sample sector is a bit messy. I started with a sector where the header, label and data records were all filled with zero words. That made the checksum calculation easy - 0x0151 - and I will get a proper match.
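
For the label field, a helper along these lines could do the arithmetic. It is a sketch under a stated assumption: that the checksum is the XOR of all words in the record starting from a seed of 0x0151. That assumption reproduces the all-zero result quoted above, but it should be confirmed against the Alto disk documentation before relying on it. The function names are invented.

```python
# Assumed checksum: XOR of every 16-bit word in the record with a seed constant.
CHECKSUM_SEED = 0x0151  # equals the checksum of an all-zero record, per the text


def record_checksum(words, seed=CHECKSUM_SEED):
    """XOR all 16-bit words together, starting from the seed (assumed algorithm)."""
    value = seed
    for word in words:
        value ^= word & 0xFFFF
    return value


def reversed_for_disk(words):
    """Return the record in on-disk order: last word first, first word last."""
    return list(reversed(words))


if __name__ == "__main__":
    zeros = [0] * 256
    print(format(record_checksum(zeros), "04x"))
    label = [0x0001, 0x0002, 0x0003, 0, 0, 0, 0, 0]   # made-up label contents
    print(format(record_checksum(reversed_for_disk(label)), "04x"))
```

Because XOR does not depend on order, the reversed word order affects only how the array is laid out, not the checksum value.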

Later I can enter the label field in reverse word order and compute as best I can the checksum value for that field. With those set in the sham sector, I would have a good checksum validation for that sector.

IBM 1130 LIGHT PANEL UPGRADE

I don't have my jig built yet but I did solder up a small number of lights onto headers, using the new tinier bulbs I received from China. I will work on small batches until I have all 160 done, hopefully with a jig to speed things up. 

Tuesday, June 27, 2017

Fixed some defects, cleaned up disk tool

ALTO DISK TOOL

The first result from testing is that I have inadvertently disabled the mechanism to read and write between the PC and the FPGA memory. I stripped out something when I was removing all the combinatorial inputs to the state machines. I have a backup from about a month ago that I could look over and find the missing bits to restore the capability.

I did find what changed, but it was in an area with combinatorial logic creating signals that are inputs to FSMs, which I tried to remove as far as possible by replacing them with clocked processes that produce registered outputs.

A couple of signals weren't easily amenable to registering, without introducing a one cycle delay in responding to conditions, but they were slow changing fields that I thought worth taking a risk with. Meanwhile I will noodle on ways to move all to clocked processes.

The memory read/write logic was still not working properly, so I had to move on to the more challenging approach that removed all combinatorial logic from the mechanism. This changed the flow enough that testing will have to be focused heavily until we know it works.

By early evening it was writing and reading to memory properly, at least as far as the PC to FPGA link is concerned, and the contents of the archived image I downloaded to the FPGA were reading back properly. I then set up the scope to watch it write a sector, to see if it is still doing that properly.

My testbed doesn't provide all 12 sector values back to the logic, thus making it challenging to check out the full WriteEntireCartridge logic. Even more, it can't deliver a realistic input stream of disk data and clock pulses to check out the ReadSector function properly. I believe my ReadSector is hanging waiting for a sync pulse, but I need to route other signals to LEDs in order to test this out.

I now have the eight non-idle states of the ReadSector transaction displayed on my eight board LEDs. I also set up a counter to emit SectorMark signals once every 3.33ms just as the real disk does. This will allow me to try out a WriteSector and observe the pattern on the scope, in addition to determining where the ReadSector is stalling.

My sector signals are working properly and the WriteSector function seems to work perfectly. I still have the watchdog timer switched off - this should fire off if a sector is still being written when the next SectorMark arrives, but I have some flaw that triggers it falsely.

When I attempt a WriteEntireCartridge, the logic writes the first sector and stalls waiting for a match on sector number 1, since my current testbed is continually reporting sector 0. However, I may be able to loop some more signals to drive the sector numbering appropriately.

I tested the ReadSector function, which should stall on the first field waiting perpetually for a sync bit, but instead it clocks through, issues a checksum error (appropriately for an all 0 field), then waits forever for the sync pulse that starts field 2. I can temporarily fix this up by wrapping around signals that issue one bits periodically along with a steady stream of clock pulses.

The big question is why ReadSector thinks it got sync for the first field. Must be an error in the startup of my bit logic.

This will take some additional signal lines, jumper wires and a bit of fpga logic to produce the appropriate signal timings. By the time I retired for the night, the logic for producing 'rotating' sector numbers was complete but not yet wired in. Tomorrow I will need to work on the ReadData and ReadClock simulated signals.

DIGIBARN XEROX ALTO RESTORATION

Marc continued to battle the power supply, discovering a bad regulator and a failed power resistor (so far). I bought the replacement parts and will deliver them to him so he can fight on. We shall see what it takes to get this dual power supply (5V and 5V, interestingly) to work completely. 

Sunday, June 25, 2017

Refactoring and ruggedizing my fpga logic for the disk tool

IBM 1130 LIGHT PANEL UPGRADE

My shipment of 200 miniature incandescent bulbs from China arrived yesterday and is waiting for me to begin soldering them onto 2-pin headers for use in the 1130. I think it best if I make a jig that will hold the header and the bulb in place for quick soldering of the wire leads onto the pins. It will give me consistency of results and some speed in preparing the 160 bulbs needed to populate the light panel.

ALTO DISK TOOL

My disk tool design is suffering from glitches driving state machines into undefined states. The state machine (FSM) then will stall in that invalid state. The state of an FSM is defined in a register using some encoding (examples are binary, one-hot and gray code).

In the simple case of a binary encoding, if the number of states in the FSM doesn't match the number of values the register can hold, some values in the register are invalid. For example, a three bit register encodes 8 possible binary values, but if the state machine has five valid states, coded 0 to 4, then if the register becomes 5, 6 or 7 the FSM will stall.

One-hot encoding uses a string of bits as long as the number of states, with one and only one bit set at any time. Thus, for the five state machine discussed above, the register is five bits long and has valid values 00001, 00010, 00100, 01000, and 10000. If the register ever has more than one active bit, or none are active, the FSM will stall.
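
The two encodings for the five state example can be enumerated directly. This is an illustrative sketch in software rather than HDL, and the function names are invented for the example.

```python
# Enumerate legal and illegal register values for a five state FSM.
def binary_invalid_codes(n_states, register_bits):
    """Register values that do not correspond to any defined binary-coded state."""
    return [code for code in range(2 ** register_bits) if code >= n_states]


def one_hot_valid_codes(n_states):
    """One code per state, each with exactly one bit set."""
    return [1 << position for position in range(n_states)]


def one_hot_invalid_count(n_states):
    """How many values of an n-bit register are not legal one-hot codes."""
    return 2 ** n_states - n_states


if __name__ == "__main__":
    print(binary_invalid_codes(5, 3))
    print([format(code, "05b") for code in one_hot_valid_codes(5)])
    print(one_hot_invalid_count(5))
```

One-hot decoding is simpler, but it leaves many more register values undefined than the binary encoding does: 27 of 32 versus 3 of 8 for this machine.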

The next value of the state register is determined in the FSM as a combination of the current state value and some input signals. If an input signal changes very close to the clock edge, it might produce a glitch where the next state register attains one of the invalid values and stalls.

Developing reliable operation of the FSMs requires careful attention to details. Particularly with externally generated asynchronous signals, such as the Sector Mark value, the value may be changing too close to the clock edge and lead to stalling.

One solution is to have every input signal that generates the next state value be itself registered so that it cannot change near the clock edge. Other techniques include methods to detect invalid states and force the FSM back to some valid state. For example, a parity test of one-hot values might detect an error.
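
A parity test is cheap but only a partial check, as this small sketch shows. It compares parity against an exact one-hot test and then forces an invalid register back to an idle state; the function names and the choice of idle as the recovery state are assumptions for illustration, not the design used in the tool.

```python
# Compare a parity check with an exact one-hot check, then recover to idle.
def parity_ok(value):
    """A legal one-hot value has exactly one set bit, so its bit count is odd."""
    return bin(value).count("1") % 2 == 1


def exactly_one_hot(value):
    """True only when a single bit is set."""
    return value != 0 and (value & (value - 1)) == 0


def recover(value, idle_state=0b00001):
    """Return the state unchanged if legal, else force the assumed idle state."""
    return value if exactly_one_hot(value) else idle_state


if __name__ == "__main__":
    for state in (0b00100, 0b00000, 0b00110, 0b00111):
        print(format(state, "05b"), parity_ok(state), exactly_one_hot(state),
              format(recover(state), "05b"))
```

Parity catches a register with no bits or two bits set, but a value with three bits set has odd parity and slips through, so an exact check is the safer detector.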

Even with auto correction to recover an FSM from a stall, the result may be an FSM at idle when it was supposed to generate some triggering output. Another FSM may be waiting for that missing output, producing a deadlock. The result is still a stalled system. This can get quite tricky.

My first task is to ensure that I force all the external signals to become aligned to the clock boundary, passing each through a chain of a few registers in a row. This is the classic cure for preventing meta-stable states but also forces signal states to only change on the clock edge.

I completed a set of two-stage D flipflops to pass each outside signal through, getting them aligned to clock boundaries. I used the schematic approach, placing 54 D FF symbols and routing the signals graphically. Previously, I inverted the signals using combinatorial logic from the input pin, but now I invert combinatorially then pass through the two stage D flipflops to get it properly aligned with the clock.
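
The behavior of one such input path can be modeled in software. This sketch is a cycle-level model only: it cannot show metastability, the class name is invented, and the optional inversion stands in for the combinatorial inverter ahead of the flipflops.

```python
# Cycle-level model of an inverted input passed through two D flipflops in series.
class TwoStageSynchronizer:
    def __init__(self, invert=False, initial=0):
        self.invert = invert
        self.stage1 = initial
        self.stage2 = initial

    def clock(self, raw_input):
        """Advance one clock edge and return the synchronized output."""
        value = (raw_input ^ 1) if self.invert else raw_input
        # Both flipflops load on the same edge: stage2 takes the old stage1.
        self.stage2, self.stage1 = self.stage1, value
        return self.stage2


if __name__ == "__main__":
    sync = TwoStageSynchronizer(invert=True)
    raw = [1, 1, 0, 0, 1]            # active-low input sampled at each clock edge
    print([sync.clock(level) for level in raw])
```

The output changes only on clock edges and lags the sampled input by one extra clock, which is the price paid for the clean alignment.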

I will then look to see if any of my input signals to important FSMs are passing through combinatorial logic where they might be in transition around a clock edge. As much as possible I will remove such logic leaving all inputs clean, and if I still need the logic then the outputs will be registered so they only change on clock boundaries.

While doing this, I will strip out some logic that had been in place for functions which I will eventually implement, but am not dependent on for the near term archiving and cartridge writing roles. This includes sector update transactions, disk drive emulation and some display functionality.

As part of stripping this down, I am also reviewing several hundred warning messages from the toolchain to look for any that are truly relevant and act upon them. In most cases, these will be leftover signal declarations that are not being used any more, which I can strip out.

I spent about one day in total doing all of this, then set it up for initial testing at home with a wrap-around board of some signals. Full testing will require the disk drive and cartridges. Done for today.

Saturday, June 24, 2017

Disk tool wrote cartridge images that boot on Alto; work on CHM 1401 and Digibarn Alto continues

ALTO DISK TOOL

I spent a week chasing inexplicable behavior in the tool, such as the stalling of the WriteEntireCartridge process after the first sectors. I kept exposing signals to external pins and LEDs to see what was happening, until I found the state machine for matching sectors was hung up.

It has six states, two of which are single cycle advances to provide a short pause. I saw that the machine was not in any of the four longer term states. I then added the two short term states to the LEDs. That would either show it stalling in those one-cycle states or more likely getting wedged into an invalid state value that was none of the above.

However, with the six states displayed on the LEDs, it worked properly. I was able to write disk images to our scratch disk cartridge, first booting up and verifying the games.dsk image from Bitsavers.org and then the xmsmall.dsk image.

We took some time to play with Smalltalk, using the latter image, hoping to end up with a compelling brief demo to use on a video and at our upcoming VCF West appearance. We did find that this disk had very little free room, causing some out-of-space conditions as we played.

After we were finally sated on that, I wanted to test out the ReadEntireCartridge function using the Smalltalk cartridge which I could compare to the xmsmall.dsk image from which it was written.

The file I uploaded did not match well at all, which indicates problems in the reading function. My suspicion was that my timing was now off and I was encountering checksum errors as evidence of that bit shifting. To see this, I exposed the header, label and data checksum state bits on three LEDs and resynthesized.

Aaarrrgh. Once again the unit was stalling after reading one sector. When I pushed button 2 to command a seek back to cylinder zero, it triggered the ReadEntireCartridge state machine to start advancing through cylinders even while I was simultaneously triggering seeks.

At this point, I will have to look over my design and find ways to 'bullet-proof' the state machines. There are techniques, for example, to manually specify the encoding of the various states in the state register for a machine, so that I can ensure the register won't glitch into an undefined state. This is my homework assignment for the week.

CHM 1401 RESTORATION

The 1406 memory cabinet power supply remains out of commission. It is a 30V, 7A supply that uses a regulator card to adjust the output to a setpoint from a potentiometer, but the circuit is not regulating. We have swapped all of the transistors and checked just about everything we could think of.

I have a few more tests to attempt on Monday or Wednesday, then we may be forced to pull a power supply from a spare 1406 that we have in the warehouse.

DIGIBARN XEROX ALTO RESTORATION

We put in another full day working on the remaining bad power supply for the Digibarn Alto. It is the unit that supplies 5V for the main logic. Actually, this is a dual 5V supply, one providing high current and another providing a low current second 5V source. Neither supply inside the box is working.

Previously we had identified four bad large electrolytic capacitors and replaced them, found and replaced two silicon rectifier diodes that were blowing the PS fuse, then realized that the 18V and 5V internal power to the power supply circuitry was missing.

Today, we removed and checked the 18V voltage regulator (LM7818 chip) which was good. We then chased down a short to a section of the machine, applied controlled current to that line until we found smoke coming out of a tantalum capacitor.

Replacing that capacitor allowed the power supply to develop its internal 18V and 5V power, but it was still not delivering any output from the unit on either primary or secondary sides. We found a smaller electrolytic inside that was also bad and replaced that by 5PM. Still no output.

Marc continued that evening after Ken and I left. He hunted for and found the 23 kHz chopper signal on one side, which was being blocked from driving the rest of the circuitry by a NAND gate. Note that there are no schematics available for these supplies, so figuring out what is going on is notably harder than a normal diagnosis process.

Marc lifted the regulation line, which is what kept it cut off, seeing some unregulated voltage develop on the primary side. Secondary side is still dead. While working on this, something else failed, likely another tantalum somewhere but alternatively a failing diode or other semiconductor. The power supply is back to blowing 12A fuses.

At this point, the effort required (16 hours so far) is getting out of proportion to the value of continuing to restore this particular machine. It had many bad power supplies, and most of the spare supplies that Bruce had available were bad as well. The Alto itself is missing one logic board and one cable, at least.

The monitor was not compatible with the Alto at all, thus the two had never worked together as a system. The disk drive had a label inside - "smoked" - thus is of suspect condition. The spare logic boards he was given with the donation have missing chips, burned sections and other signs that they are non-working items.

This machine is a great artifact and static exhibit, but appears to be a collection of broken parts pulled from working systems and donated to Digibarn. That would imply extraordinary time will be needed to find and repair, if possible, all the nonworking parts that were assembled to make up this artifact.

We are therefore coming to the conclusion that this is a poor candidate for restoration and a bad use of our time, compared to restoring other machines such as the exhibit in the Xerox PARC lobby that were clearly working at shutdown or units at CHM not already restored by Al Kossow. We will finalize a decision within a week. 

Wednesday, June 21, 2017

Slogging along working on Digibarn Alto power supplies

DIGIBARN XEROX ALTO RESTORATION

We restored three power supplies to operation, replacing bad capacitors and other failed parts. The fourth supply is the big one, delivering +5V for most of the logic, and it was blowing its 12A fuse immediately on powerup. 

We discovered that all the electrolytic capacitors were bad and replaced them - fairly expensive parts to match the mounting method and available space. We then found two of the rectifier diodes were blown and replaced them. Now the fuse does not blow, but no power is produced yet.

On the board, the logic for the power supply is powered by an 18V regulator chip for operational amplifiers and in turn a 5V zener diode, that feeds off the 18V line, for TTL logic. The output of the LM7818 was zero, which can either be due to a failed regulator or a short circuit downstream. We ran out of time to work further on this unit, having spent the day restoring the other three and doing the capacitor and diode replacements. 


Friday, June 16, 2017

IBM 1130 light panel upgrade boards complete, working on Alto disk tool debugging

IBM 1130 LIGHT PANEL UPGRADE

I investigated the three bad SCR positions I had previously uncovered and discovered that in two of the cases, the issue was that the large flat contact surface (anode) didn't flow onto the PCB pad beneath, leaving the circuit open. Resoldering brought them into full operation.

Working through my big PCB, I began to check the continuity to the anode. Fortunately, the SCR type has a stub lead sticking out of the side between the cathode and gate, which I can reach with ohmmeter probes and check to see if the anode is connected to the lamp pin. I repaired several positions that had such faults and each worked perfectly after the fix.

I definitely had to repair the original SCR that is wrong, the one I tested with in my first attempt, since it won't fire until the voltage gets to an unacceptably high level on the input pin. I used my hot air rework gun to strip off the failed part and solder in a replacement. Voila, the board is now fully functional. 

I am now bottlenecked waiting for the light bulbs to arrive from China. The boards are complete but I don't have enough bulbs to load onto the boards and install them into the 1130. When I solder each lamp on the header pins, I plan to encapsulate it in silicone which will prevent the wires from ever shorting together, as that would destroy an SCR. 

I installed quite a few bulbs onto one of the boards and did a test fit to see how easily they could be fit into the space without bending or damaging the lamps. The results were excellent, which implies that once I have all the bulbs on their headers and potted with silicone, placing the boards against the honeycomb will be easy.


Bulbs on headers plugged into sockets on PCB, after test fitting into honeycomb

ALTO DISK TOOL

I worked on the testbed to check out my write cartridge code, since something went wrong when I attempted to write an entire cartridge on the real disk drive. The logic stopped after the first sector was written and a flag bit indicated an overrun, where the writing FSM is still active when the next sector mark arrives.

I can't see any place that will write the overrun flag, so that must have been a false indication during my testing, something I misread. I concentrated on the logic that steps my transaction through writing the entire cartridge.

I don't see anything that should block the write from continuing sector by sector, so I set up for some testing, simulating the sector mark and disk status to allow my logic to run. I set up the scope to track key signals, the first of which is the WriteGate signal which defines the range of the write to an individual sector. If that is active long enough to run into the next sector mark (3.3 ms) then I may be experiencing overrun conditions.

I see the sector begin writing at the sector mark and the last word of zeroes is written at 3.168ms. The sector has over 160 us of free time before we reach the next sector mark. This reinforces my belief that we are not experiencing overruns when writing a sector.
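
The margin can be worked out explicitly. This sketch takes the 3.33 ms sector period used elsewhere in this journal and the 3.168 ms write end time observed on the scope; the function names are invented.

```python
# Work out how much idle time remains in a sector after the write completes.
SECTOR_PERIOD_US = 3330.0   # time between sector marks, 3.33 ms
WRITE_END_US = 3168.0       # last word written, measured on the scope


def write_margin_us(sector_period_us=SECTOR_PERIOD_US, write_end_us=WRITE_END_US):
    """Time left between the end of the write and the next sector mark."""
    return sector_period_us - write_end_us


def overruns(sector_period_us=SECTOR_PERIOD_US, write_end_us=WRITE_END_US):
    """True if the write would still be active when the next sector mark arrives."""
    return write_margin_us(sector_period_us, write_end_us) <= 0


if __name__ == "__main__":
    print(f"margin: {write_margin_us():.0f} us, overrun: {overruns()}")
```

Even if the write started a full sector mark width late, a margin of this size would still leave well over 150 us of slack.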

I wondered whether I might start the next sector (sector 1) while in the midst of a sector mark interval (i.e. it was already logically 1 when I started looking for a sector match), but with the SM only 5 us long, it can't force us into an overrun situation.

I kept looking, focusing on what has to happen for the WriteEntireCartridge state machine to step through the entire cartridge sector by sector. I changed the diagnostic outputs to give me the data that will help pinpoint the cause of any problem.

IBM 1401 RESTORATION WORK

We have a document of uncertain quality that was built by field engineering specialists back in the 7094 and 1401 era, listing a number of general market transistor types that are said to match an IBM transistor number. We were missing spare 028 and 036 transistors, but the chart gave us 'equivalents' as 2N1038 and 2N456. Those in turn are in equivalence tables to the NTE numbering scheme as NTE176 and NTE104.

Using the NTE numbers, we bought a few of each type to use in repairing the voltage regulator card for the extended memory frame of the Connecticut 1401 system. This card is a differential amplifier that compares the voltage being produced by the power supply to a reference value set on a small potentiometer. The difference signal is amplified by a chain of two transistors (028 and 036) and that drives the base current of the parallel 108 transistors that deliver up to 7A of the regulated voltage.

Normally we can test transistors for signs of death using a DMM, either looking at resistance across the various junctions or using the diode tester function. There should be a one-way path from emitter to base, and a one-way path from base to collector, of the appropriate polarity, but no path from emitter to collector.

That is true for silicon transistors, but germanium ones exhibit enough leakage current that they will slightly bias themselves on, passing current in one direction from emitter to collector even with no current supplied to the base. Further, if a transistor becomes weak, having too low an amplification factor, it will still test good on the DMM but fail to deliver enough current in the real circuit.

We suspect that is what has happened in the voltage regulator card (and in a second known failed card, which keeps the voltage lower than the first card but still is not able to drive it down to the set target voltage). Our bad card will allow the voltage to exceed 40 volts, while this second card keeps it down to 33V. The target is 30V, which neither can maintain. Weak amplification would explain this.

I have a Peak Atlas DCA Pro DCA75 tester that will measure amplification, leakage and other factors. I will use that when checking out any suspect transistors. 

Thursday, June 15, 2017

All boards working on IBM 1130 upgrade for light panel

IBM 1130 LIGHT PANEL UPGRADE

All boards are built and I began live testing. Lamp test works properly but for some reason I was not getting the bulb to light with the signal pin at an acceptable voltage. A single instance of the circuit built off board works, so this is a matter of interaction among circuits that I have to address. 

In the 1130, the lamp test line is hooked to all SCR gates with 6.8K resistors, while the individual signal inputs are hooked to the gate directly. The 1130 wiring has 6.2K resistors in series with all signals, thus it appears that the SCR gate is hooked to a voltage divider between the +3V logic signal and the ground level that lamp test is normally holding. 

About 1.5V goes into the SCR gate which should conduct. When lamp test is pulled up from ground to +3V level, the SCR gate conducts. The fact that I don't see the lamp lighting with the input signal is troubling. It only works when the voltage is boosted to 3.26V with lamp test floating or 3.9V when lamp test is at ground.
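
As a rough check on the divider described above, here is a minimal sketch that treats the 6.2K series resistor and the 6.8K lamp test resistor as a simple two-resistor divider. It assumes a 3.0V signal, a lamp test line held solidly at ground, and negligible gate current; the function and constant names are mine, for illustration only.

```python
def gate_voltage(signal_v, series_ohms, pulldown_ohms):
    """Voltage at the midpoint of a two-resistor divider to ground."""
    return signal_v * pulldown_ohms / (series_ohms + pulldown_ohms)

SIGNAL_V = 3.0          # nominal logic level from the text
SERIES_OHMS = 6200      # resistor in series with each signal input
LAMP_TEST_OHMS = 6800   # resistor from the gate to the (grounded) lamp test line

grounded = gate_voltage(SIGNAL_V, SERIES_OHMS, LAMP_TEST_OHMS)
print(f"gate sees about {grounded:.2f} V with lamp test grounded")
```

Under those assumptions the gate sits a little above 1.5V, in line with the figure quoted above. Any real gate current would pull it somewhat lower.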

To hook this into the 1130, I have to accept the constraints of that system. Signal inputs are somewhat less than 3V to fire the lamp, AC supply to the SCR is 7.25V and the serial resistance with the signal inputs is 6.2K. Thus, it seemed the current boards wouldn't work. 

I moved the bulb over one position and the results were completely different! I seem to have a rogue SCR or a flaw in that one circuit. I will now populate bulbs in as much of the board as I can, set the input signal voltage to less than 2V and spot any positions that don't work as intended. Moving bulbs around will let me check out the entirety of each board.


I completed both small board checkouts. Three positions didn't work properly and need to have components replaced, but the remaining 59 worked properly with both 1.41V signal and 1.41V lamp test voltages applied. I am off to the CHM to work further on the power supply regulator card. I am happy that the circuit is sound and these should work well when installed into the 1130 panel.

Tonight I only had time to test a portion of the big board, since it will require about 20 setup and test operations to move 5 bulbs carefully through all 96 circuits. I may need to float the lamp test line when not active, rather than grounding it as the 1130 currently does, since the grounding will drive the signal voltage to about 55% of its value at the SCR gate. Floating will provide the full 100% of the signal voltage to operate the thyristor.

IBM 1401 RESTORATION WORK

We worked on the voltage regulator card but were ultimately stymied by the lack of any 028 and 036 transistors on hand. We have sourced them and can continue with the repair next week. 

Tuesday, June 13, 2017

Finished all 1130 light panel boards, worked on 1401 system at CHM

IBM 1130 LIGHT PANEL UPGRADE

I continued building the final large board today, completing all 96 triacs and resistors before breakfast. I continued with inserting all the lamp sockets and half of the signal pins before it was time to head over to the CHM to work on the 1401 systems. 

I did some testout of the resistors and signal pin wiring, confirming that the first 48 pins and their associated lamp test resistors were installed properly.  After I returned from the work at CHM and evening with the 6800 club at Holders, I finished up the board.
Big board completed
All boards check out, but the power on testing with the limiting resistor will be needed tomorrow.
Three boards in approximate relative position as they will sit inside 1130
IBM 1401 RESTORATION WORK

One of the 1401 systems (Connecticut machine) was down since smoke poured out of the 1406 memory extension box last Wednesday. We removed the power supply and found the part that emitted the smoke.

The power supply has two SMS cards installed, one regulating the output voltage and the other protecting against overvoltage. If the output of the power supply goes too high, a silicon controlled rectifier is clamped across the output. This technique, called a crowbar, will cause a circuit breaker to pop.

In our case, the breakers did trip but took far too long, since the twin 3 ohm load resistors limiting current in the crowbar carried 10 amps each for enough time that they scorched the board underneath and burned insulation off of nearby wires.
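
To put the scorching in perspective, here is a back-of-the-envelope sketch of the power in each crowbar resistor. It assumes each 3 ohm resistor sees roughly the full 30V nominal supply (which is what 10 amps through 3 ohms implies) and ignores the voltage drop across the SCR; the function name is hypothetical.

```python
def crowbar_resistor_load(supply_v, resistor_ohms):
    """Return (current in amps, power in watts) for one resistor across the supply."""
    current = supply_v / resistor_ohms
    power = current ** 2 * resistor_ohms
    return current, power

# Nominal 30V supply across one 3 ohm resistor, SCR drop neglected (assumption)
current_a, power_w = crowbar_resistor_load(30.0, 3.0)
print(f"{current_a:.0f} A and {power_w:.0f} W per resistor")
```

Hundreds of watts per resistor is only tolerable for a very short time, which is why a slow breaker trip does so much damage.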
Trace side of crowbar card

Component side of crowbar card
The cause of the crowbar activation was the failure of the regulator card, allowing the voltage to soar up to more than 40V, instead of the nominal 30V expected from the supply. We don't have any spares for this card type, thus will have to diagnose and repair the card before we can restore the 1401 system to operation. We have replaced one transistor so far but the card is not yet working properly.

Monday, June 12, 2017

Building the IBM 1130 light panel upgrade

IBM 1130 LIGHT PANEL UPGRADE

My PCBs arrived along with the remaining components, while I was on my trip to NY. I found that I hadn't specified the right size hole to mount my turret connectors directly on the board, but I thought I had a workaround that would retain the turret connectors. It did not pan out, so I will be soldering the power wires directly to the board. 


Small board (one of two)
Large board (only one required)
I will begin to build one of the boards to test it on the 1130. First step is to solder down the surface mount resistors, as they are the smallest and closest to the board. Second step is to solder the surface mount triacs in place. Third is to mount the lamp sockets on the bottom side. Fourth is to mount the signal pins on the top side. Fifth is to mount the turret connectors. 

I am concerned about shorts in my soldered lamp holders, since the bulbs have bare wire leads. This vulnerability affected the original IBM boards and will affect mine too, destroying the Triac immediately. I have two ways to address this. 

First, I will work out an insulation scheme that protects the bulb leads and prevents possibility of a short. Second, I will put in a current limiting resistor to the AC line while I am checking out the light circuits one by one, so that I will only have one of three cases for any light circuit:
  1. The lamp lights correctly and all is good
  2. The lamp does not light due to a bad bulb or open circuit, replace and repeat test
  3. The lamp does not light because holder is shorted but the resistor protects the Triac from catastrophe
By Sunday evening I had the small panel for the far right side completed and a number of lamp holders guaranteed to be short free. First, I fitted the board into place to confirm how it sits inside the 1130 pedestal box on the face of the honeycomb. That was a perfect fit.

Trial fit of one small board against honeycomb
As you can see from the board above, not every position is used on the small boards. The first board above only has 27 bulbs out of the 48 possible positions, thus I only installed components on those 27 spots. The middle board, also a small one, has 33 lamp positions utilized. The final, large board has every position implemented, a total of 96 lamps.

I began construction of the second small board, installing all the resistors, triacs and lamp sockets by dinnertime. All that was left were the 33 signal pins and the three turret connectors. Soon those were installed as well and I could move on to the final large board. It is a very long process, soldering 387 components, so I didn't finish this evening.

Tomorrow, I will hook them up to test power with the limiter resistor and check each light circuit. Since my hot resistance of the bulb is around 50 ohms, my limiting resistor to protect against shorts can't be more than about 10 ohms if I hope to see the filaments light.
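
The tradeoff in picking the limiter can be sketched as below. This assumes the test supply matches the 1130's 7.25V lamp supply, uses the 50 ohm hot resistance as a fixed value (a real filament's resistance changes with temperature), and treats a shorted holder as zero ohms; the function name is mine.

```python
def with_limiter(supply_v, bulb_ohms, limiter_ohms):
    """Return (voltage left for the bulb, current through a dead-shorted holder)."""
    bulb_v = supply_v * bulb_ohms / (bulb_ohms + limiter_ohms)
    short_a = supply_v / limiter_ohms
    return bulb_v, short_a

# 7.25V supply is an assumption taken from the 1130's lamp supply voltage
bulb_v, short_a = with_limiter(7.25, 50.0, 10.0)
print(f"bulb gets {bulb_v:.2f} V, a short draws {short_a:.3f} A")
```

A larger limiter would cut the short-circuit current further, but it would also starve the bulb of voltage, so the filament might not visibly glow.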

I am still waiting for my 200 light bulbs coming from China, which I will then have to solder onto the holders to plug into the sockets on my boards.

Sunday, June 11, 2017

Major progress on 1401, 1311, 729 and 1402 restorations

RESTORATION WORK AT TECHWORKS BINGHAMTON

1401 System

Our team arrived at our hotel late on June 6th, worked on the 7th, 8th and 9th, with travel home on the 10th. There were tours, picnics, interviews and other events that took time, but we did get a decent amount of time working on their equipment.

The 1401 system had been previously powered up by the local team, but it was not able to do arithmetic correctly. When we arrived we started to work on that problem. Other problems arose that had to be dealt with, such as when we lost the ability to store the A bit in any position in memory. 

The A bit problem manifested itself as a C bit (check bit) error, which we began tracing through the C bit logic until we realized that the machine was also not holding the A bit, whose absence made the C bit value incorrect.

We found a total of three cards that were malfunctioning, replaced them and had data storing properly again. We went back to work on the addition failure. The machine could correctly add 1 + 2, for example, but not 2 + 2. 

We quickly realized that we had a 'hot' 1 bit, where any arithmetic result would have the 1 bit turned on regardless of its proper value. Thus, 1 + 2 produced 3, but 2 + 2 produced a 5 since the 1 bit was erroneously set. 

We were tracing this from the adder logic itself out to the memory. The way that arithmetic works in a 1401 is that the result character of an addition (or other arithmetic operation) is stored in memory without going into the B or A register. Thus, along with the wrong value, if the 1 bit was not intended to be on, the parity would also fail. The 2 + 2 case stored a 5 (1 and 4 bit) without the C bit since parity should be odd, flagging an error due to an even parity.
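
The interaction between the hot 1 bit and odd parity can be sketched as follows. This is a simplified model of my own, not the machine's logic: it looks only at the numeric bits of a single-digit result from 1 to 9 and ignores the zone bits, the word mark and carries.

```python
def check_bit(value_bits):
    """C bit needed so the total number of 1 bits (value plus C) is odd."""
    ones = bin(value_bits).count("1")
    return 0 if ones % 2 == 1 else 1

def parity_ok(value_bits, c_bit):
    """True when the stored bits plus the C bit have odd parity."""
    return (bin(value_bits).count("1") + c_bit) % 2 == 1

def add_with_hot_one(a, b):
    """Model the fault: C bit is computed for the true sum, but the 1 bit is forced on."""
    correct = a + b
    c_bit = check_bit(correct)
    stored = correct | 0b0001   # the stuck-on 1 bit
    return stored, c_bit

for a, b in [(1, 2), (2, 2)]:
    stored, c_bit = add_with_hot_one(a, b)
    print(a, "+", b, "stores", stored, "parity ok:", parity_ok(stored, c_bit))
```

In this model 1 + 2 still stores 3 with good parity, while 2 + 2 stores 5 with a C bit meant for 4, giving even parity and an error.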

The 1401 uses wired-OR logic, where multiple gates have their outputs shorted together to form an OR of the conditions of the contributing gates. This means when you have the extra 1 bit set, it could come from any of several gates that are shorted together. 

We did lots of oscilloscope work probing the state of various signals in the path from the adder to where it stores in memory. For quite a while, we saw that no set of inputs should produce a 1 output yet it was there. 

To do the scoping, we set up a short loop to set up fields for an addition, perform it and loop perpetually. We had the most success triggering the scope by a signal that is activated when the adder is ready to store its result in memory. 

The 1401 system encodes numbers as binary coded decimal (BCD) characters, but the arithmetic hardware itself uses a system called qui-binary by IBM. Thus, the input digits are converted from BCD to qui-binary, arithmetic occurs and the output digit is converted back to BCD. 

Qui-binary has a five value and a two value section, the quinary (base 5) and binary (base 2) portions. Thus, we had to find the circuitry that assembled the BCD bits from the quinary and binary states. We looked at the first gate generating the 1 bit and found that the adder was giving the proper value. 2 + 2 had only the 4 bit set, not the 1 bit. 
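
Here is a minimal sketch of the idea, assuming the usual qui-binary weights (a quinary part of 0, 2, 4, 6 or 8 plus a binary part of 0 or 1). The function names are mine, and the conversion back to BCD is simplified: it just lists the 8-4-2-1 bits of the value and does not model how the real machine represents zero.

```python
def to_qui_binary(digit):
    """Split a decimal digit into (quinary, binary) with digit == quinary + binary."""
    if not 0 <= digit <= 9:
        raise ValueError("digit must be 0-9")
    return digit - digit % 2, digit % 2

def add_digits(a, b):
    """Add two digits in qui-binary form; return (quinary, binary, carry)."""
    qa, ba = to_qui_binary(a)
    qb, bb = to_qui_binary(b)
    binary_sum = ba + bb
    binary = binary_sum % 2
    # Two binary 1s make a 2, which moves into the quinary part
    quinary_total = qa + qb + (2 if binary_sum == 2 else 0)
    carry = 1 if quinary_total >= 10 else 0
    return quinary_total % 10, binary, carry

def to_bcd_bits(quinary, binary):
    """List which of the 8, 4, 2, 1 bits are set for the result digit."""
    value = quinary + binary
    return [weight for weight in (8, 4, 2, 1) if value & weight]

q, b, carry = add_digits(2, 2)
print(to_bcd_bits(q, b))   # only the 4 bit for 2 + 2
```

In this model the BCD 1 bit comes straight from the binary part, so a correct adder result for 2 + 2 has no 1 bit at all.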

The 1 bit value then transitioned through a small number of gates until it reached a double negative AND gate whose two sections were ORed together and also wire ORed to several other gate outputs. This wire OR output is the drive for whether a 1 or 0 is written in the 1 bit during the current memory cycle.

The top of our double AND gate had the 1 bit value from the adder and the overall signal to write an arithmetic result to memory. The bottom had the value of the 1 toggle switch on the console and the overall toggle switch to manually enter data into memory. Thus, this double gate drives a 1 either because of manual entry or arithmetic results. 
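
In purely logical terms, ignoring the actual signal polarities and voltage levels, the double gate and the wire OR it feeds can be sketched like this. The parameter names are hypothetical labels of mine, not signal names from the ALDs.

```python
def write_one_bit(arith_one, store_arith, switch_one, manual_entry, other_gates=()):
    """Logical model of the wire-ORed drive for writing a 1 into the 1 bit."""
    top = arith_one and store_arith        # arithmetic result section
    bottom = switch_one and manual_entry   # console manual entry section
    # Wire OR: any contributing gate can force the output active
    return top or bottom or any(other_gates)

# A result with no 1 bit and no manual entry should not write a 1
print(write_one_bit(False, True, False, False))
```

Because every contributor lands on the same wire, seeing the output active does not say which gate is responsible. Each input has to be checked in turn.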

The inputs to the manual entry section don't change unless the toggle switches are moved. The inputs to the arithmetic result section were 1 for the 1 bit value and a pulse to store. Since this is a negative AND gate, it only passes a result if both inputs are negative. It therefore should NOT write a 1 into memory.

The wired OR output of this and the other gates showed a positive pulse, writing a 1, at exactly the timing and shape of the enabling pulse for arithmetic result storing. Inputs don't meet the conditions of an AND but the output pulses. 

Swapped the card but no change. Examined inputs to all the other gates wired into this output, but none had conditions that would fire. Swapped each of the other cards just in case, but no change. Looked at the wiring on the backplane near the card. Tested the signals on the card itself, with an extender, to see if there is a socket problem. 

After half an hour of increasingly fanciful hypotheses and tests, looking for some analog issue or hidden path to drive the erroneous 1 output, the problem went away. It was the end of a workday and inexplicably the addition was no longer producing a hot 1 bit in the result. 

We could tell instantly because my looping program encounters the parity error when the hot 1 overrides the intended 0 value for that bit. This shows up as a red light in the storage block on the console panel. When that stopped lighting we checked the stored field and found that 2 + 2 was now 4, not 5. 

We came back the next morning, and extended my program to add multidigit fields, rather than a single digit for each operand. The red light flashed again while the program looped. A look at the result field showed that our problem had simply changed from a hot 1 bit to a dead 1 bit - always a value of 0. 

Thus, 2 + 2 properly produced 4 but 1 + 2 produced only 2, not 3, because the 1 bit was permanently set to 0. The scope went back on and we began tracing signals again. At this point, I noticed that the input to our double AND gate, arithmetic results section, was at ground potential. Since this is a T level logic signal, the only valid values are -6V and +6V.

I looked at the ALD page and saw that our input to the double AND comes from another logic compartment. The signal moved over our backplane to a paddle card that would route the signal to the other compartment. I checked continuity with a meter to the paddle card. 

Since continuity was good on the original compartment (01A3) we moved to the arithmetic unit compartment (01B3) and verified continuity over the cabling between compartments. In fact, we traced it all the way to the output pin of the card that produces the arithmetic 1 bit value. 

The output of the card was at ground (invalid level) but the input to the gate was valid and correct - either a 1 or a 0 depending on the arithmetic result. We swapped that card with a spare and resolved the problem. Apparently this card was producing the hot 1 bit through some weird failure mode and then suddenly got worse, yielding the permanent 0 value for the 1 bit.

We proceeded to check out many variants of arithmetic - different length fields, carries, and subtraction for example. After this proved arithmetic is good, we went on to check other instructions. Among the instructions tested successfully were:

  • Move
  • Compare
  • Branch
  • Branch when Equal
  • Add
  • Subtract
  • Set Word Mark
  • Clear Word Mark
  • Move zone
  • Move digit
  • Zero and Add
  • Read a card

As far as we can tell without running the complete and comprehensive diagnostic tape, the 1401 is fully operational. 

1311 Disk Drive

The 1440 system came with a 1311 disk drive that so far was only able to spin the platters. The arm could be manually pushed out over the disk surface but the heads never loaded (lowered to fly on the surface). Iggy worked on this, beginning with a careful inspection and full cleaning of the disk heads and disk pack.

He discovered a misadjusted microswitch, several missing logic cards and a few other things over the course of the three days. After one day, the drive would sequence up to the point that it moved the arms all the way to the inner cylinder, but was not jumping back to the outer cylinder and loading.

By the time we left, the drive completed its sequence, loaded the heads and was fully operational as far as we could tell with the limited testing we completed.

729 Tape Drive

Iggy pulled out one of the tape drives to work on. He found a failed microswitch that kept the vacuum pump from operating, a few other problems and then had the motor that lowers the head onto the tape fail to spin. He determined that the motor itself works but the relay to control it is not operating properly. Since he didn't have documentation for the drive he couldn't finish getting it working.

1402 Card Reader/Punch

The local team were concerned because they had found fragments of rubber belts in the bottom of the machine, but had no spares to install. Frank examined it carefully and found that the only two belts which were missing were both for the punch side. One is critical, as it drives contact breakers in time with the feeding process, but the other is only needed to move the stacker rollers for punch output. As long as one can accept that all punched cards will fall in one stacker, it isn't needed.

We were able to trigger a read reliably by issuing the appropriate 1401 instruction (op code 1) although the data may not be scanned in properly due to a premature reader stop. One cause of this is that the alignment pins to hold down the first reader block were sticking, thus not holding the brushes fully in place.

Frank was able to rebuild the alignment pin mechanism. The brushes in the 1402 are kind of scraggly, so we will send some spare brushes to this museum after we return home. Another problem was that doing a non-process runout (NPRO) operation didn't reliably trigger the read clutch, which we attribute to a problem with the relay logic that drives the 1402.

The machine has many relays which sequence through operations such as reading, NPRO, punching and handle conditions like the hopper emptying. The contacts tend to oxidize over time if not used. We couldn't look at the suspect relays because we didn't have the documentation to tell us which relays were involved. We will send the museum a relay tester that has helped us find and fix bad relays for our 1402s.

Wednesday, June 7, 2017

Working on 1401 and other gear at museum in Binghamton, NY

ON EAST COAST AND HELPING AT TECHWORKS MUSEUM BINGHAMTON

I have been out of touch for a few days while traveling and helping the teams at TechWorks! in Binghamton with their 1401, 1440 and other IBM gear. Two way exchange of advice and ideas plus a chance to see some historic IBM and other gear. One non-computer example is part of the lunar module simulator used by NASA to train the Apollo astronauts for their landings on the moon.

We are currently chasing a problem in the 1401 system here which causes Add instructions to inject a 1 bit into any result. If the result of the added characters already had the 1 bit set (was odd) then no harm, no foul. However, if the result was even, the added bit causes a parity error in the work going into memory.

The signal only passes through a few gates from the point where the adder converts the qui-binary result into BCD, with a signal called Arith 1, to the point where it drives the inhibit line on the 1 bit core planes. The way core works, inhibit lines must be active to keep a core from flipping on, leaving it at the zero state created during the readout of its previous contents. If no inhibit, then the core is set to 1 at the end of the memory cycle.
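
The role of the inhibit line can be sketched for a single core as below. This is a simplified model of my own of the read-then-write memory cycle described above, not the 1401's actual timing.

```python
def memory_cycle(stored_bit, new_bit=None):
    """One cycle for one core: destructive read, then write back or store a new bit."""
    read_value = stored_bit   # readout senses the old contents...
    core = 0                  # ...and leaves the core at zero
    to_write = read_value if new_bit is None else new_bit
    inhibit = (to_write == 0)   # inhibit must be active to keep a 0
    if not inhibit:
        core = 1                # no inhibit: the core is set to 1
    return read_value, core

# Storing a 0 with a failed (inactive) inhibit would leave a 1 instead
print(memory_cycle(1, new_bit=0))
```

In this model a missing inhibit always shows up as an unwanted 1, which fits the symptom of an extra 1 bit in every result.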

Our tools were inadequate to watch the signals and find the spot where it is failing, but we will have access to a more modern storage scope tomorrow when we hope to find and eliminate the problem.

We will also help inspect and clean some 1311 disk drives, as well as one of the 729 tape drives for the system. We have already done some inspection of their 1402 reader/punch and are finding the part number for one missing rubber belt.

Thursday, June 1, 2017

Making good progress on disk cartridge writing but not there yet

IBM 1130 PANEL UPGRADE

I continued to solder bulbs onto headers to use with my new PCBs, including one of the original size bulbs used by IBM in the 1130. It is possible that with great care I could use original size bulbs and get them to fit into the honeycomb blocks, but that won't be necessary if my newly ordered mini bulbs arrive from China.


Two mini bulbs and one original sized, installed on headers

ALTO DISK TOOL

I made some adjustments to the logic related to when the new parallel word is prepared for access by the serializer, as it will be important to having the checksum written out properly. With these changes it appears I am writing the sector correctly.

The timing is still off, with the crystal on my board running 9+% slow. I bought some LVCMOS oscillator modules and stuck one on the board, but the clock did not run with it in place. A quick look at the documentation gave me the correct pin for that clock signal - U9 - and I reran the tool chain and tested again.

The error persisted, so I dug deeper and found a spare cycle in my clock FSM. I fixed it. Measuring again, I found the inter-record time (from sync bit to next sync bit) to be 144, close enough to the 134.4 theoretical timing. I will put the clock module on the other two fpga boards so that they all run at the (same) intended speed.

Next up was a validation that the checksum is working properly. To address this, I set up a spreadsheet with the eight words of the label record and calculated the XOR of those with the seed value 0x0151. It came to 0xA32C.
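
The same spreadsheet check can be sketched in a few lines. This assumes 16-bit words and a plain XOR fold starting from the seed, as described above. The actual label record words are not reproduced here, so the example uses the two all-zero words of the first record instead.

```python
from functools import reduce
from operator import xor

SEED = 0x0151  # seed value quoted in the text

def checksum(words, seed=SEED):
    """XOR all 16-bit words together, starting from the seed."""
    return reduce(xor, (w & 0xFFFF for w in words), seed)

# Two zero words leave the seed unchanged
print(hex(checksum([0x0000, 0x0000])))
```

Feeding the eight label record words through `checksum` should reproduce the spreadsheet value if the tool and the spreadsheet agree.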

The value being emitted by the disk tool is 0xA32C so this is working well. I ran over to Marc's to give it one live trial before I leave tomorrow for the visit to the TechWorks museum in Binghamton, NY.  The attempt to write an entire cartridge failed with an overrun error, where the write is still active when a sector mark arrives.

Digging into the disk tool writing function and building lamp holders for the 1130

ALTO DISK TOOL

I know just about where the logic is going wrong, but not yet why. I added some more diagnostic output from the fpga board and set up for testing. I wanted to see either the output data word being loaded into the serializer or the running checksum.

I can see the two correct words of the first record being written out (both 0x0000) and the running checksum is correct through these as well. It is when the logic shifts to write out the checksum that I find the value going awry. This points me right at the logic where I transition from having written the last data word and set up the checksum as the output word.

I also looked to check the data words being written out for the second (label) record and the beginning of the third (data) record. Everything was correct - the preamble and postamble, the sync word, and the contents of those records matched exactly the cartridge image I had downloaded into the fpga board.

This leaves only the checksum to resolve and I should be able to write a cartridge image with my tool and boot it on a real Alto. I hope I can nail this before I leave for my visit to the east coast, where a few of the 1401 restoration team will visit with the Center for Tech Innovation (TechWorks) in Binghamton and share ideas as they restore their 1440, 1401 and raft of peripherals.

IBM 1130 DISPLAY PANEL UPGRADE

I have begun building all the plug-in incandescent lamp units, soldering mini bulbs to a 2 pin header that fits into the sockets which will be on my PCBs. I had hoped to have enough mini bulbs to avoid any of the regular sized bulbs from the existing panel, since these have relatively tight clearances in the honeycomb blocks.

Once I looked over my inventory of bulbs, however, I found that I only had about half the quantity I need. I did find a source for a couple hundred mini bulbs of the type I need, which should be shipped from China and arrive in mid June when I am assembling the panel boards.

I have about 40 more mini bulbs on hand, which I can solder onto headers in advance of receiving the shipment from China. This would allow me to stock one or both of the smaller PCBs and do testing even if the new bulbs haven't arrived.