Sunday, December 31, 2017

Repaired Roomba wheel but have to deal with a dead battery pack

REPAIRING AN ORIGINAL MODEL ROOMBA VACUUM

I used a file to cut down the four posts on the gearbox to allow the wheel to rotate with less friction. After reassembly both wheels seemed to have similar resistance to manual movement. I turned on the Roomba and it worked a bit but the battery was too dead to continue.

Gearbox posts filed down, before cleaning and reassembly
After a long charge I ran it again, but it died pretty quickly which tells me the battery pack is clapped out. The original pack delivers 14.4V using NiMH technology and the Roomba charger is notorious for killing batteries if left hooked up too long - since the charge circuit is quite naive. 

I was going to open it up and replace the cells inside, perhaps with Lithium Ion cells, but it is secured with triangular security screws. A set of screwdrivers is on the way from Amazon so I will continue this towards the end of the week. Once I get inside I can decide how to make a workable substitute.
Battery pack with security screwheads

Saturday, December 30, 2017

Worked on Roomba, archived large batch of Xerox PARC disk cartridges

REPAIRING ORIGINAL MODEL ROOMBA VACUUM

I received the wheels I ordered from eBay and found they are close to a suitable replacement but there seems to be a bit more friction with the new wheel. I put the Roomba together but it turned in a circle then signaled a failure with a pattern of beeps.

I took it apart as there is too much friction on the new wheel. To fix this, I have to shave some of the plastic off of the gear case housing that fits inside the wheel, allowing it to turn more freely. 

The gear case housing has four posts spaced 90 degrees apart around the case, used to screw it down onto the wheel carrier. The post is a cylinder whose radius is large enough that it rubs against the inside of the new wheels. 


gear casing showing 2 of 4 cylindrical mounting posts
The new design wheel has a thicker plastic rim for the wheel, at a somewhat smaller radius than the original design, which likely delivers a long reliable tread life for the rubber on the outside. 


New wheel on left, original wheel at center and tread fragment on right
The cylindrical post has a hole in the center along its axis, through which a screw passes to hold the case down against the carrier. The walls of the cylinder are relatively thick and could tolerate being filed into a D-shape rather than the circular profile seen edgewise. The flat face of the D will provide more clearance from the flattened cylindrical post to the inside of the replacement wheel rim. 

ARCHIVING XEROX PARC DISK CARTRIDGES

Aligning second Diablo drive and setting up the disk tool

We moved quickly through the process of aligning the replaced heads on the second Diablo disk drive, soon getting both top and bottom heads centered over the signal on track 105 of the CE (Alignment) cartridge.

Next we hooked up my fpga based disk archiving tool, powered it up and checked out its operation. This is designed to read an entire cartridge into the onboard RAM with one push of a button, a process that takes under a minute. Too, it will retry up to 31 times if it encounters any checksum errors while reading any sector.

After a cartridge image is in the RAM on the fpga board, we transfer it over USB to a PC file, process it with a Python program and the result is a disk image file that can be used with Alto simulators or transferred to a different physical cartridge.

Archiving all the cartridges on hand

We had 22 cartridges sitting in Marc's basement that had not been archived, a few of which had suffered disk crashes or had damage on the surface as we inspected them in prior sessions. We ran through all the cartridges which looked clean first, then attempted to extract the data from the damaged cartridges last.

We set up an assembly line process. Marc would open each cartridge for inspection and cleaning. Ken and I would alternately run the disk tool to create the first PC file type and process them with Python code into the standard simulator format. With each cartridge image, I booted it up under Contralto (the Alto simulator created by Living Computer Museum and Labs) to verify its operation.

Marc would wipe the cartridges down with 99% isopropyl alcohol and lint free cleaning supplies. This involved removing eight screws and lifting off the bottom cover of the plastic cartridge housing, then holding the disk platter by its central metal hub while cleaning both surfaces.

Ken inserted cartridges, spun them up to a ready state, used the tool to extract the image, and so forth. After creating the first format disk file of each image, it was put on a USB and carried over to a different laptop where the file could be transformed to its final format. Ken also inspected the heads for accumulating oxide debris or any sign of a crash.

Each time the heads became visibly dirtied, we cleaned them with swabs and isopropyl alcohol until they were ready to use again. This happened a couple of times during the processing of this large batch of disk cartridges.

The initial file format for disk cartridges is a sparse structure, where each sector, head and track is placed on boundaries that are even multiples of 2. That simplifies the logic inside the fpga, avoiding any need to multiply, but the final format for simulators has each sector packed contiguously with no empty zones between them.

Thus, the Python program reads the files and converts the format, essentially packing the sectors. It takes a file that is over 6MB in size as input and produces the 2.5MB final image. That file is tested with Contralto to see if it boots.
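
The packing step can be sketched roughly as follows. This is not the actual conversion program. The geometry, slot size and payload size are made-up toy values, the names `sparse_offset` and `pack_image` are hypothetical, and the sketch assumes that the sparse boundaries are powers of two (so an offset can be formed with shifts rather than multiplication) and that the packed image is ordered by cylinder, then head, then sector.

```python
# Toy geometry for illustration only; these are NOT the real tool's values.
CYLINDERS = 4
HEADS = 2            # fits in 1 address bit
SECTORS = 3          # real sectors per track
SECTOR_SLOTS = 4     # sector index padded up to a power of two (2 bits, assumed)
SLOT_BYTES = 16      # padded slot per sector in the sparse file (4 bits, assumed)
PAYLOAD_BYTES = 10   # bytes of real data at the start of each slot (assumed)


def sparse_offset(cyl: int, head: int, sector: int) -> int:
    """Byte offset of a sector's slot in the sparse file, using only shifts and ORs."""
    slot = (((cyl << 1) | head) << 2) | sector
    return slot << 4


def pack_image(sparse: bytes) -> bytes:
    """Copy each sector's payload out of its padded slot into a contiguous image."""
    packed = bytearray()
    for cyl in range(CYLINDERS):
        for head in range(HEADS):
            for sector in range(SECTORS):
                start = sparse_offset(cyl, head, sector)
                packed += sparse[start:start + PAYLOAD_BYTES]
    return bytes(packed)
```

With these toy numbers a 512 byte sparse file shrinks to 240 bytes, the same kind of reduction as the 6MB to 2.5MB conversion, only much smaller.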

Some images did not boot, but were verified to be an image of a dual pack file format used with the Alto. That dual pack format requires both cartridges to be mounted (in two virtual Diablo drives) in order to boot up and read the contents. These single images are saved for the time when we can read and archive its mate.

A couple of images booted up with password protection, so that I couldn't look at the list of files. Ken has a task to examine and break the protection, allowing us to access that content in the archives. At least we know they boot up fine and should have worthwhile content once the password block is bypassed. 

Wednesday, December 27, 2017

Working on reader stop problem on German 1401 at CHM

CHM RESTORATION OF IBM 1401

Reader stop during Read/Print operations

The German machine has exhibited an intermittent problem for quite a while, where the 1402 reader suffers a reader stop when programs are repeatedly reading and printing. 

This was first discovered with a program that used the combination operation code 3, which is a combination of the 1 (read a card) and 2 (print a line) operations. It prints a line from the print buffer at 201 to 232 then reads a card into the card buffer at 1 to 80. 

A program can use the 3 op code to read a card, move the data from 1-80 to 201-280 and then go back to issue another 3 to do it again. This tight loop would work listing some small number of cards then experience a reader stop.

We discovered that it is not only the 3 op code that fails, because a tight loop of a 1 (read), a move from 1-80 to 201-280 then 2 (print) will also fail after some number of cards. 

Stripped to its essence, we found that a program of 1, 2, 1, 2, 1, 2, 1, Halt will fail on the last read most times it is executed. We use that code to try to debug the fault.

The logic to read a card has a section that requests a card feed but will block that temporarily if a number of conditions exist. One of these is a signal that the printer is actively using core memory to transfer the contents of 201-232 to a dedicated core memory called a print buffer. 

While the printer is moving data from main core memory to its dedicated buffer, it is triggering reads of locations 201 to 232. This could interfere with a card reader if a card is moving through the machine. That is because the card reader requires the 1401 to read locations 1 to 80 once for each row of the card. 

The reading of locations 1 to 80 while cards are reading is called a scan and each scan of 80 columns is triggered by the card moving past a row. The physical movement of the card determines when this happens, or actually the timing wheel inside the reader. 

If the core memory is busy doing printer transfers, we lose the chance to scan the 80 columns and thus any holes in that row of the card are lost. The correct action if that happens is to cause a check condition - reader stop - because data integrity was lost. 

In the code 1, 2, 1, 2, 1, 2, 1, Halt you might naively assume that each instruction is completed before the next one is executed, but print operations appear to end much earlier than the actual printing operation has completed. Thus, the next 1 (read) will be executing while the printer is still moving. 

We suspect that the reader stop occurs due to failure of timing interlocks that should block the 1402 from clutching to move a card until the printer is done transferring. 

We looked at the path from the flipflop that indicates the printer is active accessing core, through the gating logic that blocks clutching the card reader until the transfer flipflop turns off. 

Drawing of the signal path to hold off card reading during printer transfers
We could see the signal from the flipflop as an input to a +C0 gate, one that translates the +U level signal from the flipflop into a -T level signal named -T PR INTLK RD.

The -T PR INTLK RD signal flows through an OR gate (the odd triangle signal) which handles the multiple conditions that should block reading a card, then into the AND gate which gates a Feed request to produce the Read Clutch Magnet activation. 

The signal on the output of the +C0 gate, which should flow through the OR, did not seem to pulse upward in spite of the input signal pulsing downward. Admittedly we could have had difficulties with the scope trying to watch it, but it did not appear to jump up from -6 to +6V (T level logic). 

Thus it would be failing to hold off the Feed request, starting a card read too early such that its first row would trigger a scan while the printer logic is still busy reading from 201-232. 

We swapped the +C0 card but the problem persisted. We swapped the OR gate (actually it was a -A0 which is a NOR) but the problem persisted. We then turned our attention to the wired-OR at the output of our +C0 gate which comes from the Overlap logic in section 74 of the ALD. 

Looking to the overlap logic page, we found two more +C0 gates that are tied to the original +C0 from the printer. 

First anomaly we noticed is that one of the two +C0 gates in the overlap page had a pullup resistor, but so does the original +C0 in the printer logic page. A wired-OR net should NEVER have more than one pullup active, but we have found a few cases already where IBM violated this design principle, including here.

We isolated the two +C0 gate outputs from the overlap page, using a card extender, so that neither was tied to the original +C0. The behavior got much better - our test program of 1, 2, 1, 2, 1, 2, 1, Halt would work correctly much more often. 

Unfortunately, we did still encounter some reader stops. Pulling those cards improved the situation but did not correct it fully. Too, the signal on the output of the +C0 was still not pulsing upward properly. 

At this point, it was time to hand the machine over to the demo team. We have to dig further into this problem of the +C0 output from the printer logic.

In addition, we should work backwards through the logic triggering the reader stop to be sure this is caused by what we suspect. If not, we may zero in on the fault from that direction.

Friday, December 22, 2017

Repairing a Roomba vacuum, aligned Diablo drive for Alto, demonstrated IBM 705 tube module function

ROOMBA RESTORATION

I have an old Roomba vacuum cleaner, the original model, which I decided to haul out of storage and refurbish. First problem I discovered was broken tread on one of the wheels. I think otherwise this just needs to be cleaned out well - hairs and dust get wedged in the roller bearings and other parts of the machine over time. 


Segment of failed tread from Roomba wheel
Alas, iRobot no longer makes replacement wheels or tread for this model, nor do any third parties. Apparently tread failure was a common issue with the original Roomba based on various postings online. 

It appears that the next model, the Discovery series, uses a similar enough wheel assembly that I am hoping to swap parts onto my machine. I ordered tires on eBay that were removed from a Discovery model and should have them in about a week. 


Mounted wheel that lost its tread
Visually they appear to be compatible, based on the hole pattern for mounting and design, but since I couldn't find any dimensions, they could turn out to be proportionally smaller or larger and thus unsuitable. 
Replacement tire from discovery
I managed to maneuver the wheel out of the chassis enough to begin the replacement, once I get the used tires from the eBay seller. The machine will sit here waiting for the parts, although I can spend some time cleaning out dust and hairs from the rollers and other parts while I wait.


Wheel out of chassis and ready for replacement
SESSION AT MARC'S HOUSE

Aligning Diablo drive to continue archiving Xerox PARC cartridges

We set up the oscilloscope and the setscrew in the Diablo drive, ready to align the disk heads. This works by viewing the signal from a specially recorded Alignment Cartridge. That cartridge has one pattern recorded at Cylinder 100 and a different one at Cylinder 105. The latter location is the one for the 2200 bpi disk drives such as the one used with the Alto.

Label on special alignment cartridge for signal to match on scope
The heads are held loosely in place with only slight tension on the lockscrews. The head assemblies have a diagonal notch that is pushed by a setscrew, thus pushing the head outward towards higher track numbers as the setscrew is advanced.

The initial location of the heads is about 5 cylinders too low, with the setscrew moving the head over a range of at least 10 cylinders. Watching the scope, we saw no signal until we neared the proper cylinders. Of course, we had to skip over the incorrect signal on Cylinder 100 in order to move the head onto the proper signal for our drive type.

It wasn't hard to get the heads placed at the right point, where the special signal produces a symmetric pattern. When off center, one lobe or the other is larger, thus alignment is achieved when the peaks are balanced.

We hooked the drive up to our Xerox Alto, put in a cartridge from PARC that we had previously archived, and successfully booted it up. To ensure we had the alignment right, we ran the program scavenger which double checks all files using both the file directory and the label fields of each sector. 

Results were perfect and the heads remain clean as a whistle at the end of the session. Next week we will start up our production line to archive more of the Xerox PARC cartridges.

Demonstrating function of IBM 705 computer tube based module

I had a module I bought years ago on eBay that Ken Shirriff took to work out the function implemented by the unit. Based on the circuit and voltages used, it would have been in either a 702 or 705 computer of the mid 1950s. Since the IBM 705 was sold in much higher numbers than the 702, we assume it was in the 705.

Tube module from IBM 705 computer, powered and under test
The circuit as he worked it out provides five independent debouncers and five cathode followers. The debouncer takes a signal from a mechanical switch or relay contact and removes the rapid multiple contact and release that occur when the contacts are changing from on to off or vice versa. These multiple rapid changes are referred to as switch bounce.

If a switch is intended to step a computer to execute one instruction per press, but the contacts have bounce, then it might actually execute multiple instructions. A debouncer will detect the first change in state on the contacts and ignore any changes for a short interval of time. This converts the switch activation to a single output when a human activates it one time.
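
The same behavior can be imitated in software, which may make the idea clearer. The sketch below is only an analogy and says nothing about how the tube circuit achieves it. The function name `debounce`, the sample format and the hold-off value are all assumptions chosen for illustration.

```python
def debounce(samples, holdoff):
    """Return the accepted state changes from a bouncy contact.

    samples: list of (time, level) pairs, level being 0 or 1.
    holdoff: interval after an accepted change during which
             further changes are ignored.
    """
    events = []
    state = samples[0][1] if samples else 0
    blocked_until = None
    for t, level in samples:
        if blocked_until is not None and t < blocked_until:
            continue  # still inside the ignore window
        if level != state:
            state = level
            events.append((t, level))
            blocked_until = t + holdoff
    return events


# One press and one release, each with contact bounce (times are arbitrary units).
raw = [(0, 0), (10, 1), (11, 0), (12, 1), (13, 0), (14, 1),
       (100, 1), (200, 0), (201, 1), (202, 0), (300, 0)]
clean = debounce(raw, holdoff=20)
print(clean)  # [(10, 1), (200, 0)]
```

The raw samples contain four rising edges, which is what would make a single-step button run several instructions, while the debounced output contains exactly one.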

We cobbled together multiple power supplies and cables to provide the module with multiple voltages required by the design. We couldn't provide exactly the levels intended, but the maximums we could attain were likely to work according to an LTspice circuit simulation run by Ken.

Cobbled together demonstration of the tube module
The module used 6.3VAC to power the two filaments in each of the eight tubes. Each debouncer uses the two triodes in an individual tube, accounting for five of the eight tubes installed. The cathode follower requires only one triode of the pair in a dual triode tube, thus five cathode followers need only 2.5 dual triode tubes.

The filament power burned about 27W, while the remaining circuitry on the module would average around the same for a module total of just over 50W. A computer with 1,000 modules like this would draw 50 kW of power, more when you consider the losses in power supplies that deliver the needed voltages.
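
For anyone who wants to check the arithmetic, here is a small worked version. The 27W figure for the non-filament circuitry is an assumption standing in for "around the same", and the variable names are just for illustration.

```python
FILAMENT_VOLTS = 6.3
FILAMENT_WATTS = 27.0   # total filament power for the module
OTHER_WATTS = 27.0      # assumed: "around the same" for the remaining circuitry
MODULES = 1000

debouncer_tubes = 5 * 1.0   # each debouncer uses both triodes of one tube
follower_tubes = 5 * 0.5    # each cathode follower uses one triode of a pair
tubes_in_use = debouncer_tubes + follower_tubes   # out of the 8 installed

filament_amps = FILAMENT_WATTS / FILAMENT_VOLTS   # total filament current
module_watts = FILAMENT_WATTS + OTHER_WATTS
system_kw = module_watts * MODULES / 1000.0       # before power supply losses

print(tubes_in_use, round(filament_amps, 1), module_watts, system_kw)
```

Under these assumptions the module uses 7.5 of its 8 tubes, the filaments draw a bit over 4A in total, and a thousand modules land in the 50 kW range before supply losses.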

The module used +140V, -60V, -130V, and 48V but we could provide only +120V, -60V, -120V, and 30V with the boxes we had on hand. To show the results, we routed both input and output of the debouncer to an oscilloscope and in parallel to a pulse counter.

When pushing the input button to route 30V into the tube module, we saw typical switch bounce on the scope and our counter would jump several digits for each push. The output of the tube module, on the other hand, had a single well defined pulse for each switch push and the counter advanced by exactly one.
Input on bottom with switch bounce, top is output of circuit
One last part of the circuit was an output to drive a neon bulb. When we wired up a bulb, we saw it light up when the switch was pushed and extinguish after the switch was released. We had demonstrated the functions of the module and Marc has enough material to edit together an interesting YouTube video.
Neon bulb driven by circuit
Ken has a similar module, although his is from an IBM 709. The module he owns houses one bit of each of three registers in a dynamic memory. That is, the register requires a 1 MHz clocked signal to retain the state of each bit. It also has combinatorial logic gates to route signals and control the registers. This module is part of the arithmetic control unit which is at the heart of a scientific computer like the 709.

It requires even more voltages, and somewhat different values, than the 705 module we just demonstrated. We have to provide two 40V one megahertz clocks with a fixed phase difference, plus develop some logic inputs to show off the module function to best effect. That is a future project. 

Wednesday, December 20, 2017

Both IBM 1401 systems fully operational after work today

REPAIRING IBM 1401 SYSTEM AT CHM

A couple of the team members attacked the door latches for the 1402 card reader/punch, which use cables to transfer a pull on a pivoting handle, inset into the door, to rotate a keeper plate out of the way of a fixed tongue on the machine frame. With the keeper pulled up, the door can be swung open by the handle. 

The cable became too loose, thus unable to pull the keeper out of the way. The door remained latched closed - actually both the left and right doors suffered the same malfunction. The team was able to reach through the top openings of the machine and get down to snag the cable, pulling it enough to open the doors. With them open, adjusting the cable tension properly was an easy task.

Meanwhile, we focused on the 1401 system which refused to power up. This is the one which originally suffered some failure that knocked out the bottom 4000 characters of memory, but during the repair of that fault, a bad card inserted into a socket caused something to fail in the power supplies.

We had found last week that the -6V power supply was producing no power. Some component testing identified one shorted power transistor and a couple that seemed suspect.

I replaced the six power transistors in the supply and did a quick bench test with 110V supplied. The power supply looked good so we reinstalled it into the machine at the end of our session last week. Alas the system still didn't power up. 

Today we saw that the -6V supply was still missing, but a quick check of the input 110V supply showed that it wasn't receiving input power. The input to this supply is routed through one of the power sequencing relays which hadn't activated.

Starting at the beginning, with relay 1, we should see the +6V and +30V supplies verified. The relay coil is hooked to +30V on one side and to the power sequence card on the other. The sequence card will ground the other side of the coil if a transistor switches on. That transistor has a resistor network acting as a voltage divider between +6V and -20V. Thus, if both +6 and -20 are active, the transistor is biased to conduct and will switch on relay 1.

We tested the voltage coming through relay coil 1 and found it was not present. The +30V supply was working fine, but 30V didn't come through the coil.  Tracing voltages, we realized that the 30V was interrupted somewhere. Our schematics for the machine showed a direct wire to 30V, but clearly that wire wasn't working properly.

Looking over all our schematics, we found that the German 1401 system has a circuit between +30V supply and the relay 1 coil, not simply a wire. Digging further, we found other schematic pages showing the same circuit for the Connecticut system. The circuit was labeled as the 18V differential memory supply. Since our failure came with a bad card in the memory circuitry, this seemed promising.

Looking closely at the fairly simple power supply circuit, it highlighted the bizarre way that IBM sometimes labels things on their machines. The circuit is fed +30V from the power supply, returns the +30V to the relay 1 coil, and drops the voltage down to +12V to feed the memory driver cards.

So, a circuit that drops 30V down to 12V is called an 18V differential supply - as 30 minus 18 is 12. It has a potentiometer to make adjustments, but the label next to it says 18 when in fact you adjust it for 12V. Words fail me.

We verified that +30V went into the supply, but the 30V back to the relay coil was missing. One schematic showed a fuse, directly between the 30V input and rest of the circuitry. We pulled the supply out, looked at a seemingly intact glass fuse but the VOM quickly confirmed that it was blown.

With the fuse replaced, the machine came right up. Memory works, power works and all is well. In hindsight, the -6V supply that we repaired was probably limping along adequately on the 5 of 6 working transistors, but it is now fully refurbed and at factory strength. If we had found the blown 18V differential supply fuse first, hidden deep inside with no signs, we might never have looked at the -6 supply at all.
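
The conditions for relay 1 that we traced through can be summed up as a small piece of boolean logic. This is only a sketch of the reasoning described above, not a model of the actual circuit, and the function and parameter names are made up.

```python
def relay1_energized(plus6_ok, minus20_ok, plus30_ok, diff_supply_fuse_ok):
    """Return True if relay 1 would pull in, per the description above."""
    # The sequence card transistor is biased on only when both the
    # +6V and -20V ends of its divider network are present.
    transistor_on = plus6_ok and minus20_ok
    # The +30V reaches the coil through the 18V differential supply,
    # so a blown fuse there removes the coil voltage.
    coil_has_30v = plus30_ok and diff_supply_fuse_ok
    return transistor_on and coil_has_30v


# The situation we found: every supply good, but the hidden fuse blown.
print(relay1_energized(True, True, True, False))  # False
```

Seen this way, a healthy +30V supply was never enough on its own, which is why the direct-wire schematic sent us in the wrong direction.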

Saturday, December 16, 2017

Miscellaneous projects and tests

SESSION AT MARC'S HOUSE

We had a short session on Friday and consequently only had time for a few tasks.

HP 2116 in rack

We used a hydraulic lifting cart I brought over to lift the extremely heavy HP 2116 computer up high enough to slide into the rack. It may weigh 200 pounds and is unwieldy to lift. The computer is installed now, ready to begin restoration activities.

It was cleaned and examined closely. We found it had 16K words of core memory installed, which is the largest configuration that fits solely in the processor cabinet. Some machines could be expanded up to 32K words but the extra memory had to be housed in an additional module.

IBM 705 (or 702) tube module experimentation

I have a tube module from my private collection which we believe was used in an IBM 705 commercial computer. The module could have been used in the 702 system but the total numbers of 705 systems dwarfed the shipments of 702 machines, thus it is most likely from a 705.

The unit mounts eight vacuum tubes on a frame with components wired below the tube sockets. Many types of these existed, but we worked with the one that I had bought years ago on eBay.
An example of a module from a 702 or 705 system
Ken Shirriff traced the wiring and determined that my module implements five contact debouncers and five cathode follower driver circuits. The picture above is NOT my unit. Ken also owns a sample module, but his is more like the one pictured, having glass diodes and implementing a number of logic gates.

We are going to hook up the module and demonstrate it operating, taking a pushbutton or other noisy mechanical contact input and producing a cleaned up single pulse output. The cathode follower stage will drive a neon lamp, a typical role of that circuit in a 705 system. Think of this module as supporting operator console pushbuttons and display lights.

We will use -60, -120 and +120V power supplies to energize the circuits. Our input will be a lower voltage source through the switch, perhaps 10 or 20V. We expect to take pictures of this and release a YouTube video. If this goes well we will look at a more ambitious demonstration of the logic gates from Ken's module.

Alto Ethernet Tool tests

Al Kossow built a few of the ethernet tool units and came over to get Ken's latest firmware and test them out on the Alto. Ken had discovered the need for an impedance matching resistor on his boards, something not included in the plans used by Al, because when Ken used longer ethernet cables the Alto failed to reliably detect signals from his board. With the resistor added to Al's modules, they worked just as well with a variety of ethernet cables.

One of Ken's boards, prior to install of terminating resistor
The tool will do routing of communications but still lacks bridging, where the Alto hooked to a tool can communicate with other Altos (and tools) acting as part of a single Alto network segment. Routing will allow different network segments to communicate, a related but different functionality. Once the bridging support is working it will be flashed into all the network boards in use.

Wednesday, December 13, 2017

HP 2645A tape units tested, IBM 1401 memory problem diagnosed

HP 1000 SYSTEM RESTORATION

Testing 2645A CTUs with RTE

I brought up my system this morning to test out the ability to access my minitape cartridges (CTUs) on the 2645A terminal. They were generated as logical units 16 (left) and 17 (right). I successfully wrote a file from RTE down to the tape unit then was able to read it in local mode on the terminal.

Terminal displaying downloaded file from left minitape unit
For some reason, my right CTU is stalling when I insert a tape. It won't respond to rewinds or reads, instead flashing a 'STALL' error message on the terminal. I will need to look into this at a later date.

I tested my 2622A terminal from the tape diagnostics and verified that it worked properly, but when I tried to use the second terminal under RTE the I/O hung. I suspect it is something like ENQ or other protocol issues that are not configured properly (yet) on the terminal.

As I have mentioned before, RS232 based communications are an enormous pain in the rear because of the many permutations of signals and protocols that can be used and the inadequate documentation in most manuals about which options were chosen.

REPAIRING 1401 MEMORY OPERATION AT CHM

Armed with my diagrams and other research on how the memory system worked I arrived to work on the problem on the Connecticut 1401 system. The first 4000 characters of memory appear as all zeroes, with a parity error, regardless of the address within the range.

The likely area of failure was in the logic that drives pulses through the matrix switches in order to generate pulses on the X and Y wires to a core on each core plane. We checked out the timing and control pulses that are inputs to the drivers, then the performance of the bias circuit that adjusts drive current to fit the temperature in the core stack. All was good.

Next up was to look at the voltages on the terminating resistors at the other end of the driver circuit. The driver runs through a row or column of the 50 to 80 transformers in a matrix switch, but the other end of the winding on those transformers is completed to ground for only the one transformer that matches the current address.

Overall there are four drivers in the memory. Two are associated with the 50 transformers that produce the X select lines for the core planes, the other two are associated with the 80 transformers that produce the Y select lines.

We watched the pulses showing up at the terminating resistor for the specific address we were accessing, comparing the good German machine against the failing Connecticut system. We found that one of the drivers was not working. That means that an X or Y select line was never activated, because a row or column of a matrix switch had no current. The transformers in the matrix switch only work if both the row and column of that transformer have current flowing.

We verified this by borrowing the four driver cards from the German machine and demonstrating that the Connecticut machine now worked properly. We replaced the original Connecticut cards one by one until we were sure we spotted the one failing card.

Ken Shirriff dug into the card and found an open inductor that caused it to fail. He found a spare coil on a donor card and moved it over to repair the failed card. In the interim, other members of the team found a few spare driver cards.

Oddly, the spares had only a single transistor while the cards in the machines have a pair on each card. The card type is AQW or AKA; the two names are synonyms for identical cards. The schematics for the AQW/AKA card match the single transistor versions, not the cards in the machine.

The team decided to insert one of the new versions of AQW/AKA into the CT machine, while Ken was repairing the original type card. With the card in place, the power on button was pushed but the machine refused to sequence up.

We pulled out the new type card and put in Ken's repaired card, but the machine still refused to power up. No circuit breakers were tripped on the many power supplies in the machine. We used a multimeter and found that the -6V power supply was putting out less than a volt.

The power supply was pulled and tested in the workroom. We found one clearly shorted transistor and a couple others that were suspicious. These power supplies have six power transistors operating in parallel to handle the 12A maximum load this unit can handle. While it was open, we decided to replace all six transistors with new equivalents.

We found it produced -6V when we hooked it to the power mains, but we didn't put it under any load. Instead we put it back in the 1401 and tried to power up again. Still no luck and the -6V supply is again indicating about a volt. We ran out of time as the public demonstration of the systems was starting. Next week we will test the power supply more and hopefully get the Connecticut system fully back in operation. 

Tuesday, December 12, 2017

Detailed diagnostic work on failing 1401 memory, not yet repaired

IBM 1401 MEMORY PROBLEMS AT CHM

I went into the museum today to try to diagnose the memory problem afflicting the Connecticut 1401 system. The first 4000 words (the base core stack) are not working properly. No matter what pattern you attempt to write into that portion of memory, you get back all zeroes which triggers a parity error (Check error in IBM speak).

The memory stack consists of multiple planes of cores. Each plane has 4,000 locations each with a tiny circular core supported by wires through its center. Straight wires running in the X and Y directions are the primary support and are used to address an individual core by sending a current through one X and one Y wire simultaneously.

In addition to the straight X and Y wires, long wires are snaked through every core in the plane. A sense wire will detect if any core in the plane flips from one magnetic orientation to another, as that change of the magnetic field around the core induces a small pulse of current in the sense wire.

Each location in a 1401 consists of a character - binary coded decimal - constructed of the 1, 2, 4, 8, A and B bits. Another bit called the wordmark is used to delineate the end of variable length fields in the machine. Finally, the C or check bit is used to enforce odd parity for the word as a way of detecting errors in memory.
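To see why a location that reads back as all zeroes raises a Check error, here is a minimal Python sketch of odd parity over the bits named above. It assumes the wordmark is included in the parity count; the dictionary layout and function names are only illustrative, not anything taken from IBM documentation.

```python
# Bit names follow the text; the data layout is purely illustrative.
DATA_BITS = ("1", "2", "4", "8", "A", "B", "WM")
ALL_BITS = DATA_BITS + ("C",)

def check_bit(bits):
    """Return the C bit that makes the total number of 1 bits odd."""
    ones = sum(bits.get(name, 0) for name in DATA_BITS)
    return 0 if ones % 2 == 1 else 1

def parity_ok(bits):
    """True when the stored location, C bit included, has odd parity."""
    ones = sum(bits.get(name, 0) for name in ALL_BITS)
    return ones % 2 == 1

digit_one = {"1": 1}
digit_one["C"] = check_bit(digit_one)        # one data bit is set, so C = 0
all_zero = {name: 0 for name in ALL_BITS}    # what the failing stack returns

print(parity_ok(digit_one))  # True
print(parity_ok(all_zero))   # False: zero 1 bits is an even count
```

A location with no bits set at all can never be valid under odd parity, so every read from the failing stack is flagged.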

When a current runs down one of the X or Y wires, it is insufficient by itself to change the magnetization of the cores it runs through. However, the combination of the current from both an X and a Y wire is enough to flip the magnetic orientation. Thus, the current running down one X and one Y wire will not affect any core on either wire except for the one that is supported by both.
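The coincident-current idea can be sketched in a few lines of Python. The 4 x 4 grid and the threshold value are made-up numbers chosen only so that one wire's current falls below the threshold and two wires together exceed it; they are not real 1401 drive currents.

```python
HALF_SELECT = 0.5   # current on one driven wire, in arbitrary units
THRESHOLD = 0.75    # hypothetical switching threshold: above one wire, below two

def flipped_cores(x_sel, y_sel, n_x=4, n_y=4):
    """Return the (x, y) positions whose total current exceeds the threshold."""
    flipped = []
    for x in range(n_x):
        for y in range(n_y):
            current = 0.0
            if x == x_sel:
                current += HALF_SELECT   # this core sits on the driven X wire
            if y == y_sel:
                current += HALF_SELECT   # this core sits on the driven Y wire
            if current > THRESHOLD:
                flipped.append((x, y))
    return flipped

print(flipped_cores(2, 1))  # [(2, 1)]
```

Every other core on X wire 2 or Y wire 1 sees only half the current and stays put.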

When the current through the X and Y wires runs in one direction, the core it addresses is flipped so that the magnetic orientation is considered a binary 1 value. If the current through X and Y flows the other way, the core will flip to the opposite orientation which is considered binary 0.

There is no way to detect which way a core is magnetized in this scheme except to deliberately flip it to a known state. The sense wire will see a pulse if the core was previously in the opposite orientation, otherwise not. Therefore core memory uses a destructive readout method. The core is flipped to binary 0 and the sense wire detects if its prior state was 0 or 1.

We don't want a memory in which a given addressed core can be read only once, so the destructive read is followed immediately by a write cycle to restore the state of the core. Sending current through X and Y to set the core to 1 will restore its value, but if the core is supposed to hold a 0, then we don't want to flip it on with our current.

Enter the inhibit wire snaking through all the cores in a plane. If a current is applied to the inhibit wire in the reverse direction of the current flowing through the X and Y wires, then the sum of the currents is reduced below the threshold that will flip the bit to 1. We inhibit any core that we want to be a 0 after the write cycle, otherwise the write cycle flips them to 1.

The heart of the machine is a cycle determined by magnetic core speed. In the 1401, it is an 11.5 microsecond interval, where the cores are all flipped to 0 at the beginning of the cycle, the sense wires are used to set a register with the prior state of the core, then the second half of the interval is used to write the cores to 1 with the register controlling the inhibit wire.
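A rough Python model of that read-then-restore cycle for one addressed location, with one list entry per plane. The two fault flags are hypothetical additions of mine, included only to show how a failure in either half of the cycle would leave the location reading as all zeroes.

```python
def memory_cycle(stored, sense_works=True, inhibit_stuck_on=False):
    """Model one cycle; returns (register contents, cores after the cycle)."""
    # First half: every addressed core is driven to 0. The sense wire pulses
    # only in planes where the core previously held a 1.
    register = [bit if sense_works else 0 for bit in stored]
    # Second half: X and Y try to write 1 in every plane. Inhibit current is
    # applied wherever the register holds 0, so those cores stay at 0.
    restored = [0 if (inhibit_stuck_on or bit == 0) else 1 for bit in register]
    return register, restored

stored = [1, 0, 1, 1, 0, 0, 0, 0]   # arbitrary pattern across 8 planes

print(memory_cycle(stored))
# ([1, 0, 1, 1, 0, 0, 0, 0], [1, 0, 1, 1, 0, 0, 0, 0])
print(memory_cycle(stored, sense_works=False))
# ([0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0])
print(memory_cycle(stored, inhibit_stuck_on=True))
# ([1, 0, 1, 1, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0])
```

In the healthy case the register and the restored cores both match what was stored. With the inhibit stuck on, the first read still succeeds but the data is gone on every later read.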

To get the symptoms we are experiencing, all 4000 core locations of all 8 planes must either be written to 0 because the inhibit wire is active or get sensed as 0. This would need to be a single point failure to cause the problem - for example it can't be the X and Y driver cards as there is one per plane, routed through a switch that determines which X or Y wire the current passes through.

Similarly it can't be a sense amplifier failure or bad inhibit drivers, as there is one per plane for these also. What is common to all planes that could cause the entire stack to fail? We had to search for the cause.

We checked that all power supply levels were correct. We checked activation signals that cause the drive current to flow for a particular stack, 2000 character half of a stack, X and Y address. We tested the output of sense amplifiers; they were indeed 0. We did determine that the inhibit drivers were off for the bit that should be written to 1.

We tested one side of the driving current - the switched 8 x 10 array to select one of 80 parallel lines in each plane. We did see the current flowing in general but didn't look at the individual wire to see that it was different from the other 79.

We still need to check the other side - a switched 5 x 10 array that selects one of 50 parallel lines in each plane, these at right angles to the original 80 lines. Together these select the individual core in each plane which is the coincidence of one of the 80 lines and one of the 50 lines. We also need to verify that the direction is correct for X and Y lines.
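The arithmetic behind the two switch arrays can be checked with a short Python sketch. The mapping from an address to a particular pair of lines is hypothetical (the real 1401 decoding may assign addresses differently); the point is only that 80 lines crossed with 50 lines give 4000 distinct cores.

```python
LINES_A = 8 * 10   # lines selected by the 8 x 10 switch array
LINES_B = 5 * 10   # lines selected by the 5 x 10 switch array

def select_lines(address):
    """Map an address to switch coordinates (hypothetical assignment)."""
    if not 0 <= address < LINES_A * LINES_B:
        raise ValueError("address outside the 4000-location stack")
    line_a, line_b = divmod(address, LINES_B)   # 0..79 and 0..49
    # Each array picks a line by closing one switch per axis.
    return divmod(line_a, 10), divmod(line_b, 10)

print(LINES_A * LINES_B)    # 4000
print(select_lines(0))      # ((0, 0), (0, 0))
print(select_lines(3999))   # ((7, 9), (4, 9))
```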

The testing ran for many hours today, including a side by side comparison of the same signals on the correctly working German and the failing Connecticut 1401s. No differences were found so far and no anomalies. We will continue this on Wednesday.

Sunday, December 10, 2017

Put the new disk image through its paces on the HP 1000 - real machine quite slow

HP 1000 SYSTEM RESTORATION

Working with RTE IV B

The disk images run under the simulator have the bytes of their words swapped compared to the ordering in disk images on real HP hardware. The PC uses little-endian words while HP is big-endian. I ran the DD utility to swap the bytes back, creating a new disk image to run on my physical hardware.
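For anyone without dd handy, the same byte swap within each 16-bit word can be sketched in Python. This is the transformation dd performs with conv=swab; the function name is my own, and in practice the bytes would come from the image file rather than a literal.

```python
def swap_bytes(data: bytes) -> bytes:
    """Exchange the two bytes of every 16-bit word in a disk image."""
    if len(data) % 2:
        raise ValueError("image length must be a whole number of 16-bit words")
    swapped = bytearray(len(data))
    swapped[0::2] = data[1::2]   # odd-position bytes move to even positions
    swapped[1::2] = data[0::2]   # even-position bytes move to odd positions
    return bytes(swapped)

print(swap_bytes(b"\x12\x34\xab\xcd").hex())  # 3412cdab
```

Swapping twice returns the original bytes, so the same routine converts an image in either direction.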

All is not well. My image booted up but the system prompt showed up on the 2622A terminal, not the 2645A that it should. Perhaps I have the cables backwards - I will investigate. Even so, the system came up and I issued some commands to check out its functioning.

Time to compile and run my Algol test program. I have a procedure to handle the Algol compiler, which is an older bit of software that does not work properly when handed names of files. Instead, it needs to have the source set up as a logical unit and the output produced on a temporary disk area which can then be saved as a file.

It all ran well. I put the system through some paces, running various compilers and programs. The one thing that wasn't working right was the second console, the 2622A, but I think I pulled wires off the connector as I moved the cables to put the 2645A as the system console and the 2622A as the secondary.

Diagnosing bad 12966A BACI card

Preparing to use the logic analyzer to spot the problems with the serial card is complicated by space and power cabling concerns. Where I have the HP machine in my garage, the rear of the rack enclosure where I will access the IO cards is near the main garage door, leaving little room for access.

I have no space for a table to sit the logic analyzer, and the PC hosting the HP Drive emulator is sitting on top of the rack taking up what little space there was. Further, I don't have a power strip nearby with enough open sockets to handle the logic analyzer, PC and its monitor.

Once I sort out these logistics, which are the pesky issues remaining, I will still have to program the logic analyzer and locate a source for the system clock I will need to run the analyzer in clocked mode. 

Saturday, December 9, 2017

More RTE IVB exploration and refinement, PASCAL made operational

HP 1000 SYSTEM RESTORATION

Working with RTE IV B

I set up the BASIC interpreter on my system and configured it to be readied at startup. I feel that I have finally mastered the concepts behind programs and how they are stored and accessed in RTE, thanks in part to excellent explanations from Dave Bryan. 

When the BASIC system is generated, the LOADR program links together the relocatable code and any routines from libraries, producing the programs BASIC, BASC1, BASC2, BASC3, BASC4, BASC5, BASC6, BASC7, and BASC8. 

The numbered programs are overlays that are fetched as a means of dealing with a limited memory size. The program is segmented so that after one logical set of code is complete, the next segment is loaded and executed.

The LOADR has placed the code for those 9 programs in temporary space on a disk cartridge and built an ID segment in memory that defines the program and points to the disk based code. These ID segments are by default temporary, meaning they and the reserved disk space go away when the system is rebooted.

If a RUN command is issued for BASIC after loading, it finds the name BASIC in the temporary ID segments and executes it. Any call for an overlay will find that name in the temporary segments and cause it to be run. After a reboot, with no temporary ID segments, the attempt to run BASIC will result in a No File Found type of error.

A command :SP will store the code from a temporary ID segment into a permanent named disk file of type 6. If this is done for all nine program names, then BASIC and its segments are kept permanently over reboots.

However, a type 6 file is not kept in the ID segments, thus after a reboot the RUN command won't find the name BASIC there. RUN has been written so that if it searches the ID segments and a name is not there, it then looks for a type 6 file of the same name. The type 6 file is read to create a temporary ID segment which does point at the code in the type 6 file. 

Thus, RU,BASIC after a reboot will appear to run BASIC the same way it does after the initial generation with LOADR. However, it is only the name BASIC which is copied into a temporary ID segment. When the program tries to load and run one of its eight overlays, the name is NOT found in the temporary ID segments and a failure occurs. This in spite of the fact that the names are in type 6 files. 

Thus, for programs with overlays, such as BASIC, FTN4 and ASMB, the type 6 files for their overlays are not enough and a temporary ID segment must be created. A command :RP will find a type 6 file and create a temporary ID segment, the perfect solution. 

If you do an :RP for every overlay segment after a reboot, the temporary ID segments are set up. When the main program is executed with RU,BASIC, the main code is also put in the temporary ID table and every overlay is waiting in the table for when it is needed.

The convention for handling this is to make a file of the :RP commands, name it the same as the main program but with a forward slash (/) prefix. The /BASIC file is executed as part of the startup WELCOM file, recreates all the temporary entries for the BASIC system, and now it will run properly if called with RU,BASIC.
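The lookup behavior described above can be modeled with a toy Python class. This is only a sketch of the rules as I understand them, tracking temporary ID segments and type 6 files by name; the class, its method names and the result strings are invented for illustration and are not RTE internals.

```python
class RteModel:
    """Toy model of temporary ID segments and type 6 program files."""

    def __init__(self):
        self.id_segments = set()   # names with a temporary ID segment
        self.type6_files = set()   # names saved to disk with :SP

    def loadr(self, names):
        self.id_segments.update(names)       # LOADR builds temporary segments

    def sp(self, name):
        if name not in self.id_segments:
            raise KeyError(name)
        self.type6_files.add(name)           # :SP saves code as a type 6 file

    def rp(self, name):
        if name not in self.type6_files:
            raise KeyError(name)
        self.id_segments.add(name)           # :RP rebuilds a temporary segment

    def reboot(self):
        self.id_segments.clear()             # temporary segments do not survive

    def run(self, main, overlays):
        if main not in self.id_segments:
            if main not in self.type6_files:
                return "no such program"
            self.id_segments.add(main)       # RUN falls back to the type 6 file
        missing = [o for o in overlays if o not in self.id_segments]
        return "ok" if not missing else "overlay not found: " + missing[0]

overlays = [f"BASC{i}" for i in range(1, 9)]
rte = RteModel()
rte.loadr(["BASIC"] + overlays)
for name in ["BASIC"] + overlays:
    rte.sp(name)
rte.reboot()
print(rte.run("BASIC", overlays))   # overlay not found: BASC1
for name in overlays:               # what the /BASIC file does from WELCOM
    rte.rp(name)
print(rte.run("BASIC", overlays))   # ok
```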

In addition to the temporary ID segments, during system generation of RTE IVB, the administrator can load programs into the ID segments as permanent files. Those use some space on the disk cartridge which is not associated with any file name, but is pointed to by the ID segment. 

In addition to these disk based permanent programs, whose name is not visible in any directory list but which will execute with a RUN command, the administrator can load some programs permanently in memory. The code takes no space on disk at all (other than as part of the system to be loaded at bootup), does not show up in any directory listing, but will be executed by RUN.

Just to round out the situation, I should mention that when creating a program with the LOADR, an option can be entered to make it permanent. This sticks the produced code in one or more tracks on the disk cartridge, without a name in the directory, then makes a permanent ID segment which persists across boots.

This was confusing at first, when I could issue RUN commands for program names which did not show up in any directory. Further, you can see some programs stored as type 6 files, but if the overlay segments were not set up by :RP then the program would fail to run properly. 

Now, however, I grasp how it works. I can look at the ID segments through a special LOADR option - thanks to Dave Bryan for the tip about this. Below is the output of that on my system:

:RU,LOADR,,,,LI
  
  NAME  TY PRIOR LMAIN HMAIN LO BP HI BP  SZ   EMA MSEG  PTN  TM  COM S-ID
  
  
  PRMPT  1     1 26000 26532     4    11                          NC      
  D.RTR  1     1 26532 46655    11   257                          NC      
  R$PN$  2     1 24000 24760     2    14   2                  PE  NC     0
  LGTAT  2    10 24000 26272     2    55   3                  PE  NC     0
  T5IDM  2    10 24000 35450     2   220   6                  PE  NC     0
  WHZAT  2     2 24000 30605     2   150   4                  PE  NC     0
  ,,,,,  3    99 24000 30136     2   226   4                  PE  NC     0
  FMGR   3    99 24000 32367     2   112  14                  PE  NC     0
  HELP   3    99 24000 32576     2   103   5                  PE  NC     0
  LUPRN  3    99 24000 51500     2   422  12                  PE  NC     0
  LOADR  4    99 10000 41062     2  1274  28                  PE  NC     0
  EDIT   4    51 10000 30722     2   604  28                  PE  NC     0
  RT4GN  4    90 10000 32351     2   473  28                  TE  NC     0
  FMG01  3    99 24000 32367     2   112  14                  TE  NC     0
  ALGOL  3    99 24000 37530     2   466  15                  PE  NC     0
  FMG10  3    99 24000 32367     2   112  14                  TE  NC     0
  TG00S  5       24401 26543   265   335                      TE          
  TG01S  5       24401 26201   265   333                      TE          
  TG02S  5       24401 26124   265   322                      TE          
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
    <LONG  BLANK ID>
  FMGR0  5       32367 44036   112   377                      PE          
  FMGR1  5       32367 45035   112   513                      PE          
  FMGR2  5       32367 45504   112   630                      PE          
  FMGR3  5       32367 43553   112   322                      PE          
  FMGR4  5       32367 42744   112   426                      PE          
  FMGR5  5       32367 36217   112   255                      PE          
  FMGR6  5       32367 44600   112   510                      PE          
  FMGR7  5       32367 43611   112   277                      PE          
  FMGR8  5       32367 43477   112   365                      PE          
  FMGR9  5       32367 43477   112   274                      PE          
  FMGRA  5       32367 44604   112   404                      PE          
  FMGRB  5       32367 45046   112   424                      PE          
  EDIT0  5       31324 52607   604  1234                      PE          
  EDIT1  5       30722 50141   604  1115                      PE          
  EDIT2  5       33111 44043   604  1046                      PE          
  EDIT3  5       30722 36572   604   760                      PE          
  EDIT4  5       30722 40617   604  1113                      PE          
  ALGL1  5       37530 40174   466   471                      PE          
  F4.0   5       27356 33326   574   754                      TE          
  F4.1   5       27356 34150   574   755                      TE          
  F4.2   5       27356 30355   574   632                      TE          
  F4.3   5       27356 31645   574   715                      TE          
  F4.4   5       27356 34270   574   766                      TE          
  F4.5   5       27356 33254   574   734                      TE          
  ASMB0  5       24425 32310   437   571                      TE          
  ASMB1  5       24425 26663   437   571                      TE          
  ASMB2  5       24425 26555   437   537                      TE          
  ASMB3  5       24425 25352   437   454                      TE          
  ASMB4  5       24425 25755   437   454                      TE          
  RT4G1  5       33200 34547   473   570                      TE          
  RT4G2  5       33200 36700   473   733                      TE          
  RT4G3  5       33200 37143   473  1052                      TE          
  RT4G4  5       33200 36301   473   751                      TE          
  RT4G5  5       33200 36576   473   753                      TE          
  RT4G6  5       33200 36326   473   734                      TE          
  RT4G7  5       33200 36334   473   612                      TE          
  RT4G8  5       33200 34045   473   541                      TE          
  SWSG1  5       41542 42647   547   612                      TE          
  SWSG2  5       41542 47313   547   675                      TE          
  BASC1  5       15634 21524   211   462                      TE          
  BASC2  5       15625 26003   211   433                      TE          
  BASC3  5       15635 22215   211   503                      TE          
  BASC4  5       15626 30427   211  1007                      TE          
  BASC5  5       15644 26347   211   567                      TE          
  BASC6  5       15644 23515   211   503                      TE          
  BASC7  5       15646 23105   211   455                      TE          
  BASC8  5       15625 21352   211   424                      TE          
  
      17 FREE LONG IDS,      0 FREE SHORT IDS,    10 FREE ID EXTS 
  
   /LOADR:$END

The in memory entries are at the top, then the disk based ones are listed with either PE for permanent or TE for temporary. The TE entries were made by the /xxxxx files that issued :RP commands or by RUN,xxxx that copied from the type 6 files, while the PE entries were either loaded with the PE option or created during system generation.

The LGTAT utility program will identify disk space used both for permanent and temporary code that exists without a file name in the directory. Only the system (and optional auxiliary) cartridge can hold the system and the non-file disk based programs. Below is the output of that program on my system.

TRACK ASSIGNMENT TABLE       & =PROG ^ =SWAP  
  
TRACK     0      1      2      3      4      5      6      7      8      9    
   0    SYSTEM SYSTEM SYSTEM SYSTEM SYSTEM SYSTEM R$PN$& WHZAT& FMGR & FMGR1& 
  10    FMGR2& FMGR3& FMGR4& FMGR5& FMGR7& FMGR8& FMGR9& FMGRB& HELP & LUPRN& 
  20    LUPRN& LOADR& LOADR& LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  
  30    LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  
  40    LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  
  50    LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY   ENTS  
  60     ENTS  D.RTR  EDIT & EDIT & EDIT0& EDIT1& EDIT1& EDIT3& SYSTEM ALGOL& 
  70    ALGOL&   --     --     --     --     --     --     --     --     --   
  80      --     --     --     --     --     --     --     --     --     --   
  90      --     --     --     --     --     --     --     --     --   T5IDM^ 
 100     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 110     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 120     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 130     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 140     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 150     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 160     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 170     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 180     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 190     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 200     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 210     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 220     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 230     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 240     FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP    FMP   
 250     FMP    FMP    FMP    FMP   D.RTR       

We can see from the listing above that I have a few permanent programs on the disk:

  • EDIT and its segments (different from EDITR which is a type 6 file)
  • D.RTR
  • ALGOL
  • T5IDM
  • FMGR and its segments
  • WHZAT
  • LOADR
  • R$PN$
  • HELP
  • LUPRN

These persist from boot to boot. Some memory resident programs exist but won't be seen on the disk, for example PRMPT which issues the prompt to any online terminal and schedules the disk resident program R$PN$ to process the commands. There are no temporary programs on disk in this view, but after I loaded the KEYS program with the LOADR, this is the relevant snippet of the LGTAT listing.

  50    LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY  LIBRY   ENTS  
  60     ENTS  D.RTR  EDIT & EDIT & EDIT0& EDIT1& EDIT1& EDIT3& SYSTEM ALGOL& 
  70    ALGOL& KEYS & KEYS &   --     --     --     --     --     --     --   
  80      --     --     --     --     --     --     --     --     --     --   
  90      --     --     --     --     --     --     --     --     --   T5IDM^ 

We can see that KEYS is loaded here and also has a temporary ID segment allocated to it. If I were to issue a RUN,KEYS it would be executed, but a DL,KEYS won't show any file named KEYS in the directories of any online cartridge. 

:RU,KEYS

ENTER ONE OF THESE FUNCTIONS: [CREATE,MODIFY,OUTPUT,LIST] 
OR PRESS [RETURN] TO TERMINATE THIS PROGRAM:

The command :OFF,KEYS releases the temporary ID segment and also frees up the disk tracks holding the program. I had previously issued a :SP,KEYS command to copy the code to a type 6 file named KEYS. Now, issuing RU,KEYS will first check the ID segments, then find the type 6 file and create the ID segment before running the program again.

I then turned to installing the PASCAL compiler and runtime system. It would be the last of the languages and utilities I needed on my system. Unfortunately for me, I ran into a snag. One of the three parts of PASCAL to be generated (the PCLF loader file) takes up so many entries as it builds the temporary code that I run out of ID segment entries in the system.

I may have a temporary solution, however. I would need to blow away some of the temporary programs in the non directory area, then do the LOADR of PCLF, then get back those programs. I create many ID entries by running the /progname files from WELCOM, to convert type 6 files into temporary entries.

Since I only need to run the LOADR and FMGR to accomplish the generation of the PCLF program entries, I can strip out all the other temporary files manually. FTN4 or other programs that depend on this will no longer run until I reboot or rerun the :RP commands, but I won't need them until I reboot.

All that done, I reran the LOADR to generate from the PCLF file. Now, with enough ID segment space, I ran out of disk space. That is, the 28 tracks available on the system cartridge aren't large enough for all the code that will be stored temporarily during the LOADR session. 
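The 28-track figure can be cross-checked against the LGTAT listing shown earlier. The Python sketch below counts the unassigned tracks in the three rows copied from that listing, assuming that a "--" entry marks a free track. The simple whitespace split works for these rows only because none of their program names contain an embedded space.

```python
# Rows for tracks 70-99, copied from the LGTAT output earlier in this post.
LGTAT_ROWS = """\
  70    ALGOL&   --     --     --     --     --     --     --     --     --
  80      --     --     --     --     --     --     --     --     --     --
  90      --     --     --     --     --     --     --     --     --   T5IDM^
"""

def free_tracks(listing):
    """Return the track numbers shown as '--' (assumed to mean unassigned)."""
    free = []
    for row in listing.splitlines():
        fields = row.split()
        if not fields:
            continue
        base = int(fields[0])   # first column is the track number of column 0
        for offset, owner in enumerate(fields[1:]):
            if owner == "--":
                free.append(base + offset)
    return free

tracks = free_tracks(LGTAT_ROWS)
print(len(tracks), tracks[0], tracks[-1])  # 28 71 98
```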

The solution is to blow away my system cartridge 002 and rebuild it with the directory located at a higher track number, leaving more temporary space. At first I thought I would have to recreate pretty much every program if I did that, a task not worth contemplating as it would be many, many hours of work.

Instead, I discovered that the :IN command could reinitialize the directory in a new space, leaving all the contents below the directory untouched. What I did to accomplish this was:

  1. Copy all files on 00002 to the SC cartridge using :CO command
  2. Remove all the temporary ID entries pointing to the disk using :OFF commands
  3. Reinitialize with :IN,mpw,-2,00002,SYSTEM,150,2 where mpw was my master PW
  4. Copy all files from the SC cartridge to the new empty 00002 cart using :CO

I then reran the PASCAL step PCLF to generate the many segments of PASCAL, now that I had plenty of scratch space and lots of unused ID segments. Once it was complete, I converted all the entries it had created as temp ID segments into type 6 files, wrote a /PASCAL file to issue :RP for all overlays, and killed off the superfluous temp segments with :OFF commands.

I tested PASCAL, which worked fine, then modified my WELCOM file to run /PASCAL and rebooted as a final check. The compiler works fine. I entered a test program and ran it successfully as well. The last test was to update the WELCOM file and reboot. 

ARRGH. So close. I forgot to save one of the 20 segments of PCL as a type 6 file, therefore after the reboot I no longer had a working PASCAL environment. I shall have to go through the entire regeneration of the PCLF portion again, save just that one file, and then I will be back in business.

Another thing I learned is that PASCAL can't be loaded (via a startup file with :RP statements) because the 20 segments on top of everything else in the system will exhaust the pool of ID segments. Instead, I have to run a file to unload BASIC segments and load PASCAL in order to use it, then reverse out the PASCAL after I am done. 

The longer term solution would be to generate many more ID segments into the RTE system. I will put that on hold for a later system generation effort.