Friday, July 31, 2015

Still battling sector read logic in my controller for the Pertec D3422 disk drive


I implemented the merged single FSM this morning and did the test of it at lunchtime. I found some errors with signals starting out at the wrong level - fixed with appropriate startup logic - and can see that my synchronizer is not working properly. I went in to ponder, research and set up diagnostic instrumentation.

Later I ran some tests and it appears my problem may be while resynchronizing for the data field, after I have successfully read the five bytes of the header field. Time to reexamine my assumptions and do some digging.

Looking at the documentation, you would believe that the preamble and the zero bits between header and data fields are both greater than 200 bits long, but when I look at the assembler listing for the controller, I see that the gap between header and data is only a fourth the size of the preamble.

This causes the problem, since I insist on 80 bits of zeros before I will check for an xFF (sync byte). There aren't enough bits of zero to satisfy that first step in the synchronizer. I will fix this with an additional input parameter, InGap, that will shorten the target count when resynching between fields.
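To make the InGap idea concrete, here is a minimal Python model of the synchronizer step, not the VHDL on the board. The 80-bit target is the one mentioned above. The 20-bit gap target, the name find_sync and the in_gap argument are assumptions chosen only for illustration.

```python
PREAMBLE_ZEROS = 80  # zero bits required before a sector preamble sync (from the text)
GAP_ZEROS = 20       # assumed shorter target for the header-to-data gap


def find_sync(bits, in_gap=False):
    """Return the index of the first bit after the xFF sync byte, or None."""
    target = GAP_ZEROS if in_gap else PREAMBLE_ZEROS
    zeros = 0  # length of the current run of zero bits
    ones = 0   # one bits collected toward the sync byte
    for i, bit in enumerate(bits):
        if bit == 0:
            zeros += 1
            ones = 0  # a zero breaks any partial sync byte
        elif ones > 0 or zeros >= target:
            ones += 1
            zeros = 0
            if ones == 8:
                return i + 1  # byte assembly would start here
        else:
            zeros = 0  # stray one bit before enough zeros: start over
    return None


short_gap = [0] * 40 + [1] * 8
print(find_sync(short_gap))               # never satisfies the 80-bit target
print(find_sync(short_gap, in_gap=True))  # succeeds with the shorter target
```

With the full-length target, a 40-bit gap never arms the sync byte check, which matches the stall seen after the header. The shorter target lets the same logic lock onto the data field.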

The logic that finds the desired sector is working perfectly and I was successfully reading the five bytes of the header, but then the machine is stalling. After quite a few changes trying to find the issue and fix it, my logic no longer reads the first five bytes. Time to look over the sync and assembly functionality for possible redesign.

Thursday, July 30, 2015

Extensive diagnostics working, seek transactions fully debugged on Pertec drive controller


Testing resumed at midday on the Pertec controller logic. I don't know why I tied the counter chip reset to ground, since it resets on positive signals. I guess I thought they were inverted reset lines. Swapped the line from ground to +5 and now the timer is okay.

I did however pick up the correct chips and have carefully removed the existing chips and installed sockets, into which I will place the new 7400 and 7493 chips once I validate that I have everything properly hooked up.

To improve my instrumentation, I set up some 'registers' that I can look at from the PC, which show me the state of all the FSMs and key signals inside the machine. I keep a pair of registers for each set of signals - one is the current, instantaneous value and the other shows me if that FSM state or signal value has been reached anytime since I reset the fpga board.

It has already paid off in flagging conditions which shouldn't occur but that I might not have suspected enough to bring out on the four LEDs and four dynamic pins for investigation on the scope.
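The paired-register scheme can be modeled in a few lines of Python. This is only a sketch of the concept. The class name StatusPair and its methods are hypothetical, and the real registers live in the fpga logic.

```python
class StatusPair:
    """One 'current' register plus one 'ever seen since reset' register."""

    def __init__(self):
        self.current = 0  # instantaneous value of the signal bits
        self.sticky = 0   # OR of every value sampled since the last reset

    def sample(self, signals):
        """Record the signal bits as they are on this clock tick."""
        self.current = signals
        self.sticky |= signals

    def reset(self):
        """Model a reset of the fpga board."""
        self.current = 0
        self.sticky = 0


pair = StatusPair()
for value in (0b0001, 0b0100, 0b0000):
    pair.sample(value)
print(f"current={pair.current:04b} sticky={pair.sticky:04b}")
```

The sticky register is what catches a state that was entered only briefly, long before anyone looks at the current value.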

The seek transactional logic is nice and solid, but my sector read and display isn't right yet. I see the write enable signal which would store data words in the FIFO. I see the reset at the start of the read. However, it seems to stay at a count of zero. I need to be scoping the movement through various states to figure out what is going wrong.

Since it was late and testing was over, I was looking at the design to see whether I saw any errors or vulnerabilities. There are several separate state machines that interact, triggering others or waiting at key points to synchronize with another. I came up with a way to pull everything together into a single state machine, which will eliminate some risks and complications. I think it will be much more straightforward and as a result it should be more reliable.


I received some of the parts to get my clock back in operating mode. I should have the last part in three or four days.

Wednesday, July 29, 2015

This and that on 1401 and Pertec drive

I spent time at the Computer History Museum helping to work on a problem with one of the two 1401 systems - spurious validity checks when the card reader encounters a group mark character. This is quite a devious problem. We haven't gotten to the bottom of it yet, but did solve another issue where a shorted transistor pulled an output line that should swing between -6V and +6V down to -12V instead. That transistor was replaced on its SMS card but the wider validity check issue remains.


It appears the 7493 counter chip for the +5V Timer Board might be the actual bad one, since the counter is now ignoring its reset lines and is counting up to shut down the drive after 5-6 minutes. Since the reset line is hard wired to ground, it should stay in reset.

I did only a couple of short tests tonight, but will get back to testing tomorrow when I have more time.

Tuesday, July 28, 2015

fpga controller for Pertec drive now communicates with PC; diagnosed problem with 5V timer board


While I was debugging the general logic that I built to fire off a read of a target sector into the FIFO and then display it sequentially as bytes on the fpga board seven-segment displays, I decided to add in some additional logic to properly deal with the separate header and data fields in each sector.

The issue is that the drive resynchronizes between the two fields with a preamble of a couple hundred bits of zero, which means that I have to stop my byte assembly and resync to get the correct byte boundaries of the data field. Initially I just held the boundaries I established for the header field.

I know the byte count of a header field (and of a data field) - 5 and 257 respectively - following the preamble and sync byte for each. That was built into the FSM for reading so that it drops sync after storing byte five of the header, then begins again with the first payload byte of the data field.

Testing at lunchtime zoomed in on a couple of problems, which I corrected. By the end of my lunch hour I had the logic reading the five header bytes (cylinder high byte, cylinder low byte, combined platter/head/sector, CRC byte 1 and CRC byte 2).
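As a rough illustration of how those five bytes break down, here is a small Python decoder. It assumes the layout described elsewhere in this log: the third byte carries the head in its three most significant bits, stored in ones-complement, and the sector in the low five bits. The function name parse_header is hypothetical, and the CRC bytes are returned unchecked since the CRC algorithm is not covered here.

```python
def parse_header(raw):
    """Split a five-byte header into cylinder, head, sector and raw CRC bytes."""
    if len(raw) != 5:
        raise ValueError("header must be exactly five bytes")
    cylinder = (raw[0] << 8) | raw[1]   # high byte, then low byte
    head = (raw[2] >> 5) ^ 0b111        # top three bits, stored inverted
    sector = raw[2] & 0b11111           # low five bits
    return {
        "cylinder": cylinder,
        "head": head,
        "sector": sector,
        "crc": (raw[3], raw[4]),        # not verified in this sketch
    }


# Cylinder 256, head bits 111 (head 0 once inverted), sector 0; CRC bytes are placeholders.
print(parse_header(bytes([0x01, 0x00, 0xE0, 0x00, 0x00])))
```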

I still have something zeroing out my FIFO so that I can't find any contents even though I received the five byte pulses that should have been stored. It may be my reset action between header and data fields, or it may be when I stop at the end of the sector. I will re-instrument and check again in the late afternoon.

I began designing a simple transactional communication method that would use the Digilent USB link and the Adept utility. With that in place, I should be able to drive this from the PC, setting up cylinder and sector targets, commanding seeks and reads, plus requesting bytes from the FIFO.

With that all prepared, I went out to the workshop and did some testing. I still have the undesirable behavior of the +5V Timer Board, powering down the drive after 6 minutes. This should only occur when the drive is idle and spun down (safe light on). I will need to do some circuit testing as this timeout is pretty annoying, introducing a whole cycle of stop and start which wastes minutes.

I also have communication between the PC and fpga board. I can command a read or seek and I see the FSMs kicking off, but they stall. While I have some defects, the ability to issue commands and check signal status via the PC is a great convenience.

I suspect either the 7400 or 7493 chip on the timer board - this is the logic that holds the decade counter in reset so that it never times out to shut off the main +5V supply. The 7400 is an extraordinarily basic (quad NAND) chip that for some reason I don't have in my pile of chips. The fact that I am missing the 7493 counter is less surprising but I do need that chip as well. My only option to grab those tonight, Fry's, is a washout as they don't pretend to carry the 7400. They stock various other slightly incompatible family versions, e.g. 74ls00, 74hc00, and 74hct00.

I will drive to Anchor tomorrow morning to pick up the actual parts I need. Tonight I lifted the lead off the board from the driver (7400) chip that would hold the counter chip in reset, instead tacking a connection to ground to that circuit. I ran the drive for ten minutes without the timer shutting me down, so that makes the 7400 the suspect chip of the two, but I will pick up both as a preemptive measure.

Before I remove the bad chip, I took my remaining time tonight to debug my fpga controller logic a bit further. I could see that my read command was accepted and the main logic thought that it kicked off the sector read logic to search and match the target sector. However, something isn't correct in the sector read logic so my instrumentation moved to trace that portion of the logic.

Now with the next set of tracing, I see the read sector logic finding the proper sector number and turning on the final status where it is reading incoming bytes - however, the intermediate step which fires off the synchronizer seems not to occur - the scope wouldn't trigger on this status. 

New instrumentation is in place but it is late tonight, so I will postpone further testing until tomorrow.

Monday, July 27, 2015

Debugging read/display logic in fpga based controller for Pertec D3422 disk


The fpga board has four seven segment display characters. I use the last two to display the hexadecimal value of the current byte, while the first two record the hexadecimal address of that byte. A button allows me to start at the first character and to step through one byte at a time. I can set up a five bit address in the slide switches on the board, with that set as the sector or the current cylinder by pushing either of the two remaining buttons. This should be sufficient for my initial testing.

I received confirmation from Mike and Martin, two Altair Pertec hard disk restorers, that the Altair controller wrote the head value in ones-complement, just as I saw when reading the tracks.

Another change I introduced was to pulse the Start/Stop line active for only 1 us, doing this about 100 times a second, each time resetting the timer in the +5V Timer board so that it will not shut down the drive.

Ultimately, I will emit the pulsing only when the controller is executing some command. That way, when there is an extended period where the drive is really idle, the board will spin down the drive and shut off most of the power consumption.

I worked through the debugging of all the new or changed logic in the fpga, then fired up the system for the first tests in the early evening. The first problem I found was that the Start/Stop pulsing idea was a non-starter. It attempted to start the disk, but wasn't long enough to complete the latch for spinup, so then it timed out and went back to safe just in time to receive the next short pulse.

I wasn't sure what was working or not working properly, since the overall function didn't seem to give me a number of bytes to display. When I push the button to advance to the next sequential byte, the byte counter leaps to the maximum value.

After some adjustments, I fixed that issue but wasn't sure the machinery was working properly. This required setting up a number of diagnostic outputs - steady state conditions can be displayed on the 8 LEDs on the board, but four of those also had an external connection where I could hook the scope to capture pulses or short conditions. Thus, four steady state diagnostic outputs and four outputs that could be either very short/dynamic or steady.

Digging through complex mechanisms with many states, tracking key signals, and matching them to the real time disk signals - all this takes time. I exhausted the time today and have more to do tomorrow. 

Sunday, July 26, 2015

Reading headers for sector zero on all platters, heads and cylinders


My improved logic for reading is now cleanly finding sector zero and the sync byte is clearly visible after the preamble of many zero bits. I looked into my synchronizer logic to figure out why it wasn't working properly. It should kick off the byte assembler which takes all the subsequent bits and packages them in bytes with a notification for the consuming circuit as each is ready. However, that isn't occurring.

Synchronizer byte followed by cylinder head and sector value from sector zero
The format used on this disk, which is likely the format used with Altair computers back in the dawn of personal computing, is very different from that used with the 1130. The drive interface itself is pretty similar and part of my logic will make the transition, but quite a bit will have to change.

The Altair disk has 24 sectors per rotation, with 256 bytes of data in a sector. After a preamble of a couple of hundred bits of zero to match the clocking of the recorded data, there is a sync byte (xFF) that defines which bit is number 0 in a byte, so that the stream of bits can be divided properly into bytes. The first bit encountered is the least significant bit of its byte.
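Once the sync byte has fixed the byte boundary, assembling bytes least significant bit first is straightforward. The Python below is a minimal model of that step, with the hypothetical name assemble_bytes. It assumes the bit list starts exactly at a byte boundary.

```python
def assemble_bytes(bits):
    """Pack a bit stream into bytes, first bit received = least significant bit."""
    out = []
    # Step through whole groups of eight; leftover bits at the end are ignored.
    for start in range(0, len(bits) - 7, 8):
        value = 0
        for pos in range(8):
            value |= bits[start + pos] << pos
        out.append(value)
    return out


# The sync byte followed by one byte whose first bit on disk is a 1.
stream = [1] * 8 + [1, 0, 0, 0, 0, 0, 0, 0]
print([f"{b:02X}" for b in assemble_bytes(stream)])
```

A byte whose first recorded bit is 1 and whose other bits are 0 comes out as x01, not x80, which is the practical meaning of the least-significant-first rule.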

In the Altair format, after the sync byte, there is a header which records the cylinder, head and sector of this record. It has a CRC checksum of two bytes. There is another preamble of a couple of hundred zeroes following the header, then the data portion of the sector begins with a sync byte of xFF. The data field consists of 256 bytes of data, a two byte CRC and is followed by a couple of hundred zeroes.

The 1130 format disk has 4 sectors per rotation, with 321 words of 16 bits in a sector (642 bytes). After a preamble of a couple of hundred bits of zero to match the clocking of the recorded data, there is a sync word. Each word on the disk has its individual ECC bits at the end, thus a recorded word is 20 bits long. The sync word is 00000000000000011110 which is x'01' with its ECC. The first bit encountered in a word is the least significant.

In the 1130 format, after the sync word there are 321 of the twenty bit words that comprise the sector. No separate header field is used. The first of the words contains the relative sector number but that is purely a convention of the DMS2 operating system, whereas the controller treats all 321 words as the data field. There is no CRC since each word was protected by its own ECC.  More zeroes follow the end of the data field, continuing until the next sector pulse.

On the Pertec drive, the sector count (sector number) from the disk drive changes to 0 at roughly 2.4 us before the sector pulse leading edge. Thus, when the index pulse and sector pulse sequence completes, the drive heads are at the beginning of sector zero. The sector counter is leading the sector pulses on the Pertec drive.

On the 1130 drive, the index and sector pulses occur 180 degrees away from the read/write head location. Thus, when the index and sector pulses occur, the heads are at the beginning of sector 3, the prior sector. The sector number reported is 0. The controller has to wait until the following sector pulse, while the sector number field is just about to switch from 0 to 1, for when it begins read or write operations for sector zero. Thus, the sector number is anticipatory, used by software to issue a read or write for the upcoming sector.

With the Altair controller on the Pertec drive, the user issues a command to read or write a specific sector, with the controller doing the match to the sector number as the trigger to read or write. This is unlike the 1130, thus the sector number is not anticipatory on the Pertec. Instead, it indicates that the sector pulse trailing edge demarcates the beginning of this particular sector number.

I had the synchronizing and byte assembly working fine by the mid-afternoon. Decoding the bytes as they were read from the headers, I did see exactly what I expected from the cylinder field and the sector portion of the sector/head byte. I forced the arm to three locations - cylinder 0, cylinder 3 and cylinder 256 and found the proper value being read for each of them.

The format of the sector/head byte is that the three most significant bits encode the head, while the remaining five bits encode the sector number. I selected the four different heads in turn and looked at the head bits. They were 111, 110, 101, and 100; however, according to the Altair controller documentation these are not the right values. In fact, they should be the inverse - 000, 001, 010 and 011.

Pattern of 111 (plus sector 00000) in left byte is inverse of expected 000
Setting that issue aside for more investigation - such as looking through the source code of the Altair controller board - I moved on to building out my logic for reading. I generated a FIFO buffer to store bytes as they are retrieved from disk and then fetch them for display or other use.

The fpga board has four seven segment display characters. I use the last two to display the hexadecimal value of the current byte, while the first two record the hexadecimal address of that byte. A pair of buttons allow me to start at the first character and to step through one byte at a time. This should be sufficient for my initial testing.
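The display stepping can be pictured with a small Python model. The class name FifoViewer and its methods are hypothetical and stand in for the button and display logic on the board. Showing only the low eight bits of the address is an assumption that follows from having just two hex digits for it.

```python
class FifoViewer:
    """Step through stored bytes, showing 'AAVV': hex address then hex value."""

    def __init__(self, stored_bytes):
        self.data = list(stored_bytes)
        self.addr = 0

    def display(self):
        if not self.data:
            return "----"  # nothing was stored
        # Two digits of address (low 8 bits only) and two digits of data.
        return f"{self.addr & 0xFF:02X}{self.data[self.addr]:02X}"

    def first(self):
        """Button 1: return to the first stored byte."""
        self.addr = 0
        return self.display()

    def step(self):
        """Button 2: advance one byte, stopping at the last one."""
        if self.addr < len(self.data) - 1:
            self.addr += 1
        return self.display()


viewer = FifoViewer([0x01, 0x00, 0xE0])
print(viewer.first(), viewer.step(), viewer.step())
```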

I started to update the control logic of the board for this read testing, removing the seek testing since everything looked perfect with that functionality. Every cylinder I went to had the correct cylinder number in the header record and read cleanly. 

I did take another (amateur quality) video of the positioner testing, now that I can loop seek patterns as well as throw in some singletons. Another video of positioner activity this time with no talking.

Saturday, July 25, 2015

Seeking okay now verifying reading functionality of Pertec D3422 drive


I began to suspect that my problem had more to do with when I deasserted the Seek Strobe, relative to the address lines changing, since the drive acts on the trailing edge of the strobe signal. I made some changes to the FSM based on that suspicion and gave it a quick test last night.

Video of drive under control of fpga testing controller - watch to see it spin up, seek out and back.

The mechanism was now seeking out and in with the half-range (out 203 cylinders and then back to home) which corresponds to a 2310 disk drive capacity. This morning I ran some more tests of different seek amounts including looping on 203 and 406 cylinder long seeks, and some very short seek loops. It all worked well - jumping between 0 and 405, small range seeks, looping seeks. I also validated that I get the Address Interlock signal (invalid seek address) with an address beyond 405.

I invested my morning in writing the initial code to handle reading from the disk. The data is encoded with a self clocking scheme similar to that used on tape drives, where gaps between sectors and fields are uniformly magnetized but all data is recorded with a clock at double the data rate. My extractor circuit signals me when each bit is available and makes the value available.

The self clocking scheme means that any pattern of data, whether the bits are zero or one, all start with a flip of the magnetic field to mark the clock and then a second flip only if the data will be a logical 1. If the data is zero, the second time interval has no flip. Each pair of intervals, the clock pulse and then the data-dependent second pulse, is called a bit cell.
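The bit cell scheme can be modeled in Python as a list of half-cell intervals, where 1 means a flux flip occurred in that interval and 0 means it did not. This is a sketch of the encoding only. The function names are hypothetical, and the real separation of clock from data is done by the drive electronics rather than in software.

```python
def encode_cells(bits):
    """Turn data bits into half-cell flip flags: clock flip, then data flip."""
    halves = []
    for bit in bits:
        halves.append(1)                # every cell starts with a clock flip
        halves.append(1 if bit else 0)  # second flip only for a logical 1
    return halves


def decode_cells(halves):
    """Recover data bits, assuming the list starts on a clock interval."""
    bits = []
    for i in range(0, len(halves) - 1, 2):
        if halves[i] != 1:
            raise ValueError("missing clock flip: not aligned to a bit cell")
        bits.append(halves[i + 1])
    return bits


cells = encode_cells([0, 1, 1, 0])
print(cells)
print(decode_cells(cells))
```

A run of zero bits encodes as alternating 1, 0 pairs, so only the clock flips are present and there is no ambiguity about which interval is the clock.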

Each recorded area on a disk begins with a preamble of a couple of hundred zero bits. This is a pattern with the clock pulse and the absence of the second data pulse. This allows the receiver electronics to synchronize with the pulses and know whether a pulse it sees is the clock or the data pulse.

The 200 bits allow it to get synchronized so that the electronics in the drive can split out the clock signal and the data signal. My fpga receives a steady uniform set of pulses on the Read Clock line and during the second half of each clock cycle, if there is a pulse on the Read Data line, then we record the value as a logical 1. No pulse from Read Data in the second half of a clock cycle and we read this as a logical 0.

Following the string of zero bits is a specific pattern that is recognized by the controller, which so far is synchronized only on bit cells and clocks, but not on byte or word boundaries. The specific pattern defines which bit cell is the beginning of each byte/word of data. Following the special pattern is whatever data format was recorded, ended with an error checking Cyclic Redundancy Check of 16 bits and another specific pattern marking end of the record.

The Pertec drives have an initial set of data that record the cylinder, head and sector number of this record. It should certainly match the cylinder to which we did a seek and the platter, head and sector from which we are reading. After the CRC for the header, there is an erased gap, another 200 bits for clock synchronizing, another specified pattern for byte synchronizing, 256 data bytes, then the CRC and end character. This happens 24 times around the track.

IBM 2310 drives don't have a formally separate header. They use the same preamble of 200 bits of zero, a specific pattern for byte boundary synchronizing, but then have 321 16 bit words in a monolithic sector, capped with a CRC field and end character. By convention the first of the 321 words contains the cylinder/head/sector information, as a relative sector from the start of the pack, leaving the remaining 320 words of the sector for data. The 2310 drive only provided four sectors around a track.

I began my logic design with the logic to convert the Read Data and Read Clock inputs into a stream of serial bit values (extractor circuit). Another bit of logic recognizes the byte boundary synchronizing character (synchronizer circuit). A third bit of logic then takes a serial stream of bits and turns it into parallel bytes (assembler circuit). A higher level state machine would move from gap (idle) state to preamble to byte synch and then store data bytes in a memory on the board, but I haven't designed that yet.

When I tried to test this I quickly ran into the problem that unless I was immensely lucky, switching on the read enable before or in the midst of a preamble, I would run into a character that was neither 0x00 nor 0xFF, triggering a sync error. This leads to two changes - first, I need to recover from sync errors rather than lock in the status, and second I need to synchronize the start sync operation with the beginning of a sector.

Initially, I set up the controller to look for sector zero and read that - index pulse triggers the FSM and then I enable the read electronics, begin synchronizing and let it rip. The synchronizing logic looks for a string of at least 100 zero values, then eight sequential 1 bits, after which it should be assembling bits into bytes and flagging their availability.

When I see the sector counter change from zero, I will turn off the read electronics which stops the flow of clock and data pulses. I didn't check CRC or handle the gap and data record. I only cared about the header and its confirmation that I was reading the proper sectors.

In my testing, I discovered that the +5V Timer Board feature will power down the drive after 6 minutes, whether or not I am doing seeks or reads. I needed to figure out what signal states are needed to keep this from happening. It turns out that the normal practice - drop the line to activate it, hold it for a millisecond or so, then return it to high - allows the timer board to count down and shut off.

My logic now has to drop the start/stop line, keep it low, then if I push the button again to shut down, it will have to return the line to high for an interval, pulse it down, then leave it high. It adds a few stages to the FSM. After testing, I still have problems. If I hold the line low after the drive starts, it locks out the button on the front of the drive. If I leave the line high, the timer board shuts things down in six minutes. This is being punted until tomorrow, as it is lower priority than completing the read testing.

I can see my data recovery working, but something is going wrong on the path to assembled bytes. To debug, I routed several of the key signals out where I could watch for them on the scope. My extractor logic is working perfectly - when I turn on reading, the data values begin to fly across the wire.

I could see that my method of detecting the start of sector is inadequate, thus I had to improve it in order to definitively locate and start reading at sector zero. I thought about it for a while and updated the FSM accordingly.

Now I am clearly oriented to the beginning of sector zero. I see the bits being extracted, a run of zeroes followed by an all-ones byte (xFF), which should kick off my byte assembler but it isn't. The synchronizer logic is where the fault lies, so I will ponder that a while.

My seek loop would stall after many repetitions - the fault appears to be in my testing FSM which is waiting for busy to blip up and down. I suspect that some conditions occur where the rise has already happened before I test for it. 
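The suspected race can be illustrated with a short Python model. This is only a sketch of the suspicion, not a confirmed diagnosis. Both function names are hypothetical, busy_samples stands for the busy line sampled once per clock after the command is issued, and start is the clock at which the testing FSM begins watching.

```python
def saw_pulse_by_polling(busy_samples, start):
    """Wait for busy high and then busy low, looking only from 'start' onward."""
    seen_high = False
    for level in busy_samples[start:]:
        if not seen_high:
            seen_high = level == 1
        elif level == 0:
            return True
    return False  # the rise was never observed, so the FSM stalls


def saw_pulse_with_latch(busy_samples, start):
    """A latch running from the moment of the command remembers any rise."""
    latched = False
    for i, level in enumerate(busy_samples):
        if level == 1:
            latched = True
        if i >= start and latched and level == 0:
            return True
    return False


samples = [0, 1, 1, 0, 0, 0]  # busy blips up and back down early
print(saw_pulse_by_polling(samples, start=3))  # rise already over: stall
print(saw_pulse_with_latch(samples, start=3))  # latch still caught it
```

If the FSM starts watching late, level polling misses the whole blip, while an edge latch armed at command time does not depend on when the FSM looks.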

Thursday, July 23, 2015

FPGA controller controlling arm on Pertec disk drive, but needed change to cabling for reliability


The interface on my drive is defined by the model number as "special", which means it doesn't match the schematics in any Pertec document and more importantly, it doesn't make use of the terminator power pin on the main connector to the interface board.

The drive produces 3.3V as pullup power for the terminator resistors, but this interface board ignores that. Instead, it feeds terminator power to the interface on pins 1 and 2 of one of the two 50 conductor cables. I shall have to do the same, feeding from my fpga board.

It was time to bring up the fpga board and check out the voltage levels on the incoming signals, in addition to powering up the drive itself. I proceeded to fire up the drive and then use the fpga board to command a spin-up. The voltage checks without power produced confusing results and I had to do some troubleshooting. It turns out that the ground of the interface is also not connected to the drive ground so I had to include that wiring.

Once the terminator was pulling voltages up and I was receiving reasonable status, I began to test out the functionality. The results were good but not perfect for this first run. Successes:

  • A button push would command spin up or shut down of the drive
  • When I set up a nonzero cylinder address and hit the seek button, it moved the arm
  • When I pushed the restore button, it brought the arm back to cylinder zero position
  • The Unit Ready signal came out as the drive came ready
  • Write Protect lights and extinguishes as I change the drive write protect switches
Failures to fix up include:
  • Write protect status appears on the wrong led compared to what I expected
  • I don't see the sector and index pulses on the LED, which makes sense given their short duration and high frequency
  • I could see that one pair of the switches for entering a cylinder address are swapped positionally
  • Emergency Unload does not happen at my button push even when held low for a long period
  • my logic to display the sector count register appears defective
  • sometimes the seek or restore buttons didn't work, other times they did
I can get erratic status led illumination if I wiggle the connectors onto the fpga board, so these are not of sufficient quality. I have female connectors on my cable and the fpga board has female connectors, so I was relying on some male-male adapters but they fit loosely in the connectors. This may work for initial checkout but I need a more reliable cabling scheme for when I begin to use the drive in earnest.

Looking at the signals I use, versus irrelevant ones (like double density, double or quad platter because I know what I want), plus dropping the busy and select lines for drives 2, 3 and 4 that I won't have connected, leaves me with exactly 39 interface signals to receive or drive.

That is a fortuitous number, because it fits within the 40 I/Os of the FX2 extension board I have on the fpga board. This extension board will allow me to solder all the wiring onto the board, rather than rely on flaky connectors and adapter pins.

I will begin rewiring the interface all to the extension board, dropping signals I will ignore, and documenting the changes so I can update the fpga logic appropriately. This is a must-do for today before I try to accomplish more testing. I also had a wire snap off one of the connectors, which is inevitable with a rickety ad hoc cabling. 

Investigating my problem areas from the first test, I found and corrected a few of the issues. Others will take more testing, particularly the state machines for my command buttons and the sector count display logic. 

I put a voltmeter on the signal for Emergency Unload, which should be pulled up to +3 but then will be yanked down to ground when I drive the fpga output pin to logic level 0. It instead only drops to a bit under 2V when my signal is asserted. This means something else is holding this line up, which I have to diagnose within the disk drive. The wiring from the schematics shows this signal hooked directly to the input of an inverter, a section of U50 on the logic board, so that is where I need to look. 

By dinner time, I had soldered in 22 of the 39 signals, those from one of the two 50 conductor cables, and went back to the workshop afterwards to finish the last 17. I will then have to update my documentation and the vhdl code to pick up the signals on these new pins. The power on test showed it was correctly set up.

The sector number display is whizzing through the numbers 00 to 23 that occur on each rotation. This validates the logic to see the index and sector pulses, plus the counter that tracks which sector is under the heads at any moment.

It appears that a Restore Initial Cylinder wouldn't work if I had a non-zero address in the cylinder address switches. On investigation, the reason is that I am violating setup times for that signal. A seek strobe is issued essentially simultaneously with setting that restore line, whereas it needs to be settled before the seek strobe occurs. I made changes to my state machines to correct this.

The problem that is still open involves the emergency unload signal, as I mentioned above. I set up the voltmeter to monitor what happens to that line when I try to drag the output pin of the fpga down to zero. I may have a bad chip or two in that path. What I see is the input to the first inverter stays at full high (3.29V) regardless of my attempts to pull it down.

It behaves as if something is doing a hard pullup of the pin on the logic board, so that my attempt to drop it simply divides the voltage across the resistors in the circuit from the fpga to the logic board inverter. This could be a failed inverter that is outputting on the input pin, but I suspect this is in my special interface board somewhere.

I exercised the positioner quite a bit tonight, moving it small and large amounts in both directions. My corrected logic for the Restore Initial Cylinder function now works - it moves the arm slowly back instead of the jump that occurs if it were a seek to cylinder zero.

Tonight and tomorrow I will change the controller to execute some programmed sequences of seeks, plus I will set up to switch between the four surfaces (two platters, two heads for each). That will be the last of the commands I can try that don't involve reading or writing on the disk.

It will also have a switch to turn on the read enable line, to see if it attempts to read data. I can set up a scope to watch what streams in from the heads. The fpga board has plenty of onboard memory capacity, so that I could read in the bitstream and begin analyzing it.

Wednesday, July 22, 2015

Matching timing to needs of the Pertec drive and cleaning up my controller logic


This morning I sorted out the problem with the Activate Emergency Unload signal, which now immediately retracts the heads and spins down the disk, as it should. I will move on and update the controller logic to accomplish several patterns of seeks to exercise the drive, plus begin watching to see if the drive will attempt to read data from the disk.

At lunchtime, I whipped up the new controller logic and began testing. The two looping patterns keep the drive busy, with the seek lamp partly illuminated, so I know that the commands are issued rapidly. What I don't see is the motion, so I have a flaw in my state machine or setup of the address lines.

After a short look at the Pertec documents I realized I am not meeting all the setup and stability requirements, which are a minimum of 0.5 us and sometimes 1 us. With an fpga that ticks along at a 0.02 us clock period, I need to wait 25 to 50 cycles whenever I change lines. Another issue is the lagginess of status signals. Specifically, when you command a seek or restore, the drive may take up to 1 us before the busy signal goes on, thus my state machine thought that the drive had already finished. In fact, the shortest time even with zero movement or an addressing error is 2 us, 100 cycles.
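The wait lengths come from dividing each required interval by the clock period. Here is a minimal sketch of that arithmetic, assuming the 0.02 us (20 ns) clock period mentioned above; the function name is made up for illustration and is not part of my VHDL.

```python
CLOCK_PERIOD_NS = 20  # 0.02 us per fpga clock tick, as stated in the text

def cycles_for(interval_ns, period_ns=CLOCK_PERIOD_NS):
    """Smallest whole number of clock ticks that covers interval_ns."""
    # Integer ceiling division avoids floating point rounding surprises
    return -(-interval_ns // period_ns)

# Intervals quoted from the Pertec documents, in nanoseconds
requirements = {
    "setup, minimum": 500,
    "setup, longer cases": 1000,
    "shortest busy time": 2000,
}

for name, interval in requirements.items():
    print(name, cycles_for(interval), "cycles")
```

These are bare minimums; a state machine can wait longer than the computed count to leave some margin.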

I modified my state machines to stick in those wait durations before lunch was over and went out to test some more. First shot, didn't work. This evening I stuck in some diagnostic indications and retested (as well as looked over my logic for flaws). Even with greatly extended times, the FSM wasn't working. More seriously, once I had every valid state of the FSM lighting an indicator, I saw that it would often get lost in some invalid state - not covered by 'others' or any valid state.

I have to figure out why I have this flaky FSM behavior, as that seems to be the likely source of the problem, not my drive nor the intent of my logic. This will take some study. I guess I can work out the logic to read in and decode the header section of the first sector while I mull over the main problem.

Hooked up fpga controller to Pertec D3422 drive for testing

Today I was extremely busy with other activities, such as my job, but I was able to get in to the workshop at 5PM and get a bit done.


I set up the testing area to verify voltages on the connectors before I hook them into my fpga board - fired up the disk drive and then probed the lines. No high voltages were present and no short circuits, so it was safe to hook the connectors to the Nexys2 fpga board and begin the next round of testing.

Connecting the 48 small connectors to the fpga board took time to get right, but by the early evening I was ready to put in some preliminary vhdl to the board and fire it up. The intent was to watch the status lights and seven segment display to see if the status appears correct.

It became obvious that the termination resistors weren't connected to +5V inside the disk drive, so the lines are not pulled up as they should be. I know which connectors on the interface card carry these voltages, so it is time to trace out what I need to do in order to power the termination properly.

I haven't finished all of the logic I wanted, but put enough in to try a few commands. Specifically, I could start or stop the drive, command it to do an emergency unload of the heads, and ask it to seek to a given cylinder or the home position. Once the terminator pullup power is in place, I can give it a try.

First connection of my controller built to drive the Pertec drive


I completed assignment of the input and output signals of the Pertec interface to specific pins on my Nexys2 fpga board. Now I am wiring the cables to connect the two. In addition to the cabling, I have to whip up some logic in the fpga board to drive the Pertec to test out various functions. My first set of tests focuses on basic and arm movement functions.

I spent my lunch hour coding VHDL and began wiring the cables. The work is very slow and tedious, with lots of cross checking including using a continuity checker. It took about two hours to complete the first of two 50 conductor ribbon cables.

Only some of the conductors are connected to the fpga and these skip past ground conductors in a random looking pattern. The FPGA board has multiple 2x6 sockets around the periphery but four of the 12 pins are for ground or power, thus I can connect eight signals to each of these 2x6 positions. I am using 1x2 connectors to hook to the fpga board 2x6, thus four separate small connectors plug into each 2x6 position. Checking, stripping, soldering, crimping, insulating and checking again - finally I had finished the first 50 conductor cable breakout to twelve of the 1x2 connections into the fpga board.
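The connector counts above are simple arithmetic on the socket layout. Here is a small sketch using only the numbers from the paragraph; the variable and function names are just for illustration.

```python
PINS_PER_SOCKET = 12       # each 2x6 socket on the fpga board
RESERVED_PER_SOCKET = 4    # pins taken by ground or power
SIGNALS_PER_CONNECTOR = 2  # each small 1x2 connector carries two signals

signals_per_socket = PINS_PER_SOCKET - RESERVED_PER_SOCKET
connectors_per_socket = signals_per_socket // SIGNALS_PER_CONNECTOR

def sockets_needed(connectors):
    """Number of 2x6 sockets required to seat the given count of 1x2 connectors."""
    return -(-connectors // connectors_per_socket)  # ceiling division

first_cable_connectors = 12  # breakout of the first 50 conductor cable
print(signals_per_socket, "signals per socket")
print(connectors_per_socket, "connectors per socket")
print(first_cable_connectors * SIGNALS_PER_CONNECTOR, "signals from the first cable")
print(sockets_needed(first_cable_connectors), "sockets for the first cable")
```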

The second 50 conductor cable has about the same number of connections, so wiring it up will consume the late afternoon/early evening. I intend to check all of the resulting connectors for shorts between the two signals that are side by side, plus verify that none are connected to ground, doing this for all the cables on the interface. The incoming signals in the second cable will be checked for shorts to ground and for high voltages, before I connect them to the fpga board sockets.

I may have enough VHDL complete to allow the board to display status of various types, such as unit ready, lack of busy, index and sector pulses plus the sector counter all whirring faster than the eye can see, and the absence of any detected malfunctions. Thus, if all the wiring and testing is done by tonight I can fire the system up for a first look, but I am expecting this will be deferred until tomorrow as I run out of free time.

As daylight faded, I had finished the wiring and testing of my connections but hadn't powered up the drive to check for high voltages. In order to do this, I need a mount for the fpga board that will allow me to open the drive and observe its behavior. Late tonight I set up a stand and prepared to do the voltage tests on Wednesday.

Monday, July 20, 2015

Pertec drive solenoid working, spins up to Ready status, now I build test controller to validate operation


The new fuseholder is working fine and I have normal power supplied to the logic boards. I went back to troubleshooting the solenoid lock behavior and safe status signal to the +5V Timer board. Exactly as I suspected the power transistor Q42 was blown. A roundtrip drive to Anchor Electronics netted me a few of those components.

After replacing the transistor, the solenoid snapped open and closed as it should. As well, the +5V timer board did its thing - after roughly six minutes of inactivity, the relay snapped closed and the lamps extinguished on the drive (other than the main on/off switch light, which is powered by the nonswitched 5V from the timer board). A push of the start/stop switch turns the power back on.

I then did another spin-up and let the heads load, just to see whether the logic would advance to Ready status now. Initially it didn't, but it was a quick fix to get the drive to go ready each time it was started up. The status and sequence logic appears to be working properly.

My expectation is that most components are working correctly, which warrants further work on the drive. I dressed all the cables with cable ties to neaten up everything in preparation for the next phase of testing. I have a six foot work table that I want to place the drive on, but due to its high weight I have to wait until I have a few friends here to lend a hand.

Now to whip up an interface to a spare Nexys2 FPGA board and use it as a controller to test the servo operation. Once I feel comfortable that the arm is moving properly and under interface control, it will be time to begin reading from the pack.

The terminology for the interface is that a true condition is a low logic level, 0V, while a false condition is high level (+3V). This can be thought of as inverted logic, but some usage can be confusing. For example, where the documentation mentions that some events occur at the trailing edge of a signal, that means the positive going edge since the pulse is an inverted signal dropping to zero for a short period.

Pertec is mostly consistent with this approach, except for the data and clock signals that are emitted during reading. In the case of clock pulses, the time to examine the data signal is while the clock is high, not low. A clock begins with a drop to zero, beginning a bit cell, then when it swings back to high is the time when the data value is read - not just at the 'trailing edge' of the clock but during the interval while it is high.
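The two conventions can be sketched in a few lines. This is only a model under stated assumptions: levels are represented as 0 and 1, the function names are invented, and the data levels are returned raw because the text above does not say how a data level maps to a bit value.

```python
def asserted(level):
    """Control and status lines: a low level (0) means the condition is true."""
    return level == 0

def data_windows(samples):
    """Group raw data levels by bit cell.

    samples is a time-ordered list of (clock, data) level pairs.
    A falling clock edge opens a new bit cell, and data levels are
    collected only while the clock is high within that cell.
    """
    windows = []
    current = None
    prev_clock = 1
    for clock, data in samples:
        if prev_clock == 1 and clock == 0:
            # Falling edge: close the previous cell, start a new one
            if current is not None:
                windows.append(current)
            current = []
        elif clock == 1 and current is not None:
            # Clock is high inside a cell: this is when data is examined
            current.append(data)
        prev_clock = clock
    if current is not None:
        windows.append(current)
    return windows

trace = [(1, 1), (0, 1), (0, 1), (1, 0), (1, 1), (0, 1), (1, 1), (1, 1)]
print(data_windows(trace))
```

Samples taken before the first falling edge are ignored, since no bit cell has started yet.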

I developed a plan for the use of the 8 slide switches, four pushbuttons, eight LEDs and four 7-segment display digits, and blocked out the basic timing I needed for the interface. I haven't yet allocated the specific IO pins and connections I will need, but I expect to work on this tonight and tomorrow.

Sunday, July 19, 2015

Pertec drive issue appears to be simply a bad fuseholder


The +5V Timer board, which was undocumented, was one area I needed to investigate during my debugging, so I traced out the circuit, drew a freehand schematic and labeled the board layout. I also made it available to others who may be working on Pertec disk drives with this optional feature.

Mostly hand-drawn schematic of timer board in Pertec drive
Board with components labeled
I removed the defective fuse holder and prepared the power supply for a replacement part. The fuse wasn't blown, it just wouldn't maintain a closed circuit so that the -20V raw supply never made it to the rest of the circuitry. I am feeling better about the condition of the Pertec drive as a result.


I moved a few items into the datacenter shed to make room in my main workspace. I need to elevate the Pertec drive on a stand where I can put in the intensive time and gain access to all sides easily. 

Done with DOS/VS, digging through Pertec problems, and repairing Nixie/Dekatron clock


I finished up my DOS sysgen tasks to my satisfaction. I have the listings of the libraries and other contents of the fully functional DOS/VS 34 system, having done various tasks such as deleting unneeded programs from libraries. Now I can move back to VM/370 on the P390, abandoning the Hercules simulation on Windows and running on a 390 processor chip.

Listing of the disk pack holding my generated DOS/VS system

I discovered that my -20V raw supply had blown its fuse again, which was the reason that the solenoid lock wasn't getting full power. Surprising that the logic seemed fine with this, lighting the safe light and spinning up the system, but in any case this isn't good. I am a bit concerned that I might have a high number of failed components, but will soldier on working problems as I find them. At least, for a while.

There is circuitry to verify that all regulated power supplies are operating at normal levels, which I would expect to be a show-stopper for trying to spin up and load heads. This is going to require some probing and voltage validation before I next try to load heads on the drive.

The -20V supply fuse holder itself is not in good shape, which is contributing to the problems. I have another on order and will swap it out. When I replaced the fuse and powered up, I saw a blip on the -10V regulated line before it went to zero (from a newly blown fuse). I thought I had been back to basics and had a solid power supply situation, but that changed in the last day or two.


I found the high voltage supply section was not working - I pulled the two small and one power transistors and found that while the power transistor survived, the two smaller ones had failed. Replacements are on their way in a few days.

Friday, July 17, 2015

P390 continued DOS sysgen, Pertec disk drive restoration continues


I found a manual from a newer version of DOS (VSE) online which mentioned the error message 0I68A that I am receiving when my new supervisor fails to IPL. Based on that manual, the error code is telling me that I over-allocated the real storage of the Hercules simulated processor, putting too much of it into the real area thus not enough is there for the supervisor to complete IPL.

I had a bit of time at lunch and reassembled the supervisor, then booted it up to see how I did. I had a successful IPL of my system, without any of the customizing ADD and DEL commands previously required - simple IPL commands of SET and DPD to start up a DOS system.

Next up, I will play with the standard assignments, labels and procedures to get this where I want, generate my own Power/VS spooling system, then ensure it all comes up properly and works as intended. I started writing the JCL, macros and utility commands I needed tonight, to be as prepared as I could for when I 'powered on' the mainframe to test. That is the approach used with real systems of that era.


I had to replace some foam that sealed up the area where the heads enter the disk cartridge, since the original foam had become crumbling dust, exactly the opposite of what you want anywhere near the ultra thin spacing of the heads over a spinning disk. I had some suitable supply on hand and installed it today.

Since the timer board is not doing its job, I needed to have a schematic but none is available. Fortunately, it is a low density PCB with big visible traces that are easily followed. I whipped up a circuit diagram in the remainder of my lunch hour and in the late afternoon. Once done, I shared it back for those who might also need this documentation.

I started tracing a few voltages to figure out what was happening with the timer board function, as well as the solenoid lock and ready status. I see that the solenoid driver is behaving as I would expect it to if transistor Q42 were an open circuit. It is a two transistor cascade where the first (Q43) pulls a small current through the solenoid that biases the higher power transistor Q42 on, which increases the current.

My symptom is that the solenoid barely moves or has to be pushed. Measuring the voltage drop and absolute levels shows me that the solenoid is getting about 5V across it, but should be dropping 25. The circuit values would cause this if Q42 didn't conduct and only the first stage was acting on the solenoid. I will pull it out of the circuit and check it out.

The timer board senses the voltage from the solenoid to determine that it is in safe mode and should start the timer for the 5.5 minute power-down. Since it doesn't see the condition, its behavior is understandable.

I ran out of time tonight but my next steps will be to step through parts of the circuit, validating the expected voltages or signal states. I might have several other parts that failed due to the power anomaly that released the magic white smoke.


A number of people with 026 and 029 keypunches were looking for a source for the v-belt that is used inside the mechanism. I used one of the NOS belts from CHM to measure and study the part, then located some alternatives that should be close enough. Two are on the way to me for testing.

Thursday, July 16, 2015

Pertec drive loading heads on disks successfully, DOS/VS sysgen proceeding

Work was quite heavy and kept me from the workshop for all of yesterday but I wrestled some time free to play around at lunch time and in the evening.


I prepared my supervisor assembly deck today, selecting the options and configuration I wanted. I wrapped it in JCL and brought it over to submit to my DOS/VS system under Hercules. The first attempt turned up one tiny typo - an extra space on an assembler statement, which pushed the statement continuation character out of column 72. Mainframe systems were often highly sensitive to specific positioning - things had to start in a given column number or the code wouldn't assemble or compile.
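To show how one extra space causes this, here is a small sketch that checks column 72 of an 80 column card image. The statement text is a made-up placeholder, not a line from my deck, and the helper name is invented for illustration.

```python
def is_continued(card):
    """True when the card image has a non-blank character in column 72."""
    image = card.ljust(80)     # pad short lines out to a full card
    return image[71] != " "    # column 72 is index 71 when counting from 0

# Hypothetical statement with the continuation character 'X' in column 72
good = "         SUPVR OPTION1=YES,".ljust(71) + "X"

# Same statement with one extra space, which shifts the rest of the card right
bad = good.replace("SUPVR ", "SUPVR  ", 1)

print(good.rindex("X") + 1, is_continued(good))
print(bad.rindex("X") + 1, is_continued(bad))
```

In the second card the 'X' has moved to column 73, so column 72 is blank and the statement is no longer marked as continued.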

I then finished my supervisor assembly, linked it into the core image library, and did an IPL to test out the new supervisor. Unfortunately, when I entered the supervisor name, the IPL failed with a compatibility error 00. Whatever that is. Time to dig through the various manuals to figure this out.
I also looked through the 'printer' output when I assembled and linked the supervisor.


A fellow vintage computing enthusiast has been helping me with documentation and advice on the Pertec drive. I had been a bit hampered by the existence of a board installed over the voice coil arm positioner, labeled as a timer board, that had zero schematics and only one cryptic mention in any of the schematics or other manuals I previously had. I now have information on this optional feature which will help me further in debugging and restoring my drive.

The main purpose of the board is to save energy. It monitors a drive for when it is idle without the cartridge spinning, turning off the +5V regulated power if the machine has been in that condition for 330 seconds. I never saw that behavior because I didn't leave the drive powered up for any longer than I needed for various tests.

My schematics for the drive seem to all reflect the changes that are made to connect to the +5V timer board, but no schematic of the board exists. At least I know how it was hooked in. Based on this, I could begin diagnosing my current issues. I made a list of points to monitor on the logic board, helping me figure out which condition is keeping the drive from lighting the 'safe' lamp.

After about a half hour of work, I now have the 'safe' light illuminating and the lock solenoid releasing. The unit seems pretty well healed now, so it is time to turn my attention to final cleaning of the platters and heads in preparation for first contact sometime in the coming days.

I cleaned the active area of the top of the fixed platter. I still need to get access through the bottom of the machine to do the same cleaning of the bottom of that fixed platter. Next would be the outer edge of the top and bottom since the heads have to slide through here to load.

When the fixed platter is done, the disk cartridge will need a good cleaning of its platter, top and bottom. Finally, the four disk heads must be cleaned and inspected to be sure they are fit to lower onto the platter surface rotating at 2400 rpm since they have to fly on an air cushion much smaller than a dust particle.

A prefilter and a HEPA-style absolute filter are ensuring that the air blown into the disk cartridge and fixed area are free of particles big enough to crash between the head and disk surfaces. I will vacuum them out and reinstall them. Once all these cleaning steps are complete, it will be time to get the drive to load up heads onto the disk platters.

I decided to fire up the drive and test the function that will shut off the +5V power after the drive is idle ("safe" lamp on) for more than 5 1/2 minutes. It did not turn off the power. This needs investigation.

I also have poor behavior of the lock solenoid. It should unlock as soon as the 'safe' light turns on, then lock when the drive starts a "run" cycle or when power is switched off. The light is on, but I have to wiggle the solenoid to get it to snap unlocked. Similarly, it is not very reliable in locking when the drive begins spinning.

I have the emergency unload bypass jumper installed, which ensures that the emergency unload doesn't kick in and try to pull the disk arm back to its rest position. I also unplugged the voice coil itself, so that the arm won't try to move out and load the heads onto the disk platters.

I can verify that the drive spins up, then increases speed for thirty seconds to purge the disk of any residual dirt or dust particles, but at this point it wants to load the heads which I have blocked by my removal of the J205 cable to the voice coil positioner.

Once I had opened the fixed platter section, carefully cleaned the top and bottom of the platter, reassembled that section, I cleaned the platter in my disk cartridge. Last stop were the heads, which were cleaned and inspected.

I wanted to load the heads by hand - meaning get the platter spinning and then push the positioner out to put the heads on the disk and slide them along. I held the safety switch with no cartridge in the drive, so that I would only be loading the lower set of heads onto the fixed platter. It worked well, no problems and no marks.

Since the fixed platter loading was smooth, I inserted the cartridge and hand loaded both pairs of heads onto the two platters - just one positioner moves the heads to the same cylinder for both platters. That went smoothly.

After everything looked good, I removed the emergency unload bypass jumper, plugged in the voice coil, and prepared to start it up. The drive sequenced itself up, spinning at 2400 RPM and then faster for 30 seconds in a 'purge' cycle which blows any residual dust or particles off the disk. Then, the positioner shot out, loading the heads on the platters. Very nice indeed!

The 'ready' light did not go on, but the heads remained flying at their loaded position, which means the drive didn't detect any error that warranted a retraction and spin-down. Since I don't have anything hooked to the interface of the drive, there could be a signal or two that are needed before the drive reports itself 'ready'.

It is time for me to re-install cable ties to keep the cabling neat and out of the way, since I had to cut apart the old ties when I was removing boards and heat sinks during my power supply repair project.
Debugging from here will involve some status checking using probes, then I will set up an FPGA to act as the controller and computer, allowing me to exercise the servo positioning, test the various commands, and get myself ready for the reading and writing testing that will come later.


I had built a clock, based on someone's kit, that would display the hour, minute and second as decimal digits using Nixie tubes, flash the seconds with neon lamps shaped as colons between the three pairs of Nixie tubes, and with a Dekatron tube showing a rotating pendulum. When I completed it, I used some old Nixie tubes I had bought on ebay years ago, but there seemed to be problems with those.

I just received my shipment of NOS Nixie tubes made in the USSR, which I plugged in before firing up the clock. I did see the digits lighting as they should for a short time, but as I had the Dekatron cable plugged in reverse, bad things seemed to happen. The project went dark and the high voltage has disappeared (which is essential to light the Nixies and neon lights).

I will have to dig out the schematic, debug the board to figure out what damage I did and repair it so that I can get my retro technology clock working. The project is a filler activity, so I may wait a while before I get back to it.

Tuesday, July 14, 2015

Checking out Pertec startup functionality plus achieved DOS/VS generation (but using Hercules)


I took my download distribution tape files over to a home PC where I had installed Hercules, and prepared to do my sysgen here instead of on the P390 (as the P390 won't support 2K pages which rules out ever running DOS/VS 34 on that machine).

Brought it up, IPLed the distribution tape and wrestled with Hercules for a while, until I could get it to IPL the tape and then read the virtual card file with my JCL. The distribution tape initialized the SYSRES disk volume and then restored the contents of the 'tape', using the default allocation for the four libraries (core image, relocatable, source and procedure).

It then was time to boot the new DOS pack and continue the sysgen. This was where I had hit the brick wall with the P390, but it all went swimmingly at this point. I used a default supervisor, modified the device configuration with ADD and DEL commands, then brought the system up. I formatted various first-time files and brought it all the way up to a normal running condition with Power doing the spooling.

The system is annoyingly fast - far far faster than on a real mainframe or on the P390. Restoring an entire tape to a 3350 volume took a couple of seconds, for example, and formatting all the power queues took mere seconds.

At this point, I have to work through the Sysgen manual and distribution tape cover letter to decide what supervisor and what other options I want on my system. This will involve running various jobs, assembling the new supervisor, and some other tailoring; for example, I will update the standard labels to match the way I expect my system to work. It won't be much work to prepare and then the actual execution will be almost faster than I can click the submission buttons, given the speed at which the simulated mainframe operates.


Power on testing with the two crowbar SCRs installed went fine. I then put the raw power supply back in the disk drive chassis, put the servo board back as well, and cabled everything up. I was quite careful to make sure everything was wired and buttoned up properly.

I took a bit of time to install cable ties to neaten up the wiring and keep things safely out of the way of both moving parts and swing-out sections for servicing. When I was certain that it was put together properly, it was time to apply power.

It came on, with the regulated power holding steady at appropriate voltages. When I pushed the disk start button, I heard the motor start to spin up. I cancelled that fast, since I am nowhere near ready to have the heads load.

The positioner is still sitting with the emergency unload relay inactive, and the solenoid isn't released to let me remove or insert a disk pack. The arm position detecting lamps were lit and plenty seemed good, but not with an unlocked solenoid. I may have a few more op amps or other parts that are bad, but it will be much easier to debug this now that it can hold power and let me monitor various logical states.

The 'safe' lamp is not illuminated when I power up, which is a necessary condition in order to release the lock solenoid. I need to find the reason that the safe status is not achieved. I am surprised that the logic will spin up the drive if the safe lamp isn't lit, but it is doing so at this time. The lack of a 'safe' status is the first issue I will shoot.

Also, I see the emergency unload relay is not energized - whether that should happen at power on or only after the drive gets up to speed is the uncertainty. Still, I will check things around this to see if I spot any defective parts.


I still don't see any way that the connector pins I hooked to my SAC Box are related in any way to the fpga configuration signals. I am a bit wary of just hooking up the known good board, lest it too break due to some unforeseen interaction with my interface boards. However, I have to move forward at some point. The risk is destroying another $100 board if this problem stems from an own-goal, but I am about at the limit of what study can be done prior to live fire testing.

I am going to order the replacement board from ztex but with the stipulation that they have programmed the board with their 'light show' demo so that the board comes up running that code when powered on. That assures the power-on self configuration was working when they ship it out, and will be the first test I perform when I receive the unit.

Monday, July 13, 2015

Will never be able to run DOS/VS on P390, put crowbar circuits back into Pertec and worked on ztex board issue






A fellow enthusiast has gently steered me to the realization that the basic design of DOS/VS is oriented to 2K pages, but the P390 only supports 4K page tables and protect keys. Thus, there is no way that DOS/VS is going to run on the bare metal P390. Further, since CP/370 will use the same page table scheme as the guest, putting DOS/VS in a virtual machine will not resolve the 2K page table problem.
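To illustrate why the two page sizes are not interchangeable, here is a simplified sketch of how the same address splits into a page number and a byte offset under each size. It ignores segment tables and storage protect keys entirely, and the function name is made up.

```python
def split_address(addr, page_size):
    """Split an address into (page number, byte offset) for a power-of-two page size."""
    shift = page_size.bit_length() - 1   # 11 bits for 2K pages, 12 bits for 4K pages
    return addr >> shift, addr & (page_size - 1)

addr = 0x1A2C
for size in (2048, 4096):
    page, offset = split_address(addr, size)
    print(f"{size // 1024}K pages: page {page}, offset {offset:#x}")
```

Page tables built for one size index memory differently from tables built for the other, so a system expecting 2K pages cannot simply be handed 4K translation.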

In fact, I was seeing CP itself crash, which I believe is due to the 2K page tables it attempts for the DOS VM. Thank you, Glen, for helping me see this. While later on, as VSE, the operating system evolved to use 4K pages (and ESA mode), that software is a licensed product that I have no entitlement to use nor access to download.

To complete my itch to do a ground up sysgen, I will need to take the DOS/VS work over to Hercules on a PC, abandoning the P390. At least I can wrap up the sysgen fairly quickly and satisfy the nostalgia, leaving the P390 for playing with VM and MVS kinds of code.


I installed two of the three crowbar SCRs today. Part of the challenge is that the board has traces on both sides, but the process of desoldering removed the scanty copper lining the holes - thus I need to have the lead soldered on both the top and bottom in some cases. This is easy with long leaded components like resistors, but not so good for a transistor that bolts down to the board. Two of its leads are hidden under the transistor.

My solution was to use solder paste - a paste that has small beads of solder, intended to be melted in a reflow oven. I put that on the top trace and into the hole, then pushed the transistor leads through. After bolting it down, I had solder paste in contact with the bottom and top traces plus the lead, so a bit of traditional solder applied to the bottom causes both sides to flow.

I verified that the hidden top traces were in contact, and that no unintentional solder bridges were formed, by using a VOM as a final quality check. The third SCR was bad and the replacement part I originally ordered on eBay was also defective when I tested it. I have a NOS unit coming from a more reliable supplier and I will place that on the board once it arrives.


Both Richard and I were digging through the schematics and other documentation to find any way that I might have affected one of the signal lines that is needed to have the fpga configure itself from flash on startup. Neither of us can see those lines routed to the IO connectors where I attached.

I wanted to check this before I took the board from Richard, which is working well now, and hooked it up to the SAC box, just in case something in my wiring is damaging the ztex board.

Sunday, July 12, 2015

Pertec power problems appear fixed, diagnosis of ztex fpga board issue and P390 DOS/VS work done


I hunted for prebuilt DOS/VS packs I could use on the P390 under VM. The first system I found, consisting of two packs, has the exact same error when I IPL - takes the supervisor name then goes into a loop.

My VM system is set up for a five pack DOS/VS system, which matches lots of posts and comments online but I had to hunt quite a while to find an active site where I can download these five almost-mythical packs. I found them (in compressed CKD format), converted them to the AWSCKD format and burned them onto CDs for transfer to the P390 system.

It is a slow process to transfer them over, but I put in the time tonight in order to attempt to bring up the five pack DOS/VS system and begin debugging my from-scratch sysgen process. I followed all the directions exactly, but each time that I 'dial in' to the DOS/VS machine to boot it up, VM restarts because something bad happened.

This is quite frustrating - doing a direct IPL of the pack gets the same tight loop in the supervisor as I experienced with the other pre-built system and that I see with my restored pack from the distribution tape. At this point, I suspect I will need to move over to a PC and use Hercules to fire up the premade DOS system, look up the supervisor code and then debug on the P390.


I didn't like the way that the parts tester reported on one of the two darlington transistor pairs - these are a pair of transistors, a couple of resistors and a diode all stuffed in a single TO-3 transistor can. Thus the three terminals, E, B and C, are composite devices and not single junctions or simple devices that the automated tester can understand.

The cost of the parts is low so I picked up replacements, plus a few more of the LM741 op amps that I keep frying. I also worked out a testing regime that should pinpoint the problem with the positioner power without having to burn up op amps or blow fuses. I removed the two E terminals again, so that the rest of the circuitry behaves well, and then tested the positioner circuitry.

I looked at what voltage is present on the driving op amp, its +, - and output levels, to divide the investigation into the logic that sets the voltage versus the circuitry that amplifies the voltage and powers the voice coil to move the arm.

Since I was suspicious about one of the darlington pairs, that is the place I expected to see a problem. Specifically, I expected the upper pair that passes the +20V to be essentially a short circuit, delivering +20V (minus voltage drops in the circuit), while the bottom transistor pair should be near ground.

Based on the wacky results with the component checker, and the perfect results of the replacement darlington pairs, I replaced the parts on the heat sink. They were clearly bad, one short circuited and the other with a damaged junction or junctions inside. I don't want to think of the possibility that the voice coil itself was damaged, as that is a non-replaceable part which renders the entire drive into junk or a spare parts repository.

First, before I put power to these transistors and possibly fried them again, I did the check on the op amp with the transistors out of circuit, since the op amp tells the power section what voltages to deliver. If the op amp is swung towards the positive rail (+10V), the power section is legitimately driving towards its most positive level.

If the op amp was not putting out nearly 10V, I would have a problem in the power section. If the op amp was driving high, I would have to investigate the cause upstream of the amp, starting with the values on the + and - inputs.

The testing quickly showed that the op amp was driving out about -3.6V, but with its inputs at approximately 0.002 and 0.003, so I am not sure this is as it should be. With the new transistors out of the circuit, this might be how it would act. I connected the darlington pairs back into the circuit and fired everything up.

Everything held - good solid regulated power at +10V, -10V, +5V and -5V. The positioner amp output was running at almost -20V, which is the emergency unload level that protects the heads by pulling the arms out of the pack when things are not safe. The emergency unload relay doesn't activate until several 'safe state' signals arrive at the transistor that energizes the coil.

Those signals will be produced on other boards, but those boards are not connected to this one currently since I am still debugging everything. I am feeling more optimistic now that the power seems to hold without smoke, fuse pops or other problems. Time to re-install my SCR crowbar protection components then move forward on my restoration.


Richard Stofer generously offered to meet me, bringing his setup, so that we could get to the bottom of the problem I am having storing my fpga code in the ztex board in a way that will autoconfigure the fpga at power-on.

We met at noon in a Starbucks about equidistant from our homes, since we are almost two hours apart. We both have the same boards and our own toolkits on laptop or tablet machines, which made a set of substitutions the right tests to perform first.

My toolchain (Xilinx etc) was able to load the flash memory on Richard's board and it booted up fine when power was cycled. My board continued to fail. Richard used his toolchain to write to my board's flash, which reported success as it always does but in fact my board didn't boot.

We ran the flashbench demo which writes to and then reads from flash, verifying that my flash is functional. We loaded the fpga just fine dynamically, but nothing would make my board load the fpga at power up. We have to conclude that my board is somehow broken.

I chose to do some checking to see if there is any way that my circuitry in the SAC box might be able to damage an fpga or USB chip pin that is needed for power-on autoconfigure but not used to separately load the fpga or access the flash. I had to study the schematics and other documents to learn what IO pins are used for power-on configuration, then see whether (and how) I might be using them.

Richard gave me his board to speed up my resumption of SAC Box debugging, since it will take some weeks roundtrip to get a replacement board from Ztex in Germany. Once the replacement comes in, I will return it to Richard. 

Saturday, July 11, 2015

Pertec power issues - two steps forward, one step back, plus toying with P390


I attempted to set Control Register 0 bit 7 as this often helps with older OS running on the P390, but any of the available supervisors remain in a loop as soon as I load the CR. I don't have the source code or listing of the supervisor, nor any definition of what is configured in the six IBM supplied supervisors. If I had those, I could do address stops, single step and other debugging of the supervisor to figure out the specific cause of the loop and correct it.

My goal is to build a DOS/VS system from scratch, armed only with the IBM documentation and the distribution tape, just as I did it 42 or 43 years ago. There are prebuilt images of DOS/VS that I could fire up, having been uploaded by Hercules 390 emulator users. I have avoided this because it thwarts my objective of a ground up system generation. I may have to use the packs solely to get past this IPL loop problem, giving me the data to debug and the ability to produce and store a supervisor with the appropriate fixes.


Still chasing the cause of the high drain that blows the +20V fuse. Pertec has a couple of versions of the servo board. One has plugs on the board that can be pulled to isolate the regulators from all the other loads that use the power. It would have been very handy since my problems are likely caused by the consumers of the raw +20/-20 power.

I decided to isolate where I could - there are two darlington pair transistors on one of the heat sinks that are connected when I connect the +10V power transistor that shares the same real estate and cable. With those two transistors disconnected - they are the power to drive the positioner to move the arm in and out - I could bring up the power supply section without any fuse blowing.

Now, I traced voltages and zeroed in on failed parts. The +5V supply was running up at over 11V - not good at all. It would have fired the crowbar and blown the fuse, if I hadn't temporarily removed the crowbar SCRs. When I looked at the op amp that controls the voltage, I saw that the + input was the proper reference level of 5V while the - input was over 11V. This should have driven the op amp towards its negative rail (ground), but the op amp was pumping out the 11V into the power transistors.
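
The reasoning in that paragraph can be written down as a quick sanity check. This is only a sketch that treats the op amp as an idealized comparator; the function names, the 20V upper rail and the 1V tolerance are illustrative assumptions, not values taken from the Pertec schematics.

```python
def expected_rail(v_plus, v_minus, rail_high, rail_low):
    """Ideal open-loop op amp: output heads for the rail picked by the input difference."""
    if v_plus > v_minus:
        return rail_high
    if v_plus < v_minus:
        return rail_low
    return None  # inputs equal: the ideal model makes no prediction


def op_amp_suspect(v_plus, v_minus, v_out, rail_high, rail_low, tolerance=1.0):
    """Flag the op amp when its output is far from the rail its inputs call for."""
    target = expected_rail(v_plus, v_minus, rail_high, rail_low)
    if target is None:
        return False
    return abs(v_out - target) > tolerance


# Readings from the +5V regulator: + input 5V, - input and output about 11V.
# rail_high=20.0 is an assumed value; rail_low=0.0 is the ground rail.
print(op_amp_suspect(5.0, 11.0, 11.0, rail_high=20.0, rail_low=0.0))  # True
```

With the - input above the + input, the model expects the output near ground, so an output of 11V marks the part as suspect.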

I removed the bad op amp, put in a socket to facilitate further swaps if needed, inserted a new op amp and fired up the servo board. Now, everything was great. +10V, -10V, +5V and -5V were all correct and clean. Time to reconnect the two transistors that drive the positioner.

With them back in the circuit, I monitored the voltage sent to the positioner. When I turned on power, I saw +18V to the positioner, which swings between +20 and -20 (actually about 18 to -18) and the newly installed op amp for the 5V signal smoked as it died. Weeeeeelllllllll. Isn't that special!

Not so bad, because I have a socket there now and can quickly replace the +5V op amp. However, this flaw looks like the op amp for the positioner is generating an output at the top rail (+20). I still don't know why it would burn out the +5 supply, since the circuit mainly is connected to raw +20 and -20.

It tells me other components on the board are bad. I have read the adage that when the 'magic smoke' is released from an old system, things are quite dire. I never believed it, because I thought that if the failed parts are found and replaced, the damage being localized, the system could be repaired. In this case, however, it does seem that there are quite a few parts that died when my incident occurred.

Friday, July 10, 2015

P390 playing with DOS, Pertec power supply diagnosis and repair work


When I left off testing, I had formatted the DOS pack using the standalone ICKDSF utility, setting it as a minidisk, but then the DOS standalone restore utility quit with a complaint about the disk. I could download and read the source code for these utilities - once I figure out which of nine optional materials tape images contain the code.

I would have to bring up an existing DOS image in order to run the restore of the tape, so there is some work I need just to do this. It turns out that if there is a premade DOS image I can bring up under VM, it would let me do a restore of the DOS pack I wanted without needing the standalone utilities. That is, I could run the restore utility in a premade DOS image, using it to restore the contents of the tape to my target DOS pack. Then, I would have DOS ready to IPL to continue my ground-up install of DOS.

In the interim, I pulled down a slightly out of date version of the DOS/VS Messages manual to see exactly what it listed as the reason for my message to be issued. That didn't help, but buried in the sysgen manual is one small phrase in a sidebar - the restore program ignores what device type you specify and instead determines it by reading the Format 4 DSCB from the disk volume.

Aha! This is back to the same old problem that I can't get the standalone initialization program to work. My workaround using ICKDSF to do a DOSVTOC, as it calls it, obviously omitted setting the critical device type data in the Format 4 DSCB. Vexation!

I tried building a virtual 3350 that is 560 cylinders, thus containing the space that will be used by the DOS initialization program to format the alternate tracks. I was hoping that the configuration program for the P390 wouldn't impose a 555 cylinder limit on 3350 disk types. Fortunately, it didn't.

The initialization program ran perfectly, as did the restore job. Doh! At least I have the system ready to begin my system generation steps. It may have taken a few extra days to get going, but I am pleased that I am now moving forward.

What I need to do is IPL the newly restored pack, do some special one time commands at IPL and right afterwards, then the install is done. The IPL and 'enter' on the virtual console gives me the first IPL message requesting the supervisor to load. The distribution tape comes with several IBM supplied supervisors with various features and device configurations.

The details on those supervisors are in attachment 2 to the cover letter, but whoever saved the DOS VS 34 cover letter omitted the attachments! I don't know anything about the supervisors $$A$SUP0 to $$A$SUP5, which is a problem.

I can overcome the device address issues by using the IPL time commands DEL and ADD which let me manipulate the devices at various addresses. However, some details in the supervisor may be required to run - e.g. it may need to have been assembled with 3350 device support. Yet, I can't tell what the supervisors contain since I have no documentation.

Once the supervisor name is entered, the P390 sits in a loop, running forever and ignoring the enter on the console that would allow me to start issuing ADD, DEL and other commands. Hmmm. This will take more research. For example, is the supervisor attempting to use 2K storage protect keys, which are a no-no on the P390? Is it lack of 3350 support? Is there some other issue causing the supervisor to fail or loop?


The darned crowbar protective circuit is forcing the fuse to blow so fast I can't sort out what is wrong in the power itself. I suspect a diode is bad but it is hard to test everything in circuit due to the interaction with other components.

I see a way to use my lab power supply to inject +10 and -10 to the circuit in order to see how the +5V and -5V regulators deal with that. The good news is that, with the two external heatsinks disconnected, I got the right reference voltages to the op amps. I removed all the crowbar SCRs and began shooting the issue.

With the heat sinks connected (power transistors producing the regulated +10, -10 and +5 supplies), I still get a fuse popping in the -20V raw supply. The +10V level is fine, but the -10V drops to about 1V because its -20V input supply fuse is blown. I didn't attempt to look at the +5 level, because with the fuse blown, all bets are off.

I decided I should follow the raw -20V line a bit, since it is fed to other circuits that don't need regulated power. I found that the line goes to several places on the servo board, which all have to be checked for defective parts or shorts, as that could be where my problem is arising.

I was hoping to find that most of this went off-board and was therefore isolated since I had most cables disconnected, but alas I see enough connections and components on this board that I know this will be a slow, time consuming search.

I cleaned things up so that if I supply only the -20V raw power (not the +10 and +20), I get good regulated voltages of -10V. If I supply the +20V raw only, I seem to get the right +10V output. Something goes awry when all three raw voltages are present. I see one of the fuses blow but before it happens there is a period where I can verify that both +5 and -5 outputs are good.

Thursday, July 9, 2015

Back and starting on the disk drive debugging

Just got back tonight from my travels this week. Now to play with the 1130 and related toys again.


A fellow enthusiast who has past experience restoring and working with this disk drive has shared some materials and tips with me! I am just digesting what I found right now, after which I will likely have a few more questions. All that must wait until I have the power supply repaired.

For the power supply issue I am debugging, as far as I can see from the schematic there is just one path for the +20V raw power to get down to the circuitry for the -20V raw input, where parts are committing suicide. Everything else is separated by ground in the middle. Thus, unless some failed component has created a sneak path down to the op amp and transistors that are frying themselves, it can only be a small number of parts causing the problem.

The transistor replacements arrived today and I soldered the part in to replace the burned out Q30 in the power supply. I put a replacement op amp into the socket I had installed, to allow quicker replacement until I nail the core problem.

Monday, July 6, 2015

Holiday interruption but some progress on P390 and activity on Pertec

With the big holiday weekend here in the US, I have my daughter's family and their friends over for a belated barbecue which chewed up quite a bit of the day with prep work, shopping and cleanup, in addition to the time together. I didn't get anything done on Sunday.

I am off tomorrow on a business trip thus progress will be quite limited all week.


I was successful using the standalone version of DMKFMT to format the DOS sysres pack, after which I began trying to run the standalone restore program to get the DOS image from virtual tape to the disk.

I had to modify the JCL to run the DOS install, forcing it to skip the initialization step and go right to the restore step. It worked, except for a minor JCL error I had to correct. That is, it worked until it tried to do IO to the virtual disk drive, then it once again failed with the 'invalid drive' error.

I did receive the new CMOS Battery for the server and installed it, now that the post office finally delivered it, several days late compared to when a first class package should have arrived. For several days it was listed as late, but then when it tagged in at my home post office this morning, it was magically adjusted back to 'on time' status. The stats will look really good to top management at the USPS, if they routinely erase evidence of failed service.

Next, I booted the ICKDSF standalone utility tape provided with the P390, which should definitely be able to initialize emulated 3350 packs. This too failed. The good news, however, is that it gave me more information about what is going wrong. It is an unrecoverable error on the alternate track 022B 0000 which is cylinder 555, right where alternate tracks would be if this were a physical 3350 volume.
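
The address in that message is a pair of hexadecimal values, cylinder then head, so the conversion to cylinder 555 is easy to check. A minimal sketch follows; the function names are hypothetical, and the boundary of 555 primary cylinders is taken from the observation above about where the alternate tracks would sit.

```python
PRIMARY_CYLINDERS = 555  # assumption from the post: alternates start at cylinder 555


def decode_cchh(cchh: str):
    """Turn a 'CCCC HHHH' hex pair such as '022B 0000' into (cylinder, head)."""
    cyl_hex, head_hex = cchh.split()
    return int(cyl_hex, 16), int(head_hex, 16)


def is_alternate_area(cylinder: int) -> bool:
    """True when the cylinder lies past the primary data cylinders."""
    return cylinder >= PRIMARY_CYLINDERS


cylinder, head = decode_cchh("022B 0000")
print(cylinder, head, is_alternate_area(cylinder))  # 555 0 True
```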

I researched the error and discovered that the only way to initialize virtual volumes on the P390 is to tell ICKDSF that it is a minidisk, even when it is a full sized 3350. This parameter, MIMIC, will skip the check of the alternate tracks that is causing the problems. Next time I bring up the system, I should be able to initialize the pack properly and from there I should be able to restore DOS/VS.

I got the pack initialized properly as a DOS volume with the VTOC at the end, but the standalone restore program is still crashing with the error (invalid disk drive). I ran out of time to poke at this longer.


I finished the validation that my replaced components had full connectivity - installing a few bridge wires where that was necessary - and am now ready to start testing once again.

The testbed is the raw DC power into the servo board, the two external heatsinks with power transistors, and jumpers to return the -5V, +5V and ground sense returns. This should give me a safe subset to let me bring up the power and test each part of the regulation separately.

When I first brought it up, I saw the op amp and the transistor it feeds smoking. I had thought I had isolated the power sections and was only testing the +20V raw feed, which should have only driven the +10V output. The parts that smoked, however, are on the -20V raw feed side, which produces both -10V and -5V outputs. The last of the regulated supplies, the +5V one, is fed by a +10V raw feed.

I validated that it was not supplying -20V raw, only +20V. That should not be able to energize the circuitry on the negative side - there are just a few diodes and capacitors to ground and I don't see a legitimate path for the +20V to get down to the parts that smoked. This is a good clue even if a bit opaque at first.

Saturday, July 4, 2015

Desoldering station working, not so much the ztex board, and continued progress with the P390


I continued to fiddle with the DOS/VS boot-able utilities, to figure out why they didn't like the 3350 disk drive I was trying to initialize. I extended the card images in the hope that my problem was due to short card images. Same error. I am now left to believe that the lack of alternate tracks on the emulated disk is somehow flagged as a problem, or there is some other incompatibility between the DOS standalone init program and the P390 emulated drive.

I am setting up a boot-able tape image of ICKDSF, the standalone format program, which will format the pack in the OS rather than DOS style. The only difference is that DOS does not use the type 5 record which records free space on the volume and therefore it flips on a 'DOS contamination' bit.

This change happens any time that DOS mounts a pack - the contamination bit is flipped on and the DSCB record is set to zero. If the pack is mounted by MVS, it recreates the format 5 record with an accurate free space count then resets the contamination bit. Therefore as soon as DOS restores the distribution image to the disk, overwriting the VTOC, it will have zero space and the contamination bit properly set.

I moved the P390 and monitor to a more convenient location inside the workshop, since my checkout (and playing around) is going on longer than I expected. I can now sit down with the monitor at tabletop height right behind the keyboard and mouse.

There are so many little details, commands and conditions to remember for each operating system - having to sweep aside four decades of mental cobwebs - but it is slowly coming back. I just remembered that I could 'punch' the standalone initialization program into the virtual card reader in a virtual machine, then IPL that card reader to do the task. That means I wouldn't need to prepare the virtual tape image.


I poked around and found that the connections to the power supply transformer for low voltage for ICs were a bit intermittent. With a bit of manipulation I was able to restore the unit to operation and used it to remove my burnt op amp in the Pertec servo board.

With all the desoldering, the copper rings around the openings are gone and some of the traces aren't close enough to take solder reliably. I put in an IC socket, to avoid this happening again, and then tested and restored connections as necessary to get this socket wired properly. It was tedious work but completed eventually.

While I am at it, I looked at all the other locations where I removed parts - all the tantalum replacements and the other fried op amp - to test every pin for full connectivity to the rest of the board. I was not done by the end of my worktime today.


I tried all the examples and demos in the SDK as a way of learning something about the problems I am having, but essentially all the demonstration programs load the bitstream to the fpga as the Java code on the PC starts, so that it doesn't matter if the fpga board won't boot up on its own. This masks the problem.

Friday, July 3, 2015

This and that, mostly P390


The PCOMOS2 3270 program used on this system is somehow not passing the PA2 or Clear key codes to the running OS (VM/370) but instead resetting the session in a way that forces the user off the terminal and makes them log back in. It displays a PROG775 message as well. Ideally this gets fixed so that I can hit the clear key to move forward when a terminal screen sits in the "More" state.

For the time being, I have overcome this problem by changing the timing of the 'more' function down to 5 seconds from the default of one minute. It is much more tolerable this way.

I set up a DOS virtual machine to do the DOS/VS 34 sysgen from the distribution tape. I had to fight the P390 system to produce output files from the spooled printing that everyone insists on doing - DOS to VM to P390 to OS2. I see an error message for the standalone disk initialization step saying that X'150' is not a valid disk drive. Progress.

This is typical of mainframe systems - it can take a while to figure out what is not working right; it is an environment with quite limited feedback. Sometimes you have to go to the source code, look up what causes a given message to be issued, and then step through the disk or memory to locate the trigger. Even then, it may take a bit of inspiration to figure out how to fix it, although usually the course of action is clear by this point.

I continue to have to reconfigure the settings each time I power on, because the CMOS battery is dead and the replacement is meandering through the lazy, unreliable USPS system - listed online as due for delivery yesterday but no sign of movement since it hit Dallas five days ago.

Went through a few tests, changing things, but still getting the error on the initialize disk utility. After I shut everything down, it occurred to me that this might be a very simple issue that the card images I am sending to the virtual card reader are not a full 80 columns long. I am not sure of the behavior of the spooled card readers - if they pad then everything is good, otherwise it might be looking at garbage on the remainder of the card buffer, causing the error message.
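
One way to rule out the short-card theory is to pad every card image to a full 80 columns before sending the deck to the virtual reader. The sketch below assumes the deck is held as plain text lines on the PC side; the function names and the sample cards are made up for illustration, and any conversion the spooled reader expects is not covered.

```python
CARD_WIDTH = 80  # columns in a punched-card image


def pad_card(line: str, width: int = CARD_WIDTH) -> str:
    """Strip the line ending and pad with blanks to a full card image."""
    text = line.rstrip("\r\n")
    if len(text) > width:
        raise ValueError(f"card image is {len(text)} columns, more than {width}")
    return text.ljust(width)


def pad_deck(lines):
    """Pad every card in a deck to the full width."""
    return [pad_card(line) for line in lines]


deck = pad_deck(["// JOB SAMPLE\n", "/*\n"])  # made-up sample cards
print([len(card) for card in deck])  # [80, 80]
```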

Frankly, since the error concerns the suitability of the disk drive, it might be an issue with the emulation of the drive - lack of CE tracks or something. At this point, I could initialize the pack with a known good program, then skip over the utility on the tape to run the restore job.


I have heard from the owner of the 1800 system that he has been working on the 2311 disk drives as well as mastering the design of the 1800 processor itself. He just finished restoring a couple of ASR-33 teletypes and a Honeywell 316 minicomputer. His pictures of the cleaned up disk drive just sparkle - he has certainly cleaned it up superbly.


I can't find any manuals at all about the PP-8087/u rework station I have, which looks like it was built by Pace for the US military. I can't find a close enough equivalent Pace model to get the documentation. Unless I am lucky and find something that is causing it to fail to stay on, I may not be able to make use of this any more.


I have reloaded all the SDK code from scratch, used the files that worked properly at Richard's home, but they fail exactly the same here. This is quite frustrating.

P390, SAC Interface ztex board and other work

In the late afternoon, we heard a loud crack as the power went out in our area. A large tree had cracked, fallen across high voltage lines and snapped them. Through most of the evening we heard chain saws and cranes working, then slept in the dark until power came on in the wee hours. Needless to say, it cut short my work on the 1130 and related projects.


I brought over a different DOS/VS tape image but still get the wait state when I try to IPL the tape, which doesn't start up when I 'ready' the JCL deck in the card reader. Not sure what is happening but will keep investigating. I can see that the P390 has an interrupt pending for the running code, but it isn't enabled for interrupts I suspect. On the other hand, it is not a disabled wait state either.

I can just do this all under my vm/370, which should at least give me more useful status if it doesn't work properly. That is, once I can power this up again.


I received some new desoldering tips for my Pace station today, which should make removal of the newly damaged replacement op amp easier. I put in the new tip, switched on the desoldering station, and . . . nothing. The fuses were all good but when I flipped on the main switch, nothing worked.

I opened it up to check for presence of voltage, including to the main rocker switch that powers it on and off. While manipulating the connections to the switch to test voltages, suddenly it would light up for a few seconds and then go off. The pump motor tried to work while the light was on, which means this is a true power issue and not just the bulb in the switch.

There are times I have to hang on tight to my rationality, lest I believe there are malevolent spirits just waiting to toy with me like this. Why would it fail right at this moment? Putting aside the yoyo of my emotions, I spent a few minutes but then closed it up and put it aside. I was already near the cutoff for fooling around with the Pertec drive and its power supply issues; this pushed me over the edge. For the time being, both the Pertec and this desoldering station will fester in the corner while I work on other things.


Richard Stofer and I are still battling the Ztex board and Xilinx toolchain, trying to figure out why I am having the problem getting the board to autoboot from the flash. Richard has some ideas which I will investigate, plus I will look at setting up a second Xilinx system at a different revision level to see what happens.

It is all a bit mysterious. A version of the bitfile that he produced and loads successfully on his end will configure the fpga to work but won't autoboot from flash on my system. My version of the bitfile loads and autoboots just fine on his end.

I was downloading the latest version of the Ztex SDK when the power went out.

Wednesday, July 1, 2015

VM/370 up and running on P390, digging into the ztex fpga board issue for SAC Interface box


I found a five pack VM system set up to run with the 4K protect key modification - loaded it up on P390 and brought it up successfully. While I was working on the system and remembering my CP and CMS commands, some very unusual raindrops began falling (unusual to have any rain at all in July). I had to move the monitor, keyboard and mouse inside, necessitating a shutdown.

I went back to the DOS/VS generation but continue to have a problem booting the distribution tape, which should have on it a disk init utility, then the disk restore program and finally the contents of the disk pack being restored. It just won't boot, which causes me to suspect that I don't have a good file of that distribution tape. I say this because I used a similar tape file for the VM process and it worked flawlessly.


I moved my fpga bitfile over to the home deskside computer, which also had the ztex software installed, and loaded the file to the ztex board flash. It reported successful loading but once again wouldn't boot from the saved image. The problem is clearly in the way that Xilinx ISE generates the file. May have to install a second instance of ISE on the deskside machine just to produce bitfiles that work with the Ztex board.

Richard Stofer is also investigating this using my project and bitfile, but on his end using his systems. We are exchanging some test files now to try to zero in on the problems I am having.