Saturday, July 25, 2015

Seeking okay, now verifying reading functionality of Pertec D3422 drive

PERTEC D3422 DRIVE RESTORATION

I began to suspect that my problem had more to do with when I deasserted the Seek Strobe, relative to the address lines changing, since the drive acts on the trailing edge of the strobe signal. I made some changes to the FSM based on that suspicion and gave it a quick test last night.

Video of the drive under control of the FPGA testing controller - watch to see it spin up, then seek out and back.

The mechanism was now seeking out and in with the half-range (out 203 cylinders and then back to home) which corresponds to a 2310 disk drive capacity. This morning I ran some more tests of different seek amounts including looping on 203 and 406 cylinder long seeks, and some very short seek loops. It all worked well - jumping between 0 and 405, small range seeks, looping seeks. I also validated that I get the Address Interlock signal (invalid seek address) with an address beyond 405.

I invested my morning in writing the initial code to handle reading from the disk. The data is encoded with a self-clocking scheme similar to that used on tape drives, where gaps between sectors and fields are uniformly magnetized but all data is recorded with a clock at double the data rate. My extractor circuit signals me when each bit is available and presents its value.

The self-clocking scheme means that every bit, whether it is zero or one, starts with a flip of the magnetic field to mark the clock, followed by a second flip only if the data is a logical 1. If the data is zero, the second time interval has no flip. Each pair of intervals, the clock pulse and then the data-dependent second pulse, is called a bit cell.
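A minimal Python sketch of the bit cell idea, assuming each cell can be modeled as a pair of flags (flip in the first interval, flip in the second interval). The function names and the tuple representation are hypothetical, chosen only for illustration; the drive itself deals in flux transitions, not data structures.

```python
def encode_bit_cells(bits):
    """Return one (clock_flip, data_flip) pair per bit cell.

    True means the magnetic field flips during that interval of the cell.
    This is an illustrative model, not the drive's actual write logic.
    """
    cells = []
    for bit in bits:
        if bit not in (0, 1):
            raise ValueError("bits must be 0 or 1")
        # The first interval always carries the clock flip; the second
        # interval carries a flip only when the data bit is a logical 1.
        cells.append((True, bit == 1))
    return cells


def count_flips(cells):
    """Count the total number of field flips in a list of bit cells."""
    return sum(int(clock) + int(data) for clock, data in cells)


if __name__ == "__main__":
    cells = encode_bit_cells([0, 1, 1, 0])
    print(cells)
    print(count_flips(cells))
```

A run of zero bits produces only clock flips, which is why a preamble of zeros gives the read electronics a clean, regular pulse train to lock onto.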

Each recorded area on a disk begins with a preamble of a couple of hundred zero bits. This is a pattern with the clock pulse and the absence of the second data pulse. This allows the receiver electronics to synchronize with the pulses and know whether a pulse it sees is the clock or the data pulse.

These 200 bits allow the electronics in the drive to get synchronized so that they can split out the clock signal and the data signal. My FPGA receives a steady, uniform set of pulses on the Read Clock line, and if there is a pulse on the Read Data line during the second half of a clock cycle, we record the value as a logical 1. If there is no pulse from Read Data in the second half of a clock cycle, we read this as a logical 0.
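The extraction rule just described can be sketched in Python as follows. This is a simplified software model, not the FPGA logic: it assumes the pulses arrive as a time-ordered sequence of events, with 'C' standing for a Read Clock pulse and 'D' for a Read Data pulse, and both the event encoding and the function name are hypothetical.

```python
def extract_bits(events):
    """Convert a time-ordered sequence of pulse events into bit values.

    'C' is a Read Clock pulse and opens a new bit cell.
    'D' is a Read Data pulse seen before the next clock, making the cell a 1.
    A cell with no 'D' before the next 'C' is a 0.
    """
    bits = []
    in_cell = False
    saw_data = False
    for event in events:
        if event == "C":
            if in_cell:
                # The previous cell ends when the next clock arrives.
                bits.append(1 if saw_data else 0)
            in_cell = True
            saw_data = False
        elif event == "D":
            if in_cell:
                saw_data = True
            # A data pulse before any clock is ignored in this model.
        else:
            raise ValueError("unknown event: %r" % (event,))
    if in_cell:
        # Close the final cell at the end of the input.
        bits.append(1 if saw_data else 0)
    return bits


if __name__ == "__main__":
    print(extract_bits("CCDCDC"))
```

In this model the string "CCDCDC" contains four clock pulses, two of them followed by a data pulse, so it decodes to the bits 0, 1, 1, 0.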

Following the string of zero bits is a specific pattern that is recognized by the controller, which up to that point is synchronized only on bit cells and clocks, not on byte or word boundaries. The specific pattern defines which bit cell is the beginning of each byte/word of data. Following the special pattern is whatever data format was recorded, ending with a 16-bit Cyclic Redundancy Check for error checking and another specific pattern marking the end of the record.

The Pertec drives have an initial set of data that records the cylinder, head and sector number of this record. It should certainly match the cylinder to which we did a seek and the platter, head and sector from which we are reading. After the CRC for the header, there is an erased gap, another 200 bits for clock synchronizing, another specified pattern for byte synchronizing, 256 data bytes, then the CRC and end character. This happens 24 times around the track.

IBM 2310 drives don't have a formally separate header. They use the same preamble of 200 zero bits and a specific pattern for byte boundary synchronizing, but then have 321 16-bit words in a monolithic sector, capped with a CRC field and end character. By convention the first of the 321 words contains the cylinder/head/sector information, as a relative sector from the start of the pack, leaving the remaining 320 words of the sector for data. The 2310 drive only provided four sectors around a track.
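To put the two layouts side by side, here is a small Python calculation of the data payload per track, using only the figures quoted above. It deliberately ignores preambles, sync patterns, headers, CRCs, end characters and gaps, and the constant and function names are made up for this sketch.

```python
# Figures taken from the text; only the data payload is counted.
PERTEC_SECTORS_PER_TRACK = 24
PERTEC_DATA_BYTES_PER_SECTOR = 256

IBM2310_SECTORS_PER_TRACK = 4
IBM2310_WORDS_PER_SECTOR = 321   # includes the sector address word
IBM2310_ADDRESS_WORDS = 1        # first word, by convention
BYTES_PER_WORD = 2               # 16-bit words


def pertec_data_bytes_per_track():
    """Data bytes per track in the Pertec format described above."""
    return PERTEC_SECTORS_PER_TRACK * PERTEC_DATA_BYTES_PER_SECTOR


def ibm2310_data_bytes_per_track():
    """Data bytes per track in the 2310 format, excluding address words."""
    data_words = IBM2310_WORDS_PER_SECTOR - IBM2310_ADDRESS_WORDS
    return IBM2310_SECTORS_PER_TRACK * data_words * BYTES_PER_WORD


if __name__ == "__main__":
    print("Pertec:", pertec_data_bytes_per_track(), "data bytes per track")
    print("2310:  ", ibm2310_data_bytes_per_track(), "data bytes per track")
```

With these figures the Pertec format comes to 6144 data bytes per track and the 2310 format to 2560.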

I began my logic design with the logic to convert the Read Data and Read Clock inputs into a stream of serial bit values (extractor circuit). Another bit of logic recognizes the byte boundary synchronizing character (synchronizer circuit). A third bit of logic then takes a serial stream of bits and turns it into parallel bytes (assembler circuit). A higher level state machine would move from gap (idle) state to preamble to byte sync and then store data bytes in a memory on the board, but I haven't designed that yet.

When I tried to test this I quickly ran into a problem: unless I was immensely lucky and switched on the read enable just before or in the midst of a preamble, I would run into a character that was neither 0x00 nor 0xFF, triggering a sync error. This leads to two changes - first, I need to recover from sync errors rather than lock in the status, and second, I need to synchronize the start of the sync operation with the beginning of a sector.

Initially, I set up the controller to look for sector zero and read that - index pulse triggers the FSM and then I enable the read electronics, begin synchronizing and let it rip. The synchronizing logic looks for a string of at least 100 zero values, then eight sequential 1 bits, after which it should be assembling bits into bytes and flagging their availability.
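The synchronizing rule (at least 100 zeros, then eight 1 bits, then bytes) can be modeled in software roughly as below. This is a sketch under stated assumptions, not the FPGA design: the state names and function names are hypothetical, bytes are assumed to arrive most significant bit first, and a broken sync pattern simply sends the logic back to hunting for a preamble rather than latching an error.

```python
PREAMBLE_MIN_ZEROS = 100   # minimum run of zero bits accepted as a preamble
SYNC_ONES = 8              # eight sequential 1 bits mark the byte boundary


def to_bits(byte):
    """Split a byte into bits, most significant bit first (an assumption)."""
    return [(byte >> shift) & 1 for shift in range(7, -1, -1)]


def assemble_bytes(bits):
    """Hunt for the preamble and sync pattern, then assemble data bytes."""
    state = "HUNT"
    zeros = 0
    ones = 0
    shift_reg = 0
    nbits = 0
    out = []
    for bit in bits:
        if state == "HUNT":
            if bit == 0:
                zeros += 1
            elif zeros >= PREAMBLE_MIN_ZEROS:
                # First 1 bit after a long enough preamble starts the sync.
                state = "SYNC"
                ones = 1
            else:
                # A 1 bit after too short a run of zeros: start over.
                zeros = 0
        elif state == "SYNC":
            if bit == 1:
                ones += 1
                if ones == SYNC_ONES:
                    state = "DATA"
            else:
                # Broken sync pattern: recover by hunting again, counting
                # this zero as the first bit of a possible new preamble.
                state = "HUNT"
                zeros = 1
        else:  # DATA: shift bits in and emit a byte every eight bits
            shift_reg = (shift_reg << 1) | bit
            nbits += 1
            if nbits == 8:
                out.append(shift_reg)
                shift_reg = 0
                nbits = 0
    return out


if __name__ == "__main__":
    stream = [0] * 120 + [1] * 8 + to_bits(0x12) + to_bits(0x34)
    print([hex(b) for b in assemble_bytes(stream)])
```

Feeding it a stream that starts partway through a sector, with fewer than 100 zeros before the sync pattern, yields no bytes at all, which mirrors why the read has to be started relative to the beginning of a sector.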

When I see the sector counter change from zero, I turn off the read electronics, which stops the flow of clock and data pulses. For now I don't check the CRC or handle the gap and data record. I only care about the header and its confirmation that I am reading the proper sectors.

In my testing, I discovered that the +5V Timer Board feature will power down the drive after 6 minutes, whether or not I am doing seeks or reads. I needed to figure out what signal states are needed to keep this from happening. It turns out that the normal practice (drop the start/stop line to activate the drive, hold it for a millisecond or so, then return it to high) allows the timer board to count down and shut the drive off.

My logic now has to drop the start/stop line, keep it low, then if I push the button again to shut down, it will have to return the line to high for an interval, pulse it down, then leave it high. It adds a few stages to the FSM. After testing, I still have problems. If I hold the line low after the drive starts, it locks out the button on the front of the drive. If I leave the line high, the timer board shuts things down in six minutes. This is being punted until tomorrow, as it is lower priority than completing the read testing.

I can see my data recovery working, but something is going wrong on the path to assembled bytes. To debug, I routed several of the key signals out where I could watch them on the scope. My extractor logic is working perfectly - when I turn on reading, the data values begin to fly across the wire.

I could see that my method of detecting the start of sector is inadequate, thus I had to improve it in order to definitively locate and start reading at sector zero. I thought about it for a while and updated the FSM accordingly.

Now I am clearly oriented to the beginning of sector zero. I see the bits being extracted, a run of zeroes followed by an all-ones byte (0xFF), which should kick off my byte assembler but doesn't. The synchronizer logic is where the fault lies, so I will ponder that a while.

My seek loop would stall after many repetitions - the fault appears to be in my testing FSM which is waiting for busy to blip up and down. I suspect that some conditions occur where the rise has already happened before I test for it. 
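To illustrate the suspected race, here is a small Python model comparing a level-based wait with one possible remedy: an edge latch that is armed before the seek is issued and clocked on every sample. This is only a sketch of the general idea, not the fix actually applied; the class name, function names and the sample trace are all hypothetical.

```python
class BusyPulseLatch:
    """Remembers a complete busy pulse (rise, then fall) since arm()."""

    def __init__(self):
        self.prev = 0
        self.rose = False
        self.fell = False

    def arm(self, busy_now=0):
        """Clear the latch; call this before issuing the seek."""
        self.prev = busy_now
        self.rose = False
        self.fell = False

    def clock(self, busy):
        """Sample busy once per clock and record any edges."""
        if busy and not self.prev:
            self.rose = True
        elif self.prev and not busy and self.rose:
            self.fell = True
        self.prev = busy

    def pulse_seen(self):
        return self.rose and self.fell


def level_wait_completes(trace, first_look):
    """Level-based wait that only starts looking at index first_look:
    wait for busy high, then for busy low. False means it would stall."""
    waiting_for_high = True
    for busy in trace[first_look:]:
        if waiting_for_high:
            if busy:
                waiting_for_high = False
        elif not busy:
            return True
    return False


def latched_wait_completes(trace):
    """Edge latch armed up front and clocked on every sample."""
    latch = BusyPulseLatch()
    latch.arm(busy_now=0)
    for busy in trace:
        latch.clock(busy)
    return latch.pulse_seen()


if __name__ == "__main__":
    trace = [0, 1, 1, 0, 0, 0]   # busy blips up and down early
    print(level_wait_completes(trace, first_look=0))
    print(level_wait_completes(trace, first_look=3))
    print(latched_wait_completes(trace))
```

In this model the level-based wait succeeds when it starts looking before the pulse, stalls when it starts looking after the pulse has come and gone, and the latched version sees the pulse regardless of when the FSM gets around to checking.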
