Rescue 1130: 2014 Pickup of an IBM 1130 System and More: Contemplating replacing Diablo data separator circuit with fpga based phase locked loop

ALTO DISK TOOL

I developed a watchdog timer in the fpga that will emit a signal if the time between clock pulses is 700ns or longer - this will help me flag spots where the Diablo drive is misidentifying a 0 bit with late clock as a 1 bit and missing clock. I can then reliably trigger both logic analyzer and oscilloscope to see the behavior of interest.

Data is recorded on the disk with a non-return-to-zero encoding (also called frequency modulation which evolved to MFM for the first PC disks), where bit cells of 600 ns are written by reversing the magnetic flux once or twice. At the beginning of each bit cell, the flux is always flipped, which is the clock. In the middle of the cell, if the bit in question is to be a 1, another flux transition occurs, whereas a 0 bit value does not cause a reversal.

This is why disk records start with a long string of zero bits, to help the reading circuitry to synchronize on the clock bit as the start of a bit cell. The clock bits are the only transitions when the data values are all zero.

The start of meaningful data is signalled by a special non-zero pattern, which for the Diablo is a single bit cell with a data value of 1. The reading circuits pass through the flux reversal as a clock pulse and then flip the logic for less than 500 ns so that any reversal in that interval will be passed as a data bit value. This window of time where a flux reversal is a data bit is how the circuit separates clock from data pulses.

I have found spots on the disk where a record gets a checksum error, at least on some read attempts, because the clock pulse is misreported as a data pulse. Specifically, the bit cell is meant to contain a value of 0, so that the flux reversals are the clock at the start and end. In this case, the next reversal which is intended to be the clock is passed along as a data value of 1, because the timer window is still open. The clock pulse arrived a bit early.

Now, when the clock pulse following a 0 value bit cell is misrouted as a 1 data value, and the next bit cell is also a data value of 0, then we will have a longer than usual gap until the next flux reversal. The two 0 bit cells are collapsed together as a single 1 bit.

Out of the data separator circuit, we see the clock pulses skip a beat, with 1200 ns between pulses at this point. The data pulse is also wrong and more sinisterly, the serial train of data bits is compressed with two lost 0 bits reported as only a single 1 bit. This misaligns all the remaining data bits and checksum for the remainder of the record.

It is possible that I could detect and correct for this error, but it would be complex. I would need to establish the bit rate, adjust to stay synchronized with the clock transitions and then look for malformations. One such would be a skipped clock pulse coupled with a 1 data value. I could morph that into a pair of clock pulses representing the two data bits of 0.

I would need to look carefully at all the possible cases - 0 bit cell where this happens followed by a 1 bit cell, as well as other cases where clock and data are misrouted by the data separator circuitry. In essence, I will combine these to produce the flux transitions and separate them myself using more sophistication than a simple timing window.

I can also imagine errors where the pulse is split between data and clock, when the timing window elapses somewhere in the middle of a flux reversal pulse. If I combine the ReadClock and ReadData signals myself to form the original pulse train before the separator does its task, I can bypass the source of these errors and accomplish the detection of bit values correctly.

I need a second order digital phase locked loop to recover the clock properly, then use that to indicate bit cell boundaries and watch for the data 1 pulses. The incoming signal is messy, having timing jitter due to rotational variations on both write and read passes plus bit peak shifting. It may also suffer from minor variations in the media surface and from oscillations of the head heighth.

Beginning with single density floppy disks and PC era hard drives, the written signal was 'pre-compensated' to minimize the shifting of pulses upon reading, but no such technique was used with the Diablo drives.

Floppy disk data separator chips were available but are intended to support the much lower bit rates of the original SD and DD floppy drives, 250Kbps to 500Kbps whereas the Diablo is streaming data bits are 1.67 Mbps. The phase locked loops available on the Spartan 3E fpga chips won't operate at so low a frequency, requiring minimum frequencies in the range of 10Mhz.

Thus, I have to work out a different method. Opencores has a DPLL although I am not certain how good it will be at handling disk signals.The goal is to generate a train of clock outputs that is synchronized to the time averaged clock pulses observed from the head, then use that to cleanly separate transitions that are far enough from the clock to be considered a data bit value of 1.

I will implement this in the fpga, digitally mix the ReadData and ReadClock as an input, create the scaffolding for the data separator based on this, then compare the results of this versus the Diablo separator circuit. This is going to be complex design work.

Rescue 1130: 2014 Pickup of an IBM 1130 System and More

Friday, December 9, 2016

Contemplating replacing Diablo data separator circuit with fpga based phase locked loop

No comments:

Post a Comment