Wednesday, August 3, 2022

Bitstream generation and disk modelling verified through simulation


As you remember from prior posts, each bit written on the disk occupies an approximately 1.4 microsecond interval called a bit cell. The bit cell is divided into two halves, each of which can record a pulse by switching the magnetic field direction. No switch means nothing is written and nothing is detected by the read head.

The first half provides the clock and the second half carries the data: a pulse there means the data value is '1', while if it is absent we impute a data value of '0'. There is always a pulse in the clock half of the bit cell; this is how the drive achieves self-clocking. A long train of all-zero words provides nothing but the clock pulses and trains a data separator to recognize which is the clock half and which is the data half, so that the pulses can be separated and sent out on their own clock and data signal lines.
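As a rough illustration of the clock and data halves, here is a minimal Python sketch. The function names are hypothetical, each half of a bit cell is reduced to 1 (pulse) or 0 (no pulse), and the separator is assumed to be already aligned on a clock half, which is what the train of zeroes is there to achieve.

```python
def encode_bit_cells(data_bits):
    """Turn data bits into half cells: clock half first, then data half."""
    half_cells = []
    for bit in data_bits:
        half_cells.append(1)                # clock half always carries a pulse
        half_cells.append(1 if bit else 0)  # data half has a pulse only for a '1'
    return half_cells


def separate(half_cells):
    """Split half cells into (clock, data), assuming we start on a clock half."""
    clock = half_cells[0::2]
    data = half_cells[1::2]
    return clock, data


half_cells = encode_bit_cells([0, 0, 1, 0])
clock, data = separate(half_cells)
print(half_cells)  # [1, 0, 1, 0, 1, 1, 1, 0]
print(data)        # [0, 0, 1, 0]
```

A real data separator works on pulse timing rather than on a tidy list, so this only shows the encoding rule, not how the hardware locks onto it.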

Thus, we need to have a way of training the data separator and that is achieved by a fixed format for the sector. Each sector begins with a sector marker, a pulse that is 160 us long. Having masked every other physical sector marker, we see only four of the eight and that defines the four sectors following the index marker pulse. 

At the falling edge of the sector marker pulse, we begin writing all zeroes, a clock pulse followed by no pulse in each bit cell. This is produced by the IBM 1130 device controller circuits for 250 microseconds, after which a sync word is written. This is a word whose high order bit is 1 (0x8000), with its proper error checking bits. Immediately after this sync word of twenty bit cells is written, the device controller commences writing the 321 words that fill this sector.

Each word of 16 bits from the CPU is augmented with four error checking bits at the end to yield 20 bit cells going onto the disk. The bits stream onto the platter from the low order bit up to the high order bit, then the four check bits are written. There is no delimiter between words; we have only a continual stream of bit cells and depend on the device controller logic to break them into words of 20 bit cells and then data words of 16 bits going to the computer.

From the moment the sector marker pulse trailing edge is seen, we have a stream of about 6,620 bit cells, most of which must be divided into 321 words of 20 bits each. The sync word pattern following the stream of about 180 zero bits allows us to know that the very next bit cell is the low order bit of word 1 and that every 20 bit cells thereafter is the low order bit of the next word, until we have read or written all 321 words.
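The framing rule can be sketched in a few lines of Python. This is only my illustration of the idea, not the controller's actual logic: the function name is hypothetical and the input is assumed to be the already separated data values, one per bit cell.

```python
WORD_CELLS = 20  # 16 data bits plus 4 check bits


def split_into_words(bits, word_count):
    """Find the sync word in a stream of data values and slice out the words."""
    # The preamble and the first 15 bits of the sync word are all zero, so the
    # first '1' seen is the high order bit of the sync word.
    sync_high_bit = bits.index(1)
    start = sync_high_bit + 1 + 4  # skip the sync word's four check bits
    words = []
    for n in range(word_count):
        chunk = bits[start + n * WORD_CELLS:start + (n + 1) * WORD_CELLS]
        if len(chunk) != WORD_CELLS:
            raise ValueError("stream ended before word %d was complete" % (n + 1))
        words.append(chunk)
    return words


preamble = [0] * 180
sync = [0] * 15 + [1, 1, 1, 1, 0]
word_a = [1, 1, 1, 1] + [0] * 16  # four '1' bits, so check bits are 0000
word_b = [0] * 20
words = split_into_words(preamble + sync + word_a + word_b, 2)
print(words[0] == word_a, words[1] == word_b)  # True True
```

Note that the exact length of the preamble does not matter here, only that it is all zeroes, which is why "about 180" is good enough.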

A sector is nominally 10 milliseconds long at the 1500 RPM rotation speed of the disk platter, minus the 160 microsecond duration of the sector marker pulse. Our pattern of zeroes, sync word and 321 data words burns up about 9,268 microseconds of the 9,840 us available in a sector. We need some safety buffer because the disk rotation speed can vary in the real world, the physical slot that produces the sector marker might be inaccurate, the length of the generated sector marker pulse can vary, and the oscillators generating the bit cells can vary a bit.

In an ideal world we could have fit another 20 words into the sector but if the sector marker pulse rises before we have processed the last bits of the sector we generate an overrun error and have to abort the read or write.  
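The arithmetic behind these figures can be checked with a short Python sketch. The constant names are mine, and the 180 cell preamble is the rounded value used in this post rather than an exact count.

```python
BIT_CELL_NS = 1_400     # approximately 1.4 us per bit cell
SECTOR_NS = 10_000_000  # nominal 10 ms per sector at 1500 RPM
MARKER_NS = 160_000     # sector marker pulse
PREAMBLE_CELLS = 180    # roughly 250 us of zero bit cells
WORD_CELLS = 20         # 16 data bits plus 4 check bits
DATA_WORDS = 321

# The extra 1 accounts for the sync word.
total_cells = PREAMBLE_CELLS + WORD_CELLS * (1 + DATA_WORDS)
used_ns = total_cells * BIT_CELL_NS
available_ns = SECTOR_NS - MARKER_NS
slack_ns = available_ns - used_ns
spare_words = slack_ns // (WORD_CELLS * BIT_CELL_NS)

print(total_cells)           # 6620
print(used_ns // 1000)       # 9268
print(available_ns // 1000)  # 9840
print(slack_ns // 1000)      # 572
print(spare_words)           # 20
```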


Upon startup, I wait until the platter has rotated to an index marker, which sets up my logic (and sets up the IBM 1130 device controller logic) to treat the next sector marker as the beginning of sector 0. I block any bit cell generation until I have encountered the index marker.

The sector modelling logic is triggered by the sector marker, beginning to write bit cells of data value '0' after the fall of the sector marker pulse and continuing for 250 microseconds. It then writes the 20 bit sync word pattern B00000000000000011110 which is x8000 with the proper error checking bits.

I then write successive words as 20 bit cells, counting the words as I go. Each time a new word starts, a read request is pushed into the FIFO queue for the RAM with the address corresponding to the cylinder where the arm is sitting, the head selected, the current sector number, and the word number.
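One way such an address could be formed is a simple linear packing, sketched below in Python. The actual address layout in my design is not described in this post, so treat the function name, the packing order, and the geometry constants (203 cylinders and 2 heads) as assumptions for illustration; only the 4 sectors and 321 words per sector come from the text above.

```python
# Assumed geometry for illustration; see the caveats above.
CYLINDERS = 203
HEADS = 2
SECTORS = 4
WORDS_PER_SECTOR = 321


def ram_word_address(cylinder, head, sector, word):
    """Map (cylinder, head, sector, word) to a linear word index (hypothetical layout)."""
    if not 0 <= cylinder < CYLINDERS:
        raise ValueError("cylinder out of range")
    if not 0 <= head < HEADS:
        raise ValueError("head out of range")
    if not 0 <= sector < SECTORS:
        raise ValueError("sector out of range")
    if not 0 <= word < WORDS_PER_SECTOR:
        raise ValueError("word out of range")
    return ((cylinder * HEADS + head) * SECTORS + sector) * WORDS_PER_SECTOR + word


print(ram_word_address(0, 0, 1, 0))  # 321
print(ram_word_address(1, 0, 0, 0))  # 2568
```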

I use the returned word from the RAM response FIFO to shift out the 16 data bits, bit position 15 first and continuing leftward until we get to bit position 0 at the high order end. As each bit is written, any '1' value bits are added to a running counter. When we have finished with the 16 bits of the data word we produce the appropriate four check bits based on the running counter value. 

As previously mentioned, the purpose of the four final bits is to detect errors. The more obvious way it does this is by sending 1 bits until the total of all 1 valued bit cells is evenly divisible by four. Depending on the number of '1' bits in the data word itself, there can be 0, 1, 2 or 3 additional '1' bits that must be written. The device controller verifies that the full 20 bits read in have '1' bits that are 0 modulo 4 and throws up an error if this isn't true.
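Here is a minimal Python sketch of the rule, using a running count of '1' bits and padding with '1' check bits up to a multiple of four. The function names are hypothetical and this models the bit order only, not my FPGA logic.

```python
def check_bits(ones_in_data):
    """Four check bits: enough leading '1's to reach a multiple of four, then '0's."""
    extra_ones = (-ones_in_data) % 4  # 0, 1, 2 or 3
    return [1] * extra_ones + [0] * (4 - extra_ones)


def encode_word(word):
    """Return the 20 bit cell values for a 16 bit word, low order bit first."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("word must fit in 16 bits")
    cells = []
    ones = 0  # running counter of '1' bits written so far
    for position in range(16):
        bit = (word >> position) & 1
        ones += bit
        cells.append(bit)
    return cells + check_bits(ones)


sync = "".join(str(c) for c in encode_word(0x8000))
print(sync)  # 00000000000000011110
```

The output for 0x8000 matches the sync word pattern given earlier in this post.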

The second and less obvious error check, not implemented in the IBM 1130 but made possible by this scheme, is to validate that the last bit written is a '0' value. It must always be '0' because the controller of the IBM 1130 writes either B0000, B1000, B1100 or B1110 to make the 1 bits a multiple of four. Three bits are sufficient to accomplish this, thus the fixed fourth bit of 0 is another kind of error checking and may play a role in ensuring that the data separator remains able to distinguish which pulses are clock and which are a 1 data value.
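Both checks are easy to express in Python. This is a sketch with a hypothetical function name that takes the 20 bit cell values of one word and reports each check separately, so you can see a case that the modulo 4 test passes but the final bit test catches.

```python
def verify_cells(cells):
    """Return (mod4_ok, final_zero_ok) for the 20 bit cell values of one word."""
    if len(cells) != 20:
        raise ValueError("expected 20 bit cells")
    mod4_ok = sum(cells) % 4 == 0    # the check the device controller performs
    final_zero_ok = cells[-1] == 0   # the extra check the scheme makes possible
    return mod4_ok, final_zero_ok


good = [0] * 15 + [1, 1, 1, 1, 0]      # the sync word
shifted = [0] * 15 + [1, 0, 1, 1, 1]   # still four '1' bits, but the last is '1'
print(verify_cells(good))     # (True, True)
print(verify_cells(shifted))  # (True, False)
```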

At the end of the day, I had simulator runs showing me that the stream of bits was produced exactly to this schema and at a realistic timing that matches what a real world cartridge would produce through the head. I had a signal whose level varied between +3V and 0 to feed to the disk drive electronics at a specific point.

The drive spots magnetic flux reversals, converts them to one polarity regardless of the way the flux swings, and from that produces a pulse for each reversal. The pulse is turned into a transition of a logic signal from 1 down to 0 for the length of the pulse. Thus, my signal stream is interpreted at this point as a pulse for as long as the level is 0 and as the absence of a pulse for as long as it stays up at +3V.

The duration of my pulses is set to 0.4 microseconds, which fits within the 0.7 us bit cell half and leaves enough separation between the timing of the clock and the data pulses for them to be properly separated.
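To make the timing concrete, here is a Python sketch that produces the signal level for one bit cell, one entry per cycle of the 50MHz logic clock mentioned later in this post. The assumption that each pulse starts at the very beginning of its half is mine, made only for illustration, and the names are hypothetical.

```python
CYCLE_NS = 20                       # one cycle of the 50MHz logic clock
HALF_CELL_CYCLES = 700 // CYCLE_NS  # 35 cycles per bit cell half
PULSE_CYCLES = 400 // CYCLE_NS      # 20 cycles per pulse


def bit_cell_levels(bit):
    """Levels for one bit cell: 1 = idle at +3V, 0 = pulse (active low)."""
    pulse = [0] * PULSE_CYCLES + [1] * (HALF_CELL_CYCLES - PULSE_CYCLES)
    idle = [1] * HALF_CELL_CYCLES
    return pulse + (pulse if bit else idle)


one = bit_cell_levels(1)
gap_cycles = one.index(0, PULSE_CYCLES) - PULSE_CYCLES
print(len(one))               # 70
print(gap_cycles * CYCLE_NS)  # 300
```

Under that assumption there are 300 nanoseconds of idle level between the end of a clock pulse and the start of a data pulse.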

There is an esoteric effect on disk drives where the timing of the pulse detected 'shifts' based on surrounding flux reversals. The separator has to accommodate this time shifting without errors. I don't think I have to shift my own pulses but would be prepared to add this in if it becomes necessary.


Sector spanning view
The section above shows more than one sector, so that you see the sector number change and the bottom stream of pulses that are the 6,620 bit cells produced for the sector. The top line is the sector number, the next down is the word counter. The two lines between the pulse stream and the word count are the state machines involved in sector modeling and bit cell generation. 

Beginning of a sector 
I have zoomed in a bit to show you some detail in the state machine and word counter values. The pulse stream also begins to show distinct patterns as the data bit values change.

Bit cells visible

This final screen shows more detail so that the 20 bit cells of the sync word and the bit cells of the data words can be discerned. For this testing I produced a fixed word value of x5AA7 for each of the data words, which you can verify by decoding the bit cells and validating that the check bits are correct.
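If you would rather not decode by eye, a small Python sketch can do it. The function name is hypothetical, and the input string is what the rules in this post give for x5AA7: the 16 data bits low order first, followed by the check bits 1110 since the word has nine '1' bits.

```python
def decode_cells(cell_string):
    """Decode 20 bit cell values ('0'/'1' characters) into (word, check_ok)."""
    cells = [int(c) for c in cell_string]
    if len(cells) != 20:
        raise ValueError("expected 20 bit cells")
    word = 0
    for position, bit in enumerate(cells[:16]):  # low order bit arrives first
        word |= bit << position
    check_ok = sum(cells) % 4 == 0 and cells[19] == 0
    return word, check_ok


word, check_ok = decode_cells("11100101010110101110")
print(hex(word), check_ok)  # 0x5aa7 True
```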


All of the above depends on the RAM having returned the proper word in time for my bit generation circuitry to turn it into bit cells. The request for the word is generated just before we begin writing the clock pulse of the bit cell. We have 35 FPGA cycles for that half of the bit cell and then 8 cycles into the next half before the data value must be present. 

The RAM itself will return the data word in about four FPGA cycles, but we also have to traverse a FIFO for the request and then a FIFO to pass back the answer - these take a few cycles each. On paper I have enough time since the RAM is doing nothing except serving up our word read requests. 
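The paper budget can be written down as a quick Python calculation. The 35, 8 and 4 cycle figures are the ones given above; the FIFO crossing times are only described as "a few cycles each", so the value of 6 cycles per FIFO used in the example is an assumption, as is the function name.

```python
CLOCK_HALF_CYCLES = 35  # clock half of the bit cell
DATA_HALF_MARGIN = 8    # cycles into the data half before the value is needed
RAM_READ_CYCLES = 4     # approximate RAM response time


def latency_slack(request_fifo_cycles, response_fifo_cycles):
    """Cycles left over after the request, RAM read and response complete."""
    budget = CLOCK_HALF_CYCLES + DATA_HALF_MARGIN
    needed = request_fifo_cycles + RAM_READ_CYCLES + response_fifo_cycles
    return budget - needed


print(latency_slack(6, 6))  # 27, using an assumed 6 cycles per FIFO crossing
```

Even with pessimistic guesses for the FIFO latency there is a healthy margin, but the simulation is what will confirm it.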

The reason for the FIFOs, by the way, is to deal with the different clocks involved. The DDR3 RAM operates with 100MHz and 200MHz clocks, while my logic is running at 50MHz and that is not exactly in phase with the clocks of the RAM. 

Each FIFO acts as a buffer between the two clock domains, one on each side of the FIFO. It will have either zero or one item in the queue at any time, not really storing up a queue of requests. There is one FIFO from 50MHz to 100MHz for requests and a second FIFO for responses from 100MHz back to 50MHz.

I won't have real data in the RAM during the simulation, but I can validate that all the triggering takes place. My sector modeling logic must trigger a request into the request FIFO before each word, the RAM side must see the request and pull it off successfully, and the data must be set up to read the RAM.

I then have to watch the state machine driving the RAM to see if it seems to toggle the control, address and data lines at the proper times. Assuming that is good, then when I see the signal from RAM that data is valid, I must see the data pushed into the response FIFO and my logic pull it out on the far end to put it into the word buffer for the bit generation circuitry. 

Once I get through all this checking, then if the data is able to be written into the RAM prior to disk operation, I will have some confidence that the disk drive electronics will be seeing the right stream of pulses to turn that into 321 words for the CPU. 
