Friday, November 4, 2016

Possible situation causing misreads


I spent the day poking at various locations, observing the signals and trying to find a cause for intermittent checksum failures at these sectors. Nothing jumped out at me but I will keep looking for a clue that is reproducible and can be clearly tracked to a cause.

My working hypothesis is a shift in the clock timing. There is a specific point where the clock pulses don't arrive 600 ns apart, right before two data bits that sometimes clock in as 1100 and sometimes as 0110, meaning I am sometimes skipping ahead a clock cell. The latter pattern, 0110, where the logic sees an extra 0 value that doesn't exist, is when the checksum error occurs.
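One way to hunt for the irregular spot in a logic analyzer export is to scan the clock pulse timestamps for gaps that stray from the nominal 600 ns. The following is only a rough sketch of that idea: the function name, the list-of-timestamps input format, and the 50 ns tolerance are all assumptions made for illustration, not anything taken from my actual capture setup.

```python
NOMINAL_NS = 600  # clock pulse spacing quoted above


def find_irregular_intervals(timestamps_ns, nominal_ns=NOMINAL_NS, tolerance_ns=50):
    """Return (index, interval) for every gap between consecutive clock
    pulses that differs from nominal_ns by more than tolerance_ns.

    timestamps_ns is assumed to be a sorted list of pulse arrival times in
    nanoseconds; tolerance_ns is an arbitrary illustrative threshold.
    """
    irregular = []
    for i in range(1, len(timestamps_ns)):
        interval = timestamps_ns[i] - timestamps_ns[i - 1]
        if abs(interval - nominal_ns) > tolerance_ns:
            # index i is the pulse that arrived early or late
            irregular.append((i, interval))
    return irregular


if __name__ == "__main__":
    # Made-up timestamps: the fourth pulse arrives 100 ns late.
    pulses = [0, 600, 1200, 1900, 2500]
    print(find_irregular_intervals(pulses))
```

Any pulse this flags is a candidate for the point where the bit cell count goes wrong, which can then be lined up against the captured data bits.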

Sometimes this is captured as 01101100 and sometimes 01100110
Clock bits where this happens are irregularly spaced
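The two captured patterns can be checked against the "extra 0" idea directly. The sketch below finds the first bit where two captures diverge and tests whether the bad capture is simply the good one with a single 0 inserted at that point, pushing the last bit out of the window. The helper names are hypothetical and the captures are treated as plain strings of '0' and '1'.

```python
def first_divergence(a, b):
    """Return the index of the first differing bit, or None if the
    compared portion of the two captures is identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None


def is_extra_zero_slip(good, bad):
    """True if 'bad' equals 'good' with one spurious '0' inserted at the
    first point of divergence and the final bit shifted out."""
    if len(good) != len(bad):
        return False
    i = first_divergence(good, bad)
    if i is None:
        return False
    return bad == good[:i] + "0" + good[i:-1]


if __name__ == "__main__":
    good = "01101100"  # capture ending in 1100
    bad = "01100110"   # capture ending in 0110, the checksum-error case
    print(first_divergence(good, bad))
    print(is_extra_zero_slip(good, bad))
```

For these two patterns the captures agree for the first four bits and then differ in a way consistent with one inserted 0, which fits the skipped clock cell hypothesis.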

Therefore my logic to recognize clock pulses, set up the data bit values, and push them into the deserializer has some weakness that is triggered by the slightly deformed timing of clock pulses. I will have to stare at the state machine, change the data being emitted to the logic analyzer, and test some more to spot where it goes awry.

Looking at the two sides of the differential amplifier decoding the head signals, we see the following:

one side of differential amplifier

other side of differential amplifier
The signals from the differential amplifier look well formed and about the same, except for a bit of bias (notice the one bit at left is rising peak to peak in the bottom scan, but is flatter peak to peak in the top scan). I can't do anything with this observation (yet) but am saving it just in case it becomes relevant later.

If the decision to count extra bit cells is somehow due to a weak process in my FPGA logic, then I could improve reading quality by redesigning this part of the hardware. I will look into a different way to handle the incoming ReadClock and ReadData pulses that might avoid this problem.
