Saturday, October 15, 2016

Read sector functionality of the Alto-Diablo disk tool is working well, minor issues to address overall

Morning spent as Volunteer Examiner administering ham radio license tests, back to tool in the afternoon.

ALTO DISK TOOL

I came up with a new scheme for the deserializer operation during idle time at the exam session, then implemented it when I got back to my house. It has separated the bit timing into discrete state machines.

One FSM is tracking the timing of the Read Clock pulses, another will look for a '1' data bit value on Read Data only at the appropriate portion of the Read Clock cycle, and the third will handle recognition of the sync word 0000000000000001 when beginning to read any of the three records in a sector.
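The sync and deserialize behavior described above can be modeled in software. This is only a minimal Python sketch, not the FPGA logic itself: it assumes one bit value per Read Clock cycle, MSB-first word order, and a requirement of at least 15 zero bits before the terminating 1. The names `deserialize`, `to_bits` and `MIN_ZEROS` are hypothetical.

```python
WORD_BITS = 16
MIN_ZEROS = 15  # assumed: zeros required before the 1 that ends the sync word


def to_bits(word):
    """Expand a 16-bit word into a list of bits, MSB first (assumed order)."""
    return [(word >> i) & 1 for i in range(WORD_BITS - 1, -1, -1)]


def deserialize(bits, word_count, min_zeros=MIN_ZEROS):
    """Return up to word_count words that follow the first valid sync pattern."""
    zeros_seen = 0
    synced = False
    shift = 0
    nbits = 0
    words = []
    for bit in bits:
        if not synced:
            if bit == 0:
                zeros_seen += 1
            elif zeros_seen >= min_zeros:
                synced = True       # the 1 bit completes the sync word
            else:
                zeros_seen = 0      # stray 1 before enough zeros: start over
            continue
        # Synced: shift bits into the word register.
        shift = ((shift << 1) | bit) & 0xFFFF
        nbits += 1
        if nbits == WORD_BITS:
            words.append(shift)
            shift = 0
            nbits = 0
            if len(words) == word_count:
                break
    return words
```

Feeding it a stream such as `[0] * 20 + [1] + to_bits(0x1234)` with `word_count=1` yields the single word `0x1234`; a stream with too few leading zeros never syncs and yields an empty list.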

I trigger the deserializer to collect bit values with some combinatorial logic that looks at the timing in a Read Clock cycle and the sync or unsynced state. This should tie into the deserializer module itself without requiring a rewrite.

Another thing I had suspected from the data recorded on the disk is that the Head selection signal is inverted (logically) compared to what the software requests. That is, when the disk interface signal is on (0V level), the top head is selected.

This would normally represent a '1' on the Head line, but an extra inverter is introduced on the Alto disk controller card (via an Intel 3404 latch). The software therefore uses a '0' to drive a '1' on the Head line, which drops the line to 0V and makes the drive use the top head. I have to invert my logic in the disk tool to match this usage.

After the new logic was synthesized into a bitstream, I fired up the testbed and read a sector again. The logic seems to be working better, but I have a timing problem between when I sync and when I record bits for deserialization.

I added an 'armed' stop to the sync state FSM as a way of blocking that bad behavior. I ran again and did indeed sync properly and deserialize the data words properly. The flaw I am currently working on is that the ReadSector FSM is jumping to read the middle record, the eight word label record, rather than knowing that it is handling the two word header record.

This flaw causes it to read the checksum word correctly but it is not recognizing it as the checksum nor doing the comparison for validity. I have to debug this condition but feel that the basics of extracting parallel words from records is in good shape.

What I saw was the logic trying to sync up at less than 80 us into the sector, far too early. It should be obeying a preamble delay of 201.6 us before looking at any incoming data bits. I have to correct this. I do see the proper sync word at around the nominal 336 us into the sector.

I found the problem. Doh. For some reason, when I calculated the number of FPGA cycles needed to produce given durations, I was using 50ns as the cycle time, but the FPGA is running with a 20ns cycle. That means my delay counts need to be 2 1/2 times as big to produce the intended delay. I made the corrections, invested the half hour in producing a bitstream, and tested again.
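The arithmetic behind that mistake is easy to reproduce. Here is a small Python sketch using the 201.6 us preamble figure from above; the function name `cycles_for` is hypothetical and the tool itself holds these values as counts in FPGA logic.

```python
def cycles_for(delay_us, clock_ns):
    """Number of clock cycles needed to span delay_us at a given cycle time."""
    return round(delay_us * 1000 / clock_ns)


PREAMBLE_US = 201.6

wrong_count = cycles_for(PREAMBLE_US, 50)   # count computed with the mistaken 50ns cycle
right_count = cycles_for(PREAMBLE_US, 20)   # count needed at the real 20ns cycle

# Delay actually produced when the 50ns-based count runs on a 20ns clock.
actual_delay_us = wrong_count * 20 / 1000

print(wrong_count, right_count, actual_delay_us)
```

With these inputs the counts come out as 4032 and 10080, and the mistaken count expires after about 80.6 us, which is in the same neighborhood as the too-early sync attempt noted above.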

The first time I read a sector after the disk tool is reset, it appears to read the entire sector properly with no checksum errors. I verified that the first two records were properly read and their checksum matched what was read from disk, but I didn't go all the way to the end of the sector to check the tail of the long data record.

However, each subsequent try seemed to ignore the work I did on the long preamble, instead syncing on the earliest 1 bit it found regardless of the preamble. I have some defect in how one or more state machines resets when it completes one cycle.

I had made great strides by dinner time, with a few bugs left to iron out and some steps left to check but significant parts of the functionality working properly.

After dinner, I looked over logic and set up various diagnostic signals to help me sort out what was happening. I discovered the counter was too small inside the ReadField state machine to handle the new, larger and now correct counts being used for delays.
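A quick way to catch this kind of overflow is to compute the register width each count needs. The Python sketch below does that for the preamble counts at 50ns and 20ns; the 12-bit width used in the wraparound illustration is purely an assumption, since the actual width of the ReadField counter is not recorded here.

```python
def counter_bits(max_count):
    """Minimum number of bits a counter needs to reach max_count."""
    return max(1, max_count.bit_length())


old_count = 4032     # 201.6 us at the mistaken 50ns cycle time
new_count = 10080    # 201.6 us at the real 20ns cycle time

old_bits = counter_bits(old_count)
new_bits = counter_bits(new_count)

# Assumed 12-bit counter: it can never reach new_count because it wraps at 4096.
ASSUMED_WIDTH = 12
highest_reachable = (1 << ASSUMED_WIDTH) - 1
overflows = new_count > highest_reachable

print(old_bits, new_bits, overflows)
```

Under that assumption the old count fits in 12 bits while the corrected count needs 14, so a comparison against the larger value would never fire.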

It was also helpful to build up a table of cycle counts from the sector mark to various words that were part of the sector being read, so that I could program in reasonable delay counts to the logic analyzer trigger to observe any 80 us wide interval I wished. There are roughly 42 such intervals in one sector, stacked end to end, making the observations slightly tedious. 
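A table like that can be generated rather than worked out by hand. The following Python sketch assumes a 20ns FPGA cycle, the 3,333 us sector time, 80 us capture windows, and a word time of 9.6 us; the word time is inferred from the 336 us figure for the 35th word and is an assumption, as are the function names.

```python
import math

CLOCK_NS = 20        # FPGA cycle time
SECTOR_US = 3333     # time available in one sector
WINDOW_US = 80       # width of one logic analyzer capture
WORD_US = 9.6        # assumed word time (336 us / 35 words)


def window_count():
    """How many end-to-end capture windows cover a whole sector."""
    return math.ceil(SECTOR_US / WINDOW_US)


def trigger_delay_cycles(window_index):
    """Cycles from the sector mark to the start of the given window."""
    return window_index * WINDOW_US * 1000 // CLOCK_NS


def window_for_word(word_index):
    """Index of the capture window in which a given word time begins."""
    return int(word_index * WORD_US // WINDOW_US)


if __name__ == "__main__":
    for word in (21, 35, 100, 300):
        w = window_for_word(word)
        print(word, w, trigger_delay_cycles(w))
```

For the nominal sync position at word 35, this places the capture in window 4 with a trigger delay of 16000 cycles after the sector mark.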

I worked through several sectors and can now consistently read the entire sector correctly, matching checksums, storing the contents in memory and returning to a state where I can read another sector. I checked against the full set of items to verify and have met all of them:
  1. Readsector logic is waiting for a sector match
  2. Indexmarker occurs, signalling that next sector mark is for sector 0
  3. Sectormark occurs, thus we are in sector 0 now
  4. Gotsector is emitted to indicate we have a match
  5. Readsector logic moves on to setup for record 1 of the sector
  6. Readfield logic is triggered
  7. Readfield waits for approximately 200 us of preamble before looking at incoming data
  8. Roughly 120 us of zero data bits are seen
  9. A 1 data bit completes the sync word
  10. The logic recognizes the synced state
  11. Two words of the header record are deserialized, extracted and saved
  12. The checksum is calculated correctly for the two words of the header record
  13. The next word is deserialized, extracted and used as a checksum
  14. The checksum verification test occurs properly
  15. The readfield logic completes
  16. The deserializer goes to the unsynchronized state
  17. The readfield logic for the next record, label, begins
  18. The appropriate preamble is passed before we look for sync
  19. Enough zero bits are read to properly set up sync logic
  20. A 1 bit is read and the sync condition is attained
  21. Eight words of the label record are deserialized, extracted and saved
  22. The checksum is properly calculated for the 8 label words
  23. The next word is deserialized, extracted and used as a checksum
  24. The checksum verification test is done correctly
  25. The readfield logic ends
  26. Sync is dropped
  27. The readfield logic is entered for the data record
  28. A suitable preamble is passed before attempting sync
  29. Enough zeroes are read for the sync engine to work properly
  30. A 1 bit arrives and we attain the sync condition
  31. 256 words are deserialized, extracted and saved as the data record
  32. The checksum is properly calculated
  33. The next word is deserialized, extracted and saved as the checksum
  34. The final checksum test is done properly
  35. Readfield logic ends
  36. Sync is dropped
  37. The readsector logic completes
  38. Appropriate completion status is set in Reg0001
  39. The next sectormark does not occur until after step 36
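
The per-record part of the checklist above (collect the data words, compute a checksum, compare it with the next word read) can be sketched in Python. The record sizes of 2, 8 and 256 words come from the list; the checksum rule used here, an XOR of the words with a seed of octal 521, is an assumption based on my recollection of Alto documentation and should be verified before relying on it. The names `checksum` and `read_sector` are hypothetical.

```python
CHECKSUM_SEED = 0o521   # assumed seed value, not confirmed in this post

# Record name and number of data words, in the order they appear in a sector.
RECORDS = (("header", 2), ("label", 8), ("data", 256))


def checksum(words, seed=CHECKSUM_SEED):
    """Assumed checksum: XOR of all words in the record, starting from seed."""
    value = seed
    for word in words:
        value ^= word
    return value & 0xFFFF


def read_sector(words_by_record):
    """Split each record into data words plus checksum and verify it.

    words_by_record maps a record name to the deserialized words for that
    record: the data words followed by one checksum word.
    Returns {name: (data_words, checksum_ok)}.
    """
    result = {}
    for name, count in RECORDS:
        raw = words_by_record[name]
        if len(raw) != count + 1:
            raise ValueError(f"{name}: expected {count + 1} words, got {len(raw)}")
        data, stored = raw[:count], raw[count]
        result[name] = (data, checksum(data) == stored)
    return result
```

A sector passes when all three records report a matching checksum; a single flipped bit in any data word makes that record's comparison fail.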

I observed one problem and one discrepancy from the model of a disk sector to which I am engineering. The discrepancy is that we seem to be displaced about 60-70 us from the times I would have expected from the Alto microcode. The problem is that the disktool sometimes gets into a state where it will no longer read sectors, although disk seeks are still functional.

The length of a sector mark is only 5 us, far too short to account for the added time. I am forced to speculate that the word task is doing other things or the execution time of the sector task is long enough to result in this delay from the 'ideal' timing.

Ideally, a sync word will exist right around the 34-35th word time, roughly 330 us into the sector, but real data on the drive has the sync word in the 390 to 400 us zone. Everything from that point is correspondingly shifted later in the sector, but it all fits neatly into the 3,333 us available.

The long preambles at the beginning of a sector will accommodate these delays in the start of a sector when reading and real Alto systems should be able to tolerate an earlier sync word being written by this tool. Therefore, I won't do anything based on this observation.

The hang is troubling. It occurs some random time after the tool is initialized, such that I am doing a number of seeks when abruptly the button does not respond any more. I believe that I have one or more of the state machines wedged in a non-idle state. I will test this with new instrumentation displaying the idle status of eight of my FSMs; any light that is out indicates a wedged machine. 
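Reading eight idle lights amounts to decoding a bitmask. The Python sketch below shows the idea; the FSM names and their bit order are hypothetical placeholders, since the actual assignment of lights to state machines is not listed here.

```python
# Hypothetical names and bit order for the eight monitored state machines.
FSM_NAMES = (
    "readsector", "readfield", "sync", "deserializer",
    "clocktiming", "databit", "seek", "button",
)


def wedged(idle_mask):
    """Return names of FSMs whose idle light is out (bit cleared)."""
    return [name for i, name in enumerate(FSM_NAMES)
            if not (idle_mask >> i) & 1]


if __name__ == "__main__":
    # All lights on except bit 1: the (hypothetical) readfield FSM is stuck.
    print(wedged(0b11111101))
```

An all-ones mask means every machine returned to idle; any cleared bit names a candidate for the hang.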

As a final test, I examined a file dump after having read sector 0, but all I can see is that the data recorded is plausible. I see the proper disk address in the header words and have eight label words that look very much like those on the label field of the two online disk images I grabbed from bitsavers. 

I found two online images, one named diags.dsk and the other xmsmall.dsk, but the contents of cyl 0, head 0, sector 0, which should be the boot sector, do not match what I see on my disk. The sector 0 contents of those two images are different from each other, too, which adds to the mystery.

I will pass the data along to Ken to see if he can interpret it. Certainly the header and label fields should pass some clear validity checks, and then he can determine if the data sector looks like well formed Nova instructions.

I ran out of energy as it got late, but will work on interpreting the contents, tracking down the wedged state machine(s), and moving forward on the high level 'Read entire cartridge' state machine. I am pleased that every sector I have read has come through clean, with all checksums matching.
