Tuesday, January 31, 2023

Simulation of the disk reading and modeling logic to validate that portion of the design

ISOLATION OF DISK MODELING FROM OTHER ASPECTS OF THE DESIGN

I abstracted all the logic that deals with the disk drive into one module. This detects the rotation, models the sections of a sector with accurate timing, and generates the clock and data pulses that will be fed into the disk drive electronics to fool the drive into believing it is reading a physical cartridge.

This module will also capture the stream sent from the disk controller to the drive, converting that into the 321 words of a sector and writing them back to the cartridge image. 

The abstraction means that this module has a simple interaction with the rest of the design, outside of the disk drive oriented signals that it handles. It sends a request to read or write a word of data, with the 16 bit data contents, to some other module which will interact with the DRAM on the board that holds the image of a virtual 2315 cartridge. The other module emits a completion signal, telling this module that its request is completed.

There will be 321 requests to read a word while we are in read mode on the drive, otherwise we will capture the written stream from the disk controller and request up to 321 writes of a word. This module does not directly generate the address of a particular word in a sector. Instead, the disk modeling emits the current cylinder number, sector and word within a sector, all accessible to the other module which will use that (plus the active head number) to index into the virtual 2315 cartridge image to the proper spot for a word being written or read.
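
As a rough illustration of how the other module might combine those values into a position in the image, here is a minimal Python sketch (not the VHDL itself). The linear layout (cylinder, then head, then sector, then word), the 203 cylinder count and the name `word_index` are assumptions made for the example, not the actual image format.

```python
WORDS_PER_SECTOR = 321   # words in one sector
SECTORS_PER_TRACK = 4    # logical sectors per rotation
HEADS = 2                # one head per platter surface
CYLINDERS = 203          # assumed 2315 geometry, cylinders 0..202


def word_index(cylinder, head, sector, word):
    """Return the word offset into a hypothetical linear cartridge image."""
    if not (0 <= cylinder < CYLINDERS and 0 <= head < HEADS
            and 0 <= sector < SECTORS_PER_TRACK
            and 0 <= word < WORDS_PER_SECTOR):
        raise ValueError("address out of range")
    track = cylinder * HEADS + head
    return (track * SECTORS_PER_TRACK + sector) * WORDS_PER_SECTOR + word
```

With this assumed layout, the disk modeling only has to expose cylinder, sector and word counters, and the head number is folded in by whoever computes the address.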

The disk modeling logic watches the signals related to arm movement, mirroring the position of the disk arm from the real disk drive. Thus, when a request to move is received, forward or backward, 10mil or 20mil, the modeling adjusts the current cylinder number. The signal Home is used by the controller (and our modeling logic) to know definitively when we are at the cylinder 0 position. 
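
The mirroring can be pictured with a small Python sketch. It assumes that a 10 mil step moves one cylinder and a 20 mil step moves two, that cylinders run from 0 to 202, and that the count is simply clamped at the ends; the class and method names are hypothetical.

```python
MAX_CYLINDER = 202  # assumed: 203 cylinders numbered 0..202


class ArmModel:
    """Tracks the cylinder the physical arm should be sitting on."""

    def __init__(self):
        self.cylinder = 0

    def step(self, forward, twenty_mil):
        """Apply one move request and return the new cylinder number."""
        distance = 2 if twenty_mil else 1        # assumed: 10 mil = 1 cylinder
        delta = distance if forward else -distance
        # Clamp so the model never leaves the valid cylinder range.
        self.cylinder = max(0, min(MAX_CYLINDER, self.cylinder + delta))
        return self.cylinder

    def home_seen(self):
        """The Home signal tells us definitively that we are at cylinder 0."""
        self.cylinder = 0
```

Treating Home as a hard reset means any drift between the model and the real arm is corrected each time the arm returns to cylinder 0.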

The disk modeling logic receives the Index Mark and Sector Mark signals, which are generated by features on the physical disk cartridge which is rotating in the disk drive. At the target 1500 RPM rate of the drive, we will see a Sector Mark every 5 milliseconds and the Index Mark once per 40 ms rotation. We use these to count off sector numbers, as well as kicking off the timing for the 10 ms duration of one sector. 


We model the transition of the sector from its starting point as the Sector Mark pulse ends, through 250 microseconds where we send continuous zero bits, past the fixed sync word and then loop through the 321 words of the remainder of the sector. Each word is 20 bit cells long on the disk, 16 data bits plus four ECC bits. 

A bit cell consists of two halves, each of those 720 nanoseconds long. The first half is the clock interval where we always emit a pulse as the clock. The second half is the data interval, where the absence of a pulse indicates a zero bit and the presence of a pulse indicates a bit value of one. Thus, every 1,440 nanoseconds we have a bit cell whose data value is either 0 or 1. Over the course of 28.8 microseconds we see the 20 bits that encode the word bits 15 to 0 and the four ECC bits. 
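
To make the bit cell timing concrete, here is a small Python sketch (an illustration, not the FPGA logic) that lists when pulses would be emitted for a run of bit cells. The function names and the (time, kind) tuple format are made up for the example.

```python
HALF_CELL_NS = 720               # clock half and data half are each 720 ns
CELL_NS = 2 * HALF_CELL_NS       # 1,440 ns per bit cell


def pulse_times(bits, start_ns=0):
    """Return (time_ns, kind) for each pulse that encodes the given bit cells."""
    pulses = []
    for n, bit in enumerate(bits):
        cell_start = start_ns + n * CELL_NS
        pulses.append((cell_start, "clock"))       # every cell starts with a clock pulse
        if bit:
            # A data pulse appears in the second half only for a 1 bit.
            pulses.append((cell_start + HALF_CELL_NS, "data"))
    return pulses


def word_duration_ns(bit_count=20):
    """Time occupied by one word of bit_count cells, in nanoseconds."""
    return bit_count * CELL_NS
```

For example, `word_duration_ns()` gives 28800, the 28.8 microseconds per 20 bit word mentioned above.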


Our disk modeling logic precisely times the passing of the sector, emitting the pulses onto the read head that are clock and data values of each bit cell, organizing each word at the right time and including all the preamble of zero bits and the sync word in the beginning at the exact time they would occur with a 1500 RPM disk platter under the real read heads.

Writing from the CPU onto the disk involves a pair of clock pulses generated by the CPU - phase A and phase B which represent the clock interval and the data interval of a bit cell. During the phase B time when we are looking for the data bit value of that bit cell, sensing a pulse generates a 1 data value else we see this as a zero.

The disk controller electronics are producing those clock pulses and the data bit values according to the sector layout scheme, thus at the end of the Sector Mark we will get roughly 250 microseconds of bit value zero. We will closely synchronize when we capture the first data bit value of 1, which is from the sync word. We should see that 1 bit followed by three other 1 bits and a 0 bit, which is the correct ECC pattern for the word x8000 that is our sync pattern.

From that point forward, we count off phase B cycles and capture the data bits, assembling each 20 bit word and eventually all 321 words of the sector. As we finish assembling the 16 bit word value from the 20 bits we received, that is written to the other module to put into RAM. 
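
The capture sequence can be sketched in Python as follows. This is only an illustration of the counting described above: it takes a list of data bit values already sampled at each phase B time, and the function name and the choice to simply discard the four check bits are assumptions for the example.

```python
def capture_sector(bits, words_expected=321):
    """Assemble 16-bit words from data-bit values sampled at each phase B."""
    try:
        start = bits.index(1)        # first 1 bit marks the start of the sync word
    except ValueError:
        return []                    # nothing but zeros: no sync word seen
    payload = bits[start + 20:]      # skip the 20-bit sync word
    words = []
    for i in range(0, len(payload) - 19, 20):
        if len(words) == words_expected:
            break
        cell = payload[i:i + 20]
        value = 0
        for bit in cell[:16]:        # data bits 15..0 arrive first
            value = (value << 1) | bit
        words.append(value)          # the 4 trailing check bits are dropped here
    return words
```

A partial final word (fewer than 20 bits) is ignored, which matches the idea of only handing complete words to the module that writes RAM.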

SIMULATION TESTBENCH TECHNIQUES

Macros

There are some interlocked signal exchanges particularly for the disk arm movement actions. A status signal for Ready will flip off once an Access Go command is issued, turning back on when the movement is complete. Rather than laboriously code the various signals in my testbench, I wrote a macro for arm movement that will properly sequence and time the activities to drive my disk modeling logic. 

Substitutes for interactions in other modules

My disk modeling module interacts with the rest of the system by raising a RAM read request, then waiting until the request complete signal is returned (along with a data word). For the write functionality, it puts a data word in a register and raises a RAM write request, waiting for the request complete signal to indicate success.

I wrote a process that loops waiting for the RAM read request line to go high, after which it presents the new data word and signals request completion, all with appropriate timing. This allowed me to inject a data word every time my modeling module asked for the next word of the sector data. 
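
The shape of that responder process can be shown with a clock-by-clock Python model. This is a sketch only: the fixed latency, the class name and the `tick` interface are invented for illustration and do not reflect the actual testbench code or its timing.

```python
class RamResponder:
    """Stand-in for the memory module: answers each read request after a delay."""

    def __init__(self, words, latency=3):
        self.words = list(words)     # test data to hand back, in order (must not be empty)
        self.latency = latency       # assumed number of clocks before completion
        self.countdown = None        # None means idle, waiting for a request
        self.next_word = 0

    def tick(self, read_request):
        """Advance one clock; return (request_complete, data_word)."""
        if self.countdown is None:
            if read_request:
                self.countdown = self.latency   # request accepted, start timing
            return (False, None)
        self.countdown -= 1
        if self.countdown > 0:
            return (False, None)
        self.countdown = None                   # back to idle after completing
        data = self.words[self.next_word % len(self.words)]
        self.next_word += 1
        return (True, data)
```

Calling `tick(True)` repeatedly yields one word every `latency + 1` clocks, cycling through the supplied test words.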

Convenient file based test data

The testbench simulation environment includes the ability to read and write to text files on my laptop. I used this to build a file to feed memory words to the testbench. I can easily modify the words I send and play around with different patterns. 

CAREFUL STUDY OF THE SIMULATION RESULTS

I could see that with realistic Sector Mark and Index Mark signals generated in my testbench, the rotation model produced correct timing - the sectors were exactly 10 milliseconds long. 

I produced sequences of arm movement requests to verify that the cylinder was appropriately mirrored to exactly match what the physical disk drive location would be. 

While the logic models the flow of every sector from the end of Sector Mark onwards, it did not produce any read head data unless the ReadGate control signal was active. This is the signal the disk controller sets to ask the drive to stream the clock and data bits from the head into the controller. 

I carefully checked the duration and placement of every element within the sector for disk reads. This included the length of bit cells, clock and data pulses. I validated the length and proper formatting of the preamble of about 250 microseconds of zero bit cells. The sync word was properly formatted and timed immediately thereafter. 

Every data word was generated with its clock and data pulses, at precisely the correct time. Each word consumed 28.8 microseconds (28,800 nanoseconds) with the 16 data bits followed by four check bits. This repeated properly for all 321 data words of the sector.

I also verified that the word number of each word in the sector was properly generated for use by the other modules which will produce RAM addresses to read or write that word in the virtual cartridge image. 

Finally, the ECC bit patterns were properly produced based on the number of 1 valued data bits in the word. 
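
One rule that fits the x8000 example given earlier (three 1 bits and a 0 bit) is to pad the count of 1 bits up to a multiple of four. The Python sketch below assumes exactly that, including the order of the check bits; it is an illustration and the real logic may differ in detail.

```python
def check_bits(word):
    """Return four check bits for a 16-bit word under the assumed rule."""
    ones = bin(word & 0xFFFF).count("1")   # number of 1 valued data bits
    pad = (-ones) % 4                      # 1 bits needed to reach a multiple of four
    # Assumed ordering: the padding 1 bits come first, then zeros.
    return [1] * pad + [0] * (4 - pad)
```

Under this assumption `check_bits(0x8000)` returns `[1, 1, 1, 0]`, and a word with no 1 bits, or with sixteen, gets four zero check bits.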

NEXT STEPS

Since I am comfortable with the disk read process as well as the disk rotation and arm movement logic, I can move on to test the logic that will capture writes to disk. These occur when the disk controller raises the WriteGate control signal. 

Writing is the last part of the disk modeling that needs validation. I can then build up the logic that requests the proper RAM read/write over the bridge to the ARM core side running Linux. That plus all the user interface logic remains to be coded and tested. 

Monday, January 23, 2023

Separated disk side logic from the memory, cartridge image and user interface logic

ISOLATED DISK ORIENTED LOGIC

I pulled apart the logic that pertains to the IBM 1130 disk drive. This section will respond to the control signals from the disk controller inside the IBM 1130, those that cause arm movement, activate reads or writes and transfer data. These are cleanly isolated from all the remaining logic of the Virtual 2315 Cartridge system. 

The interface is three control signals, a word address, a cylinder address, a sector address and two data signals. We can request a read, request a write, and see a response that the transfer requested is complete. We have a word to output and the word read as the two data signals. 

The word address is just the relative word number of the 321 in this sector. Only one other item is needed for full addressing of a data word from the cartridge image and that is the head number. Our platter has two sides, with a head riding on each one. None of the logic in the disk oriented section is involved with the head number, thus it is not handled here. 

In this logic there are four major sections:

  • Model the current sector based on the incoming Sector and Index marks from the physical drive
  • Model the bit position inside a sector based on the rotation of the actual disk drive
  • Produce clock and data bits for the read head to detect based on the word fetched from memory
  • Extract words from the clock and data pulses sent to the write head, saving them in memory

The format of a sector, starting with the trailing edge of the sector mark, is delineated into the Zero, Sync and Words areas, with 321 words written sequentially in the Words area. These are all positioned exactly in time where they would occur with the disk platter rotating in the physical drive. 

The Zero area is a steady stream of zero value bit cells written for 250 microseconds. This ensures that the receive circuitry can separate the clock and the data value bits. Each bit cell is divided into two 720 nanosecond intervals; the first always has a pulse to indicate the clock. The second of each cell has a pulse if the bit value is 1 but there is an absence of a pulse to signify a bit value of 0. Thus the zeros are sent with a pulse in the first 720 ns and no pulse for the second 720 ns. The detection circuitry locks on the pulses as clocks since there are no data value pulses during the Zero area.

The next word is 20 bits long, a single bit value of 1 followed by 19 0 bits. This is the word that lets the receiving circuitry know how to separate the long string of bits into words, since the initial 1 indicates the boundaries for words going forward. 

The remainder of the disk sector is 321 words long, each word being 20 sequential bits. The IBM 1130 word size is 16 bits, but for disk an additional four bits of error detection code is appended. When reading a word, the parity of the sequence of twenty bits is tested to detect if an error was encountered rendering the word just read corrupt. 

VERIFICATION ACTIVITIES

I set up a testbench for simulation which produces the control signals that we would expect to come in from the physical disk drive. First among them are the pulses from Sector and Index marks which show us when the platter has reached certain points in its rotation. As well, we see a signal produced when the disk is up to speed and the drive believes the heads are loaded down to fly just above the platter surface. 

Using these, the logic will produce the sector number that is passing under the head, switching as the Sector marks are encountered. Our logical cartridge has four sectors dividing up one rotation, although physically there are eight sector marks around the circumference. Our logic ignores every odd sector mark and thus mirrors the way the disk controller logic generates the sector number shared with the IBM 1130 computer. 
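
The sector counting can be sketched in Python like this. It assumes the physical marks are numbered 0 to 7 starting from the Index mark, with the even numbered ones starting logical sectors 0 to 3; the actual alignment between Index and Sector marks is not spelled out here, and the names are hypothetical.

```python
class SectorCounter:
    """Derives the logical sector (0..3) from eight physical sector marks."""

    def __init__(self):
        self.mark = 0        # which physical mark comes next, 0..7
        self.sector = 0      # current logical sector number

    def index_mark(self):
        """The Index mark resynchronizes the count once per rotation."""
        self.mark = 0

    def sector_mark(self):
        """Count one physical mark; return the new sector number or None."""
        n = self.mark
        self.mark = (self.mark + 1) % 8
        if n % 2 == 1:
            return None      # odd marks are ignored
        self.sector = n // 2
        return self.sector
```

Each non-None return is also the natural point to start the timing for a new sector.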

Knowing the rotational speed of the drive, 1500 RPM, the length of a sector in time can be calculated. It is 10 milliseconds, since one rotation takes 40 milliseconds at this speed. During each sector we divide up the time into 1.44 microsecond bit cells, giving us 6,944 full bit cell positions. Our sync word plus 321 data words require 6,440 bit cells; the rest is the Zero area at the front and safety padding at the rear.
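
The arithmetic in the paragraph above can be checked with a few lines of Python. The function names are just for this example; integer nanoseconds are used to avoid rounding surprises.

```python
CELL_NS = 1440               # one bit cell is 1.44 microseconds
SECTORS_PER_ROTATION = 4     # logical sectors per rotation
BITS_PER_WORD = 20           # 16 data bits plus 4 check bits
WORDS_PER_SECTOR = 321


def sector_time_ns(rpm=1500):
    """Duration of one logical sector at the given rotational speed."""
    rotation_ns = 60 * 1_000_000_000 // rpm
    return rotation_ns // SECTORS_PER_ROTATION


def cells_available(rpm=1500):
    """Whole bit cells that fit into one sector."""
    return sector_time_ns(rpm) // CELL_NS


def cells_needed():
    """Bit cells for the sync word plus all data words."""
    return (1 + WORDS_PER_SECTOR) * BITS_PER_WORD
```

At 1500 RPM this gives 6,944 cells available and 6,440 needed, leaving 504 cells to be shared between the Zero area and the padding at the end.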

The specification for the controller requires it to produce zero bits for 250 microseconds, twenty bits of a sync word and 321 words of 20 bits each. The bit cell is 1.44 microseconds long, thus the sync word and data words consume about 9,274 microseconds. Adding in the 250 us of the Zero area means that about 9,524 microseconds are needed to write the full sector, and we have about 476 us of safety time before the next sector begins. 

If the disk is rotating faster than about 1,575 RPM our padding will be consumed and our last word runs into the start of the next sector. The physical drive circuitry controls the rotational speed; there is enough inertia to smooth out short term fluctuations, which allows a servo mechanism to provide good control. If for some reason the speed rises above the error threshold or falls too far below the target, the heads are pulled off the drive, an error is signaled and the drive must be cycled to restart operations. 

My testbench is used to generate sector and index marks including at slightly varied speeds to ensure we can generate all the bit cells to read a full sector reliably. My first testing is only generating the pulses to be fed into the read head circuitry. Later I will model the CPU controller sending the clock pulses and data bit signals produced while doing a write to a sector, ensuring I properly capture each written word of a sector and pass it over the interface to be written to memory. 

I am in the midst of verifying the timing of every pulse produced by the logic, ensuring that the Zero area is the right length, the bit cells are the proper duration, and then that the Sync and Word area words are properly emitted. 

Thursday, January 19, 2023

Narrowing down design after a bit of exploration of the DE10-Nano

COMPLICATIONS TO BE ADDRESSED

This board is divided into two sides - HPS and FPGA. The HPS (Hard Processor System) side has a twin core ARM system which we use to boot and run Linux. The FPGA (Field Programmable Gate Array) side has hardware logic that I create to manage the 1130 disk drive. 

Communications between the sides makes use of bridges built into the board. There are three bridges which differ based on which side of the board is the controlling entity (master in the widespread master-slave concept for computing systems). One is controlled by the FPGA, one is controlled by the HPS and one is a lighter weight, simpler bridge controlled by the HPS. 

Peripherals to the board are hooked to one side or the other, depending on the peripheral, thus they must be directly controlled by that side. The DDR3 RAM is hooked to the HPS side, as are some other peripherals such as ethernet and USB, while the General Input-Output pins, HDMI video, and analog to digital converter (ADC), among others, are wired to the FPGA side. 

My user interface will need either HDMI plus USB keyboard/mouse or the LCD touch screen panel. The LCD is completely attached to the FPGA side, while the other approach divides HDMI on the FPGA side and KB/mouse on the HPS side. 

In either case, it will be much easier to control the user interface with C code instead of hardware logic, but there are several design choices for where that code should run. Most of them require the use of the bridges between HPS and FPGA sides, although one of them does not. 

I can implement the C code on the HPS side and drive the HDMI or the touchscreen over the bridge. I can instantiate a soft processor (Nios II) in the FPGA, which consumes many logic elements on that side, but which can directly access the touchscreen and indirectly control the KB/mouse on the HPS side. There is also a set of Arduino Uno compatible pins on the FPGA, which I can use to connect an Arduino that drives its own user interface in C code and perhaps its own LCD screen shield, or the Arduino can directly drive the touchscreen/HDMI and indirectly access the KB/mouse over the bridge.

All the bridge accesses map devices or signals to memory spans and then use master-slave memory access. Since I will be learning this for the essential access to the cartridge images between HPS and FPGA side, extending it to the various cross side accesses for the above user interface purposes is not much additional work. 

USING MEMORY MAPPED FILES FOR CARTRIDGE IMAGES

Linux supports memory mapped files, where I can open any of the SD card images of virtual 2315 cartridges for read/write, having the operating system map that into a 1MB range of virtual memory addresses, and then allow the FPGA side logic to do read and write to that memory. 

This automatically writes back any change to a cartridge to the file on the SD card, which eliminates the need to fetch back an updated cartridge image to rewrite as I had to do with the prior Xilinx/Arduino approach. 

All that is necessary is to have the user interface running on the Linux image on the hard ARM processor side of the board open and close the file, using the mmap system call, then pass the memory start address over the bridge to the FPGA side. From that point forward, my FPGA logic can do reads and writes to the memory locations on the hard processor side. 
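
To show just the file mapping idea, here is a minimal sketch using Python's `mmap` module on the Linux side. The two byte little-endian word storage and the helper names are assumptions for the example, and it makes no attempt to show how a memory address would be handed across the bridge to the FPGA.

```python
import mmap
import struct

WORD_BYTES = 2  # assumed: each 16-bit word stored as two bytes, little-endian


def write_word(path, word_index, value):
    """Map the image file and overwrite one 16-bit word in place."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as image:      # length 0 maps the whole file
            struct.pack_into("<H", image, word_index * WORD_BYTES, value)
            image.flush()                            # push the change back to the file


def read_word(path, word_index):
    """Map the image file read-only and fetch one 16-bit word."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as image:
            return struct.unpack_from("<H", image, word_index * WORD_BYTES)[0]
```

Because the mapping is backed by the file, a word changed through `write_word` is what a later `read_word` of the same file returns, with no separate step to rewrite the image.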

The signal that the physical 1130 drive has been switched off will trigger the Linux side to close the memory mapped file, while switching the drive on will trigger the Linux code to open and memory map the currently selected file.

The user interface shows the user a list of cartridge images and allows them to select which will be active when the drive is turned on. That selection can't change while the 1130 drive is running, just to protect the cartridge image we are using at the time. 

Friday, January 13, 2023

More on the new DE10-Nano board approach

POSITIVE CONSEQUENCES

My old design had to implement the memory interface to drive the DDR3 RAM, FIFOs to handle various clock domain crossings, and multiple clocks to drive the memory interface. All of these depended on intellectual property (IP) from Xilinx and these were the areas that were misbehaving in the old design.

The new approach has a single clock domain on the FPGA side and its AXI interface for communications between the FPGA and the Arm hard processor side implicitly handles clock domain crossing. Thus, no need for FIFOs. No need for defining clock PLLs or dealing with the inability to route all clock signals over the high speed clock networks of the Digilent board; the choices they made for connecting RAM and clocks did not support enough clock buffers to get all the required clocks on the same side of the FPGA chip in the dedicated clock lines. 

The DDR3 RAM on the new board is handled by the hard processor (ARM) side entirely, so I am not involved at all in memory interface controllers or the details of RAM access. Simple memory mapped transactions across the AXI bridge will let me read and write to RAM with simplicity. 

While I still have transactions between the Linux hard processor side and the FPGA side, there are no external wires carrying signals nor need to do voltage level shifting between FPGA and Arduino. Thus the inherent reliability is higher. 

NEGATIVE CONSEQUENCES

I had built an interface level converter board that converted a few signals to +5V for connection to the Arduino which managed the SD card for disk cartridge images and the user interface LCD. Now that everything is handled on a single board which uses LVCMOS 3.3V signaling exclusively, I have to reroute those signals to the new voltage levels. 

I have some sunk cost in the Digilent board, the Arduino Mega 2560, the SD card daughter board and the LCD interface daughter board as well as VHDL logic I built for this configuration and C code for the Arduino. Some of the logic I built is unnecessary as it was used to exchange transactions between the old boards, but I will need to rewrite parts of my FPGA logic and create a new interface program for the Linux processor side. 

Since I struggled to get the old approach working reliably, this is no longer an issue. My biggest cost (other than the new board and LCD interface) is the learning curve of switching over to the Quartus toolchain and mastering the new type of board. 

Terasic has a touch screen LCD daughter module which I ordered and will use for the user interface. While the board has an HDMI port, that is way too powerful for what is needed and would involve quite a bit of coding just for good video. Using this touch screen module should ensure a fast development path. 



Thursday, January 12, 2023

Time to shift to entirely new board and toolchain for the virtual 2315 project

RECENTLY AWARE I HAVE ATTENTION DEFICIT HYPERACTIVITY DISORDER

Those who look back over my blogs and notice the huge number of projects that I got partway through and then wandered on to new things will have spotted this major symptom of ADHD. After 71 years of ignorance, I discovered and was diagnosed with ADHD. 

Now painfully aware of the risks of wandering on to another more novel project or challenge, I was determined to finish the virtual 2315 cartridge project and then wrap up the renovation of the IBM 1130, ignoring the temptations of shiny new endeavors. Other than the holiday and family times, I have been working to get the system working, failing with great frustration. 

TIME TO DITCH VIVADO AND XILINX FPGA BOARD, TAKING A NEW DIRECTION

I have had many days fighting erratic behavior of the Vivado toolchain. As an example, when programming the FPGA I will now receive dozens of "Background task busy" pop up errors. Reboots and every other change have no effect on this. 

Yes, this might be errors on my Lenovo laptop, on the Windows operating system, or corruption on the Xilinx Vivado installation, but nonetheless it is almost completely blocking any forward progress testing and debugging my design.

Opening the integrated logic analyzer cores might produce error messages about mismatch of port numbers, yet the files that actually drive Vivado are hidden beneath a byzantine and flaky user interface. With no documentation to highlight which files are actually controlling behavior and where they sit, it is extremely difficult to figure out and correct such cycles of failure. I suppose this is obscure in order to protect intellectual property, but it makes the tool pretty useless to me right now.

Suspecting that the memory interface is not working, I run synthesis with the signal brought out that shows initial calibration is complete, which proves that the memory interface did not come ready. Then I make minute changes, just to show via LEDs that each of my three clocks is indeed clocking, and the calibration magically works properly. 

In several months, I have had virtually zero of my logic at issue, but spent inordinate time with odd behavior inside memory interfaces, clock modules and other IP I don't control. The fact that sometimes I see correct operation shows me that I am writing and reading from memory, and receiving SPI transactions coming from the Arduino, but I have failed to get a solid transaction sequence that loads a sector and then retrieves it successfully.

I refactored the logic many times, including overly conservative interlocked approaches and very formal state machine construction. I pored over the list of errors and warning messages. I set up integrated logic analyzers to try to spot where things are going awry.

I kept finding sporadic errors before everything degraded to where I can't get logic analyzer cores working at all. It is time to toss in the towel since I am not moving forward on the project and don't see the likelihood that something will break the logjam any time soon. 

NEW PLATFORM CHOSEN

The existing design split the work into two sides - an Arduino that would manage an SD card for the cartridge images and the user interface, talking over SPI to the FPGA board that would interface with the disk drive. The capacity of a 2315 cartridge requires 16 megabits of storage, beyond the block RAM capacity of the FPGA chip. The board I had chosen had plenty of DDR3 RAM to hold the image. 

The new platform for development is a Terasic DE10-Nano, which uses an Altera/Intel system on a chip. It gives me a hard processor with twin ARM cores running Linux plus an FPGA. The FPGA can easily access the capacious DDR3 RAM from the hard processor side. The board provides an SD card socket accessible from the Linux side and an HDMI output for an improved user interface. 

This makes use of an entirely different toolchain in addition to giving me an integrated single board platform for the entire project. Of course I will have the learning curve of adopting the new tool and FPGA chip, but I am expecting that I will be free of the capricious and frustrating issues of the last few months. Sadly, my past experiences with Vivado and Xilinx based boards had all been good - the IBM 1130 interface expander and the Xerox Alto disk image extraction tools, for example.