Tuesday, December 24, 2024

Finished simulation of the uploader function of Diablo 2315 archiver; building up DDR RAM controller

SUCCESSFUL SIMULATION OF THE UPLOADER

This function walks through the 2315 cartridge, reading each sector in order and writing the values out over the USB serial port. The data is transmitted in ASCII as ordinary text at 921600 baud. Each sector begins with a text line giving the cylinder, head and sector number, then 321 lines with the data values and a trailer line of text showing whether there were errors detected reading the sync or data words in the sector. 

The major pacing issue is the serial link. A complete 2315 cartridge will be transmitted in just under two minutes. I plan to capture the text that is transmitted and use it to convert the data into files in both IBM 1130 Simulator format and Virtual 2315 Cartridge Facility format. 

BUILDING THE DRAM CONTROLLER FUNCTION

The intellectual property (IP) provided by Xilinx includes a DDR3 controller which controls the 256MB physical DDR3 RAM chip on the Arty A7 board. It offers an interface called User Interface (UI) to drive read and write activity from my design.

The controller requires two input clocks at 100 MHz and 200 MHz, then generates a UI clock at 66.6667 MHz which must be used to interact with the UI. My general logic uses the 100 MHz clock of the Arty board for most of its functionality.

This means that I have to accommodate different clock domains. These introduce metastability risks as well as timing risks. Flipflops whose inputs change too close to their clock edge can enter an intermediate state, neither 1 nor 0, which can persist. Logic has requirements for minimum setup and hold time for inputs to ensure correct outputs. 

Metastability is addressed by placing a chain of D flipflops such that even if the first one or two flipflops enter a weird state, the signal is cleanly 1 or 0 by the end of the chain. Some IP that deals with multiple clock domains will also include metastability FF chains inside. 

In Verilog, this is accomplished by code like this:

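  always @ (posedge clock)
  begin : LOGIC // block name
    // shift the outside signal one stage deeper into the chain
    metadata[3:0] <= {metadata[2:0], OUTSIDE_INPUT};
    // only the last, settled stage is used by other logic
    signal <= metadata[3] == 1'b1 ? choice1 : choice2;
  end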

I instantiated FIFO (first in-first out) IP that was designed to move data across clock domains. Although I don't use them to drive more than a single request for read or write at any time, they solve my timing and metastability risks. 

My logic will push a new data word into a FIFO at 100 MHz, my general clock, and other logic operating under the UI clock will pull that word out of the FIFO at 66.6667 MHz and write it to the DDR3 controller. When reading back for the upload, the logic running under the UI clock will push the data read from the DDR3 controller into another FIFO where my logic running at 100 MHz will pull it out.

To make life simple, whenever data appears at the UI clock end of the first FIFO, I know it is a write request. Requesting a read requires a signal to go into the UI clock domain from my 100 MHz domain. In addition, the address for the RAM read or write is generated in my 100 MHz domain so it too must get over to the UI clock domain. I handle the metastability of these signals with Verilog code much like that shown in the example a few paragraphs above.
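One common way to carry a request and its address across is sketched below. It is not the code from this project, and every name in it (read_req_sync, req_100, addr_100, read_start, addr_ui, the 20 bit default width) is hypothetical. The sketch assumes the 100 MHz side drives req_100 from a register, holds it high for at least a couple of UI clock periods, and keeps addr_100 unchanged for that whole time. Only the one-bit request goes through the flipflop chain; the address is sampled once the request has settled, at which point it has been stable for several UI clocks.

```verilog
// Sketch under stated assumptions; all names are hypothetical.
module read_req_sync #(
    parameter ADDR_W = 20              // assumed address width
) (
    input  wire              ui_clk,   // UI clock from the DDR3 controller
    input  wire              req_100,  // request level from the 100 MHz domain
    input  wire [ADDR_W-1:0] addr_100, // held stable while req_100 is high
    output reg               read_start, // one ui_clk wide pulse per request
    output reg  [ADDR_W-1:0] addr_ui     // address as seen in the UI domain
);
    reg [3:0] sync;                    // metastability chain for the request

    // rising edge of the request, seen only at the settled end of the chain
    wire req_rise = sync[2] & ~sync[3];

    initial begin
        sync       = 4'b0000;
        read_start = 1'b0;
        addr_ui    = {ADDR_W{1'b0}};
    end

    always @(posedge ui_clk) begin
        sync       <= {sync[2:0], req_100};
        read_start <= req_rise;
        if (req_rise)
            addr_ui <= addr_100;       // address is stable by now, safe to sample
    end
endmodule
```

Because a 100 MHz clock period is shorter than a UI clock period, a request that is high for only one 100 MHz cycle could be missed entirely, which is why the sketch relies on the request being held.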

The RAM size is so large compared to my needs that I can be lazy and wasteful with the benefit that this allows simple straightforward designs. The address ranges for cylinders and for words inside a sector are not even powers of two. I form an address using the eight bits of cylinder, but the value only ranges from 0 to 202 leaving part of the address space wasted. Similarly, the nine bits of word number range from 0 to 321 (322 including the error status I record) which also leaves locations 323 to 511 unused in each sector. 
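A minimal sketch of forming such an address by plain concatenation follows. Only the eight bits of cylinder and nine bits of word number come from the description above; the one bit head field, the two bit sector field, the field order and all of the names are assumptions made for illustration.

```verilog
// Sketch only: head and sector widths, field order and names are assumed.
module sector_address (
    input  wire [7:0]  cylinder,      // only 0 to 202 is used
    input  wire        head,          // assumed 1 bit
    input  wire [1:0]  sector,        // assumed 2 bits
    input  wire [8:0]  word,          // word number within the sector
    output wire [19:0] ram_word_addr  // one slot per stored word
);
    // Each field owns a fixed group of bits, so no multipliers or
    // adders are needed; the unused codes are simply never addressed.
    assign ram_word_addr = {cylinder, head, sector, word};
endmodule
```

The cost of this simplicity is the gaps: every field that does not fill its power-of-two range leaves a block of addresses that is never touched.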

Also, the DDR3 interface inherently provides bursts of eight blocks of 16 bits for a total width of 128 bits for every read or write. I am only using the first block out of the eight. This means that I will issue eight writes for eight sequential words in a sector when I could have loaded all eight blocks and then issued a single write. Again, I have the room, so I am skipping the logic needed to properly use the eight blocks in a single access operation.
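The one-word-per-burst approach might look roughly like the sketch below. The names app_wdf_data and app_wdf_mask follow the UI of the Xilinx controller, while burst_lane, ram_word_addr and burst_addr are hypothetical. The sketch assumes that the first 16 bit block of the burst sits in the low 16 bits of the 128 bit data bus, that the UI address counts 16 bit locations so a burst starts at a multiple of eight, and that a 1 in a mask bit suppresses the write of that byte. Using the mask at all is optional and is not something described in this post.

```verilog
// Sketch under stated assumptions; check lane order and mask polarity
// against the controller documentation before relying on this.
module burst_lane (
    input  wire [19:0]  ram_word_addr, // hypothetical one-slot-per-word address
    input  wire [15:0]  word,          // the single word to store
    output wire [22:0]  burst_addr,    // low bits of the UI address
    output wire [127:0] app_wdf_data,  // full burst of eight 16 bit blocks
    output wire [15:0]  app_wdf_mask   // one mask bit per byte of the burst
);
    // Each stored word gets its own burst: shift left by three so the
    // address lands on a burst boundary.
    assign burst_addr   = {ram_word_addr, 3'b000};

    // Only the first block carries data; the other seven are zero filled.
    assign app_wdf_data = {112'd0, word};

    // Write the low two bytes, mask off the remaining fourteen.
    assign app_wdf_mask = 16'hFFFC;
endmodule
```

Any upper bits of the real UI address beyond burst_addr would be tied to zero. Each stored word then occupies 16 bytes of RAM, which is the waste accepted above in exchange for simpler logic.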
