Sunday, October 30, 2022

Erratic results cast suspicion towards testbed itself or subtle issue


The last few test runs have produced puzzling results. I saw mostly correct results going up the SPI link, but the first value sent was incorrect, then we were off by one for the next 285 words of the 321 word sector, then it repeated the 285th value over and over until the end. The SPI state machine did not complete nor reset.
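A symptom like this can be characterized mechanically from a captured log rather than by eye. The following is only a sketch: the helper names are hypothetical, and the synthetic capture assumes an ascending test ramp and a 0xFFFF first word, neither of which is taken from the actual run.

```python
def stuck_tail_start(received):
    """Return the index where the final run of identical values begins."""
    i = len(received) - 1
    while i > 0 and received[i - 1] == received[-1]:
        i -= 1
    return i


def best_shift(expected, received, upto, max_shift=2):
    """Return the shift k (within +/- max_shift) for which
    received[i] == expected[i - k] holds most often over the first `upto` words."""
    def matches(k):
        return sum(
            1
            for i in range(upto)
            if 0 <= i - k < len(expected) and received[i] == expected[i - k]
        )
    return max(range(-max_shift, max_shift + 1), key=matches)


# Synthetic capture shaped like the symptom (assumed data, not a real log):
# a bad first word, values delayed by one position, then one value repeated.
expected = list(range(1, 322))                  # 321-word sector, assumed ramp
received = [0xFFFF] + expected[:285]            # bad first word, then shifted data
received += [received[-1]] * (321 - len(received))

tail = stuck_tail_start(received)
shift = best_shift(expected, received, upto=tail)
```

Running the two helpers over a real capture would show at which word the link froze and whether the data ahead of that point is shifted by a whole word position.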

With one run, I saw garbage values again and the integrated logic analyzer showed that the state machines for driving the SPI link froze after the first word, acting as if the SlaveSelect line was never asserted again by the Arduino master. The way that my state machines are set up, as soon as SlaveSelect is asserted we start over pumping out the first byte of the word to the SPI link module, but that wasn't happening. 


SlaveSelect is a signal that I set and reset from my Arduino code for every word that is exchanged over the SPI link. That is, I assert select, exchange two bytes, then drop the selection line, thus delineating every word on the link. An overall signal, SPItransaction, is asserted to start a multiword transaction and dropped at the end of the 325 word exchange. 
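The framing just described can be modeled as a simple event list. This is a sketch only, not the Arduino code: the function names are hypothetical, sending the high byte first is an assumption, and a value of 1 here means "asserted" regardless of the electrical polarity of the pin.

```python
def frame_words(words):
    """Build the sequence of link events for one multiword transaction."""
    events = [("SPItransaction", 1)]              # start of the whole exchange
    for w in words:
        events.append(("SlaveSelect", 1))         # assert select for this word
        events.append(("byte", (w >> 8) & 0xFF))  # assumed order: high byte first
        events.append(("byte", w & 0xFF))
        events.append(("SlaveSelect", 0))         # drop select, delineating the word
    events.append(("SPItransaction", 0))          # end of the whole exchange
    return events


def unframe(events):
    """Rebuild 16-bit words, accepting only complete two-byte selections."""
    words, current = [], None
    for name, value in events:
        if name == "SlaveSelect" and value == 1:
            current = []                          # every assertion starts a new word
        elif name == "byte" and current is not None:
            current.append(value)
        elif name == "SlaveSelect" and value == 0:
            if current is not None and len(current) == 2:
                words.append((current[0] << 8) | current[1])
            current = None
    return words
```

The receiving side restarts at the first byte on every assertion of select, which is why a missed assertion would leave the state machine stuck partway through a word.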

A signal from the Arduino Mega 2560, with 5V logic levels, is converted by my level shifter MOS transistors to the 3.3V levels of the FPGA board. My Arduino itself produces both 5V and 3.3V to power the two sides of the level shifter. Previous oscilloscope probes showed very good swings of logic levels, so that when the +5 dropped to near 0 on the Arduino side I would have the +3.3V level on the FPGA side drop near zero. 


I don't have the scope here where I am testing, but that is the next step in investigating this weird behavior. A number of possibilities exist:

  • The Arduino output may not drive low enough to produce an asserted low level at the FPGA
  • The level shifter may be misbehaving
  • Resistance in my makeshift wiring and connections may be producing invalid logic levels at the FPGA input
  • The FPGA may reach a metastable invalid state if SlaveSelect changes near a clock edge
  • Timing issues in the routing on the FPGA chip may produce state machine errors or other logical 'farts'
I have synchronizers on every exterior signal, including SlaveSelect, which should have reduced the chance of a metastable state to extremely low odds, far too low for it to occur as often as it seems to be.

I can use an oscilloscope to validate the voltages appearing at the FPGA input pins, ruling out the first few potential causes or directing me to corrective action. 

If the issue is timing, I will have an extra frustrating road ahead. The timing report shows continual failure to meet timing, driven by error messages about inability to place the clock buffer and clock generating resources in the same portion of the FPGA chip. It forces me to override the conditions. I am loath to allow this, but fixing it properly would demand working at a microscopic level of detail, particularly as it involves Intellectual Property (the memory interface) that I didn't write and which is in Verilog - a language I don't know. 

Saturday, October 29, 2022

RAM retrieving data but not yet transferred up SPI link - good progress


My two integrated logic analyzer cores, one operating at the 4:1 speed of the memory interface used to clock in requests and grab data from memory, the other operating at my general logic frequency, were useful in spotting the state of signals as my logic dealt with SPI link requests to load and unload data from a target sector of the virtual cartridge image as held in the DDR3 RAM on the FPGA board.

I was able to see that the data was properly written into the memory interface and that the same information came out later when reading the same addresses. I will need this facility both to feed the SPI link during unload operations for virtual cartridges and to feed the signals into the disk drive controller when we are simulating the head signals as if it were a real cartridge spinning on the drive. 


From the analyzer I could see that we only had valid data from the memory interface for the two clock cycles when the app_rd_data_valid signal is asserted, telling us we have good data. It then reverts back to the wrong data. Thus the timing of when we latch in the app_rd_data bus is critical to successfully getting memory contents out to the functions that need them. 
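The capture rule this implies is to copy the bus into a holding register only on cycles where the valid flag is high. A minimal behavioral sketch follows. It is a software model rather than the FPGA logic, and the function name and sample values are made up for illustration.

```python
def capture_read_data(samples):
    """Model a holding register that latches app_rd_data only while
    app_rd_data_valid is asserted.

    `samples` is a sequence of (app_rd_data_valid, app_rd_data) pairs,
    one per memory-interface clock cycle. Returns the held value after
    each cycle; None means nothing has been captured yet.
    """
    held = None
    trace = []
    for valid, data in samples:
        if valid:
            held = data        # bus is only trustworthy on these cycles
        trace.append(held)     # otherwise keep whatever was latched before
    return trace


# Hypothetical cycle-by-cycle view: junk, two valid cycles, junk again.
cycles = [(0, 0xDEAD), (1, 0x0006), (1, 0x0006), (0, 0xBEEF), (0, 0xBEEF)]
held_values = capture_read_data(cycles)
```

Downstream logic that reads the holding register instead of the raw bus keeps seeing the good word after the bus has reverted.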


The fix seems pretty straightforward, so I will implement it and enter a new round of testing. Ideally, we will grab and hold the memory contents and pass them properly to the SPI link state machine, which will properly load them into the SPI slave link module itself, where they will be properly clocked up to the Arduino. 

Tuesday, October 25, 2022

Bizarre clock domain discovered by logic analyzer core - digging deep


My design has multiple clock domains, which brings with it the challenge of synchronizing signals going across domains. The board has a 100MHz hardware clock (also 12MHz but I didn't use that) which generates various clocks for the RAM controller and another (50MHz) used for my general logic. The controller gets 100MHz and 200MHz, then generates a ramclk of 25MHz for driving my memory interface logic. Finally, the Serial Peripheral Interface (SPI) link has its own 4MHz clock which runs intermittently. 

We therefore have five active clock domains driving logic, plus two hardware clock domains one of which generates most of the others. Even if two of the domains would be at the same frequency, they are not in phase nor do they have aligned clock edges. 

Any external signal, such as the SPI data lines but also all the signals from the 1130 disk drive, should be synchronized as it might otherwise be changing right near a logic clock edge, leading to metastable state errors. As such I had a pool of synchronizers to make sure every signal in a particular clock domain changes only at clock edges. 

Too, I needed to reset various state machines and elements in a proper sequence, thus there are reset signals generated in steps - original, a FIFO clearing state, and a reset for the logic running under the ram clocks. In that ballet of startup steps, I had an issue which resulted in my main ram handling state machine stalling. This didn't occur in the regular simulation, but when I did a functional simulation with the post-synthesis design, I was able to dig out the issue previously. The fix was easy even if finding it was not. This was several rounds of testing ago, but interesting to understand the wrinkles involved in this sort of project.


When I select signals to watch with the integrated logic analyzer cores, the Vivado tool chain will determine the clock domain and build an analyzer core for each clock domain which has signals to monitor. I decided to watch the SPIbyteout bus signal which is the eight bits that are sent to the SPI protocol module to shift out to the Arduino. This was the last point outside the SPI module logic and thus I could monitor to see whether I was passing the RAM values properly.


The toolchain built a third logic core for the 100 MHz clock domain. That is only passed into the memory interface module and not involved in any of my logic. There is no way that the bus value I want to monitor should be tied to that clock domain. This suggests some subtle error which is the root cause of my difficulties but it is a very opaque sign. 

The logic that is generating SPIbyteout is clocked by the 50MHz clock domain but also tests an unsynchronized input from the SPI clock domain (SlaveSelect). That is a flaw that I have to correct, altering the logic to remove any reference to SlaveSelect or at least synchronizing it properly. In spite of the issue I see, I cannot imagine how that produces the errant clock domain assignment by Vivado. 
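One conventional way to bring a signal like SlaveSelect into the 50MHz domain is a two-flop synchronizer followed by an edge detector, so the logic reacts to a clean one-cycle pulse instead of the raw input. The sketch below is a cycle-level software model under assumptions: the class and function names are hypothetical, the select is treated as active low, and a model like this shows only the added latency, not metastability itself.

```python
class TwoFlopSync:
    """Two flip-flops in series, both clocked by the destination domain."""

    def __init__(self, reset_value=1):
        self.meta = reset_value   # first flop: may go metastable in hardware
        self.sync = reset_value   # second flop: the only output logic should use

    def clock(self, async_in):
        # Both flops update on the same edge, so sync takes the OLD meta value.
        self.sync, self.meta = self.meta, async_in
        return self.sync


def falling_edges(levels, reset_value=1):
    """Return one flag per clock: True when the synchronized signal fell."""
    s = TwoFlopSync(reset_value)
    prev = reset_value
    flags = []
    for level in levels:
        cur = s.clock(level)
        flags.append(prev == 1 and cur == 0)
        prev = cur
    return flags


# Raw input sampled at each destination clock edge (assumed active low).
raw = [1, 1, 0, 0, 0, 1, 1]
pulses = falling_edges(raw)
```

In this model the pulse appears one clock edge after the edge that first samples the low level, which is the price paid for never letting downstream logic see the first flop.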

Monday, October 24, 2022

Data returned from the response FIFO correctly, not getting up to Arduino properly


I set the internal logic analyzer core to detect when I was reading the sixth word of the sector, by triggering on the return pattern 0006 and then watching all the related signals such as the SPI state machine.

The value 0006 is clearly detected and the state machine moves forward, thus my problem lies somewhere past the FIFO. It may be in the timing of passing the data word to the SPI outbound routine, in the way that the message is encoded on the SPI link itself, or on how the Arduino routine is detecting the results. 

I will reimplement with new watched signals to view the handoff from FIFO to the SPI out routine and inspect the word passed to the SPI logic. It would be wonderful if I could directly monitor the SPI link from a logic analyzer core, but the SPI clock is not continuous thus I can't start a logic core that is driven by the SPI clock. All signals in that clock domain, MISO, MOSI and SCLK itself, are thus inaccessible by the internal analyzer cores. 

RAM writing and reading good, data not getting back to SPI link state machine correctly


My load transaction stored the ascending integers in the word addresses of the sector, e.g. 0001 for word 1 and 0002 for word 2. The unload transaction read the same sector back and shipped the value back on the SPI link.

While the upstream data appeared to be gibberish, the values coming from the memory interface were indeed the same ascending integers as I had written and they were properly set up in the data-in field for the FIFO that would transport that result back to the normal FPGA clock domain for use by my SPI link. 


The data is set up properly in the FIFO that transports results. I will now focus on watching the FIFO operate and judge the correctness of the data returned on the regular clock side of that FIFO. If that is good, I could still have a problem capturing that into the SPI link state machine for the outgoing byte that is sent up to the Arduino. 

I reconfigured the logic analyzers to focus upon the data I need to watch for the response FIFO and my SPI link state machine. This does cross clock domains, which means I can't watch both the data entering the FIFO and the data leaving the FIFO in the same logic analyzer core. I can, however, trigger on the FIFO empty flag to at least watch each result being presented in my FPGA logic clock domain. 


I learned of a mainframe and some related hardware that was at risk of being scrapped as a person nearby had to clear out a storage unit. He was no longer interested in restoring the system and rents for the storage space had doubled suddenly. 

Today I had a small moving business I worked with previously meet me at the storage unit and haul those boxes to my workshop before the end of October. It is an IBM Z9 BC mainframe, almost 1,700 pounds which was a beast to get rolling up and down ramps for transport, plus a 3490E tape system and some ancillary parts. The tape unit was no walk in the park either, just easy in comparison to the behemoth. 

IBM Z9 Business Class mainframe

IBM 3490E 36 track tape cartridge drives

Thursday, October 20, 2022

Stripped down ram verify logic is working! Now to back port the changes to my main design


I set up the memory interface and clock modules, two FIFOs just as were used in my full design, but stripping essentially everything else away. I will drive it with the four pushbuttons on the Digilent Arty S7 board, receiving feedback from the four monochrome LEDs and two tricolor LEDs on the board. 

I set up the left two buttons to access different addresses, writing a different target data pattern to each. The primary location was set to x1234 repeated to fill 128 bits, while the secondary location was set to x5A0F repeated to fill 128 bits. The right two buttons triggered a read of the two selected addresses, then when the memory access was complete it compared the output from the read with the target data values.

If the primary location read did not come back with x1234, the left colored LED turned red. If the result matched, the color became green. The right colored LED would turn green if the secondary location returned x5A0F else it would turn red.

When the board initialized, all four monochrome LEDs were lit. Pushing the four pushbuttons turned on just the LED associated with that button, as evidence it was detected. Further, the left tricolor LED would be turned blue if we wrote the x1234 to the primary location and the right tricolor LED would be lit blue if we had written x5A0F to the secondary address. 

If the read was begun to either primary or secondary, but the memory didn't complete returning a value, the tricolor LED would be off and the relevant monochrome LED would be on indicating an incomplete read.
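The pattern construction and the read-status rules above can be summarized in a few lines. This is a sketch of the decision logic only, with hypothetical function names; using None to stand for "the memory never returned a value" is an assumption of the model.

```python
def fill_128(word16):
    """Repeat a 16-bit pattern eight times to fill a 128-bit value."""
    value = 0
    for _ in range(8):
        value = (value << 16) | (word16 & 0xFFFF)
    return value


PRIMARY = fill_128(0x1234)     # target data for the primary location
SECONDARY = fill_128(0x5A0F)   # target data for the secondary location


def tricolor_after_read(expected, returned):
    """Color of the tricolor LED once a read has been started.

    `returned` is None when the read never completed.
    """
    if returned is None:
        return "off"           # incomplete read: only the monochrome LED is lit
    return "green" if returned == expected else "red"
```

For example, tricolor_after_read(PRIMARY, 0) gives "red", the case of a completed read from a location that was never written with the target pattern.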


After starting up the FPGA board, I observed four monochrome LEDs lit and both tricolor LEDs dark. I began by attempting a read of the locations where I had not yet written data values. The third and fourth buttons did a read of the primary and secondary address. In each case, the related tricolor LED was bright red. 

This told me that I got something back, the read mechanism completed, but the returned value was not the expected x1234 or x5A0F. That was expected at this point.

I then pushed the left button which gave me blue on the left tricolor LED. I pushed the second button and the right tricolor turned blue. This was the indication that I had written the target data to my primary and secondary addresses. Again, as expected.

The final step was to push the third and fourth buttons. In each case, their tricolor blazed a glorious green as the returned value matched our expectations. Any number of reads would result in green status and I could interleave as many repeat writes with the left buttons as I liked without causing a red status light.


The changes are for the most part in the intellectual property (IP) modules I used, not in my own logic, but without discovering the combinations of settings for all of them that would produce implementable and correctly operating logic, my own efforts would go nowhere.

The clocking scheme, meaning the clock MMCM module and the way it was organized, was a key part of driving the memory interface at the right clock rates. This included as well a change to the constraints file to override an error that would otherwise block implementation from completing; that change was discovered from Xilinx tech notes after exhaustive Google searches.

The memory interface too required its particular set of parameters. Since the number of parameters for the memory interface is large, in addition to the substantial number of clocking alternatives, I was not going to get anywhere with a random walk of changes. Sadly, there was no clear example in VHDL to follow either. Boo to both Xilinx and to Digilent for laziness. 

The change also converted the memory interface from the 2:1 mode I originally used to the 4:1 mode, which necessitated changing a bit of my own logic. When reading DDR3 RAM, the memory outputs eight bursts of 16 bit words for a given read request. Correspondingly a write will have to send eight 16 bit words to RAM for 8 contiguous word addresses. 

In the 2:1 mode, the generated RAM clock from the memory interface to my logic operates at 1/2 the rate of the memory. I therefore will be presented with half of the memory output in one of those cycles - 64 bits or four words, the other half of the eight bursts comes in a second cycle. My logic had to write the first half, bump the memory address by four words, then write the second half of the full burst across two cycles.

In 4:1 mode, all eight words are delivered or sent in one cycle, because the RAM clock operates at 1/4 the speed of the internal DDR3 chip operation. All eight bursts are accomplished with that one cycle from my logic. This was actually a simplification, as I had to only address the bottom address of eight words and send all 128 bits at once to write; receiving gave me all eight words at once. 
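The difference between the two modes comes down to how many 16-bit words cross the interface per cycle of my logic clock. Here is a sketch of that bookkeeping, with hypothetical function names; placing word 0 in the low 16 bits of the bus is an assumption about ordering, not something taken from the memory interface documentation.

```python
def pack_words(words):
    """Pack 16-bit words into one integer, word 0 in the low bits (assumed order)."""
    value = 0
    for i, w in enumerate(words):
        value |= (w & 0xFFFF) << (16 * i)
    return value


def interface_transfers(burst, ratio):
    """Split one eight-word burst into per-cycle transfers for 2:1 or 4:1 mode.

    Returns a list of (word_address_offset, width_in_bits, packed_value).
    """
    if len(burst) != 8:
        raise ValueError("a burst is eight 16-bit words")
    if ratio not in (2, 4):
        raise ValueError("ratio must be 2 or 4")
    # Double data rate: two words per memory clock, `ratio` memory clocks
    # per cycle of the logic-side clock.
    words_per_cycle = 2 * ratio
    transfers = []
    for offset in range(0, 8, words_per_cycle):
        chunk = burst[offset:offset + words_per_cycle]
        transfers.append((offset, 16 * len(chunk), pack_words(chunk)))
    return transfers


burst = list(range(1, 9))
two_to_one = interface_transfers(burst, 2)    # two 64-bit halves, offsets 0 and 4
four_to_one = interface_transfers(burst, 4)   # one 128-bit transfer at offset 0
```

The 2:1 result shows the address bump of four words between the halves, while the 4:1 result needs only the bottom address of the eight.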

If I haven't missed anything else, I should be able to simplify my ram access logic in my real project, convert the clock and memory interface IP to the magic parameters, then move ahead debugging with working RAM assured. 

Wednesday, October 19, 2022

Battle on two fronts but still bashing away at memory controller on FPGA board


In addition to the long term war trying to sort out how to get the memory interface system of the Vivado toolchain to successfully drive the DDR3 RAM on the Digilent Arty S7 board, I am fending off Covid-19 viruses.

Eleven days ago I got my latest booster shot plus the yearly influenza vaccine, but I didn't get it early enough. Monday both my wife and I began to feel upper respiratory congestion, which at first we ascribed to seasonal allergies, but by the evening it was clearly more than allergies. That had been our first thought since we both had the symptoms essentially simultaneously, which is typical of allergies.

Because the antiviral Tamiflu must be taken early in the illness to be effective, we decided to go to a clinic on Tuesday morning to get testing - expecting it was either a cold or the flu. The test swabs were jointly tested for flu and Covid. We were floored to learn that we both tested positive for Covid. Evidently we were exposed at the same time, perhaps at a doctor's visit she had the week prior for a checkup on her eye surgery. No way of knowing, of course.

There are antivirals for Covid, akin to Tamiflu. We were prescribed one and began taking it immediately. Hopefully we started it rather quickly in the course of the illness and it, plus the partially activated vaccinations we recently had, will lessen the severity and shorten the duration. 

This will keep me out of the shop (and cooped up in quarantine) for a few more days before I can venture out with masks on. 


I stripped down my logic to a bare minimum that will simply write and read from the memory interface based on pushbuttons on the FPGA board. It will light up to show me that the writes and reads were started and use color LEDs to tell me if the returned value matches what was written. 

Saturday, October 15, 2022

Wading through error messages trying to resolve memory interface issue


[Place 30-172] Sub-optimal placement for a clock-capable IO pin and PLL pair. If this sub optimal condition is acceptable for this design, you may use the CLOCK_DEDICATED_ROUTE constraint in the .xdc file to demote this message to a WARNING. However, the use of this override is highly discouraged. These examples can be used directly in the .xdc file to override this clock rule.

< set_property CLOCK_DEDICATED_ROUTE BACKBONE [get_nets ddr3clock/inst/clk_in1_clk_wiz_ddr] >

ddr3clock/inst/clkin1_ibufg (IBUF.O) is provisionally placed by clockplacer on IOB_X1Y26

mymemory/u_cartmemory_mig/u_ddr3_infrastructure/plle2_i (PLLE2_ADV.CLKIN1) is locked to PLLE2_ADV_X1Y0

ddr3clock/inst/plle2_adv_inst (PLLE2_ADV.CLKIN1) is provisionally placed by clockplacer on PLLE2_ADV_X0Y0

The above error could possibly be related to other connected instances. Following is a list of all the related clock rules and their respective instances.

Clock Rule: rule_pll_bufg

Status: PASS 

Rule Description: A PLL driving a BUFG must be placed on the same half side (top/bottom) of the device

ddr3clock/inst/plle2_adv_inst (PLLE2_ADV.CLKFBOUT) is provisionally placed by clockplacer on PLLE2_ADV_X0Y0

ddr3clock/inst/clkf_buf (BUFG.I) is provisionally placed by clockplacer on BUFGCTRL_X0Y5

Clock Rule: rule_pll_bufhce

Status: PASS 

Rule Description: A PLL driving a BUFH must both be in the same horizontal row (clockregion-wise)

mymemory/u_cartmemory_mig/u_ddr3_infrastructure/plle2_i (PLLE2_ADV.CLKOUT3) is locked to PLLE2_ADV_X1Y0

mymemory/u_cartmemory_mig/u_ddr3_infrastructure/u_bufh_pll_clk3 (BUFH.I) is provisionally placed by clockplacer on BUFHCE_X1Y7

Clock Rule: rule_bufh_bufr_ramb

Status: PASS 

Rule Description: Reginal buffers in the same clock region must drive a total number of brams less than the capacity of the region

mymemory/u_cartmemory_mig/u_ddr3_infrastructure/u_bufh_pll_clk3 (BUFH.O) is provisionally placed by clockplacer on BUFHCE_X1Y7

Clock Rule: rule_bufhce_mmcm

Status: PASS 

Rule Description: A BUFH driving an MMCM must both be in the same clock region

mymemory/u_cartmemory_mig/u_ddr3_infrastructure/u_bufh_pll_clk3 (BUFH.O) is provisionally placed by clockplacer on BUFHCE_X1Y7

mymemory/u_cartmemory_mig/u_ddr3_infrastructure/gen_mmcm.mmcm_i (MMCME2_ADV.CLKIN1) is locked to MMCME2_ADV_X1Y0

Clock Rule: rule_mmcm_bufg

Status: PASS 

Rule Description: An MMCM driving a BUFG must be placed on the same half side (top/bottom) of the device

mymemory/u_cartmemory_mig/u_ddr3_infrastructure/gen_mmcm.mmcm_i (MMCME2_ADV.CLKFBOUT) is locked to MMCME2_ADV_X1Y0

and mymemory/u_cartmemory_mig/u_ddr3_infrastructure/u_bufg_clkdiv0 (BUFG.I) is provisionally placed by clockplacer on BUFGCTRL_X0Y0

Nowhere, and I mean absolutely nowhere, does the IP for the memory interface give any spot where I can place these elements - the toolchain is doing this and then throwing a fit about its placements not following the rules. Tonight I will drink heavily, tomorrow I will try to dig into the lowest level internal details of the FPGA chip to understand what MMCME2_ADV_X1Y0 and BUFGCTRL_X0Y0 and BUFHCE_X1Y7 are. 

FPGA issues are in the clocking setup for the DDR3 RAM memory interface - once again not in my logic


I continue to monitor the signals I am presenting to the memory interface, a bit of intellectual property provided with Vivado that manages the DDR3 memory device on the Digilent Arty S7 FPGA board I am using. This memory interface is provided two clocks and produces a third one which is the driver for most of my logic. 

The memory interface has to perform some calibration of the DDR3 memory which takes so long that it can't be simulated, thus I created a mock memory interface module to link in when simulating. My mock interface behaves according to the documentation, but only as well as I comprehend the specs. 

Everything works perfectly under simulation, even when I do it with post-implementation nets, but I just am not getting the memory to operate properly in real life. I have battled the internal logic analyzer capability until I could watch directly and everything I am producing matches the signal timing diagrams from the documentation but the results from the memory interface don't make sense.


I now suspect that the memory interface is not set up correctly. Various web searches have flagged bulletins and notes from others pointing to issues with the clock setup. It is a deep rabbit hole to dive into, far down into the gritty details of clock resources, signal routing on the FPGA chip, and the clear-as-mud documentation for the memory interface IP. 

Digilent provided a sample set of files for the memory interface to use with the Arty S7 board. I have located the actual implementation control files (.prj and .ucf) that were produced in Vivado when I generated my memory interface. Comparing the two flags differences and of course those are in the clock parameters. 


Here is the helpful high level diagram for what is needed to clock the memory interface.

High level clocking

Next I have to study the very long list of rules, the first of which are shown below:

Some of the rules for the memory interface setup

A number of the rules are for the choice of where to wire the DDR3 lines to the FPGA - but those decisions were made by Digilent when they built the Arty board. Other choices, such as clock frequencies, will be constrained as I only have two real clocks for Arty - 100MHz and 12MHz - thus I would need clock logic added to convert those to the frequencies necessary for the memory interface.


Once I know what is wrong, I can bash along until I am able to get the proper setup configured and the FPGA implementation to match. At that point I can resume testing to see if things work any better. 

Thursday, October 13, 2022

Spent time in the shop working on 1053 typewriter and Virtual 2315 build


The main shaft in the Selectric 1 mechanism holds the rear of the carrier, which slides left and right along the shaft to the various columns of the page. The shaft is keyed and rotates to energize the typeball mechanism rotation, tilt and strike onto the ribbon. 

The shaft on this 1053 had corrosion and pitting along its length, inhibiting the free movement of the carrier for spacing, backspacing, tabulation and carrier return operations. I had been concerned that the pitting might be so deep that I would need a replacement shaft. Fortunately, sanding the shaft smoothed it out enough for free sliding, particularly once I grease the shaft. The remaining pits are small and don't have raised edges.

The shaft is installed in the typewriter, allowing me to move on to adjusting and repairing the portion of the carrier that moves one column for spacing and backspacing as well as allowing free movement during tabulation until it reaches a set tab position. 


I prepared the connector and wire harness to connect to the Arduino inside the project box. A ribbon cable will plug into this connector and carry the signals over to the connector on my interface board. Another cable runs from the interface board to an adapter with small wire-wrap lines that I can connect to the appropriate signal points of the IBM 1130 internal disk drive electronics backplane. 

Ahead I will connect the wire harness to the Arduino connector block for each relevant signal. The interface board will need to be mounted in place and the adapter for the wire-wrap connections must be secured before I can accomplish the actual connection to the backplane pins. 

Monday, October 3, 2022

Many hours spent fighting with debug cores


I read as much as I could about the integrated logic analyzers and carefully went over the process in the hopes of actually resuming debugging of my own logic instead of the arcana of Vivado. I removed all previous debug probes, generated a version without the logic analyzers, then started over.

I selected signals I thought I should monitor during execution, adding in the input and output from the SPI link itself just to capture the correctness of the raw data transfer. This resulted in three debug cores, one for each of the three clock domains where I was watching signals. The main FPGA logic clock was one, the DDR3 memory interface clock was another, and the SPI link SCLK clock was the third.

I got clean implementation and bit files, but every time I tried to program the FPGA board I received an error and had the three debug cores deleted from the programming. The message mentioned configuration options to check or the need to have free running clocks for the debug core.

That should have been my hint, if I thought about it, because the third debug core was tied to the SCLK clock incoming from the Arduino and therefore was not steadily running at the time of programming. There may be a way to set up a debug core to use an external clock that is not continuous, but I didn't bother to struggle through the mountains of documents to discover it. 

After I deleted the one signal which is connected to that clock domain, I was able to regenerate the bitstream, now with just two debug cores. Now on to testing.

Saturday, October 1, 2022

Finally back into shop for testing; fighting with the internal logic analyzer process


I have been away from the shop for an extended period due to a succession of events which needed my full attention. First, my wife attended her 52nd high school reunion and we visited friends and my daughter. 

We had no sooner arrived back in Florida when an old friend of ours stopped by to visit. He is considering buying a large boat to live on, anchoring it in this area. We jointly investigated marinas and various boats until he left.

Of course, by that time Hurricane Ian had formed in the Atlantic and was headed our way. I scrambled to seal up outer fixtures and vents on my home, lay in supplies for a potential long duration without power or water and then raise everything in my shop so that any minor water running on the floor wouldn't damage anything.

I am relieved to say that both the shop and my home survived the passage of the hurricane's center only 10 miles north of us without any damage. Both the shop and my home have impact windows that should stand up to impacts from up to category 5 winds. Ian was nearly Cat 5 as it arrived on the gulf coast of Florida but had weakened to barely Category 1 after crossing the state to reach me on the Atlantic coast. 


I have just begun to test again, attempting to use the internal logic analyzer functionality of the Xilinx chip to debug the SPI link transactions reading and writing data to the FPGA board RAM from the Arduino. Spent time banging my head against the table figuratively as I attempted to get the logic I created, the logic analyzer and the memory configuration file for the onboard flash memory to be consistent. 

What would happen is that I would see the activity going on but the logic analyzer insisted none of the signals were varying. I believe that I had them out of sync, thus the analyzer was looking at FPGA lookup tables or other resources which were not used in the latest implementation, given that the software makes those assignments fairly dynamically on each run of the toolchain.