Saturday, January 25, 2025

Built newest version of 5806223 SLT card for 1627 plotter support on IBM 1130

BUILT NEWEST SLT CARD WITH INTEGRATED SLT CONNECTOR

First I used my hot air rework station to remove all the components from the earlier PCB, including the gold spring contacts. Since that card had too narrow a spacing between the two connectors, I had to build another version of the PCB. In addition, the spring contacts were so wide that they were hard to line up properly when soldering to avoid shorts. I found and bought some new, narrower contacts and redid the PCB. 

I soldered all the components onto the new version of the card with its correctly spaced integrated connectors. All the spring contacts are in place on the card so that it can connect to the pins in an SLT card socket, either on my bench tester or, more importantly, in an IBM 1130 card compartment. The solder pads helped pull the contacts into good position on the edge of the card, due to surface tension while the solder was molten. 

I installed housings on the card to cover the contacts and to guide the card into the SLT card slots. The card now fits well into the bench testing socket. I conducted some tests to verify that the signals were delivered reliably. 

QUICK BENCH TEST TO VERIFY NO SPURIOUS IL3 TRIGGERING

I leveraged the test setup I had in place from testing the prior card versions. The key here is to see that IL3 is not triggered by a power on reset, only when an XIO Write is issued with at least one of the motion command bits set. 

CARD NOT LATCHING INTO SOCKET (SLOT)

IBM SLT cards lock into position in the slot due to some fine features of both the board (backplane) pins and the card contacts. I was hoping that the pressure of the spring contacts alone would give a reasonable anchor for the card but it slides up and out of the slot when released. 

The pin on the board has a notch or hook at the end, looking a bit like a crochet hook. The contacts on the SLT card have a bend that will fit into the notch to hold the card in place. 

Notice that the IBM contacts are bent at the rear and have their active contact point towards the front of the card where it enters the board socket. Thus the pin with its hook slides over the end of the contact and resists its removal. This produces a nice 'click' as the card is inserted and requires a bit of force to begin the extraction of the card. 

The replacement contacts I use don't have the same shape, which was designed to interface with the hook on the end of the board pins in the socket. My contacts and the pin both deflect a bit as I push the card into the socket, but they never reach a position where they are held down; instead the deflection acts to push the card back out of the slot. It is not a strong force, but even normal vibration would cause the card to walk out of the socket. I will have to figure out an anchoring method if I am to use these alternative boards and contacts. 

FOUND CONNECTOR FOR THE REAR OF THE IBM 1627/CALCOMP 565

I located a Cannon connector that will mate with the back of the plotter. I can manufacture a cable that connects this connector to a smaller 8 pin connector that will plug into the rear of the 1130 gate A, with a matching set of connectors I purchased for that purpose. Three pins on the Cannon connector are powered by AC from an SMS power card; the rest are signal lines going to my new 8 pin connectors.

The bill of materials for the replacement 5806223 card is around $85 not counting tax and shipping for the components. The cost for the new cable with its connectors will be around $50. When I solve the anchoring problem, hopefully without having to grab the IBM contacts from donor SLT cards, I can proceed to test this new card in the 1130. 


Kick off simulation at correct point to observe one sector being read and archived

PREVIOUSLY DOCUMENTED WALL CLOCK TIME WAITING FOR READ TO START

Part of the cause is the timescale for the disk drive itself, which must be modeled accurately in order to verify the correct behavior of my design. One rotation of the disk takes 40 milliseconds, with a sector pulse occurring once per 5 milliseconds and the sector which will be read spread over 10 milliseconds. With an approximately 500,000 to 1 ratio for simulation, this means that the sector itself runs for 1 hour and 24 minutes on my laptop.

My logic begins by reading sector 0, thus the time spent spinning the disk until sector 0 comes around again might be as much as five hours and thirty three minutes of wall clock time, if the simulated disk rotation is just past the start of sector 0 when my logic starts looking. 

In addition to all of this, there is a delay of about 120 microseconds for the RAM controller to initialize and calibrate (at its "fast" setting to decrease simulation delay), which means one minute of wall clock time before my logic comes out of reset and begins looking. In comparison to the other delays this isn't very significant; however, it means that if the simulated disk sector signals began with sector 0 at the start of simulation, the disk would be 1.2% through the sector by the time my logic starts up. 

The logic for identifying the sector under the read/write head resets the upcoming sector count to zero when the index pulse occurs, once per 40 milliseconds, then advances the upcoming sector count on every second sector pulse (there are two sector pulses per 10 ms sector). A sector begins with the first sector pulse after the index pulse and on every second sector pulse afterwards. 
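
In rough outline, with simplified signal names and the pulses assumed already synchronized to my clock as one-cycle strobes, the counting scheme looks like this (a sketch, not the exact code from my design):

    module sector_counter (
       input            clk,            // design clock
       input            index_pulse,    // one-cycle strobe, once per 40 ms
       input            sector_pulse,   // one-cycle strobe, once per 5 ms
       output reg [1:0] upcoming_sector // four sectors per rotation
    );
       reg second_pulse;                // two sector pulses arrive per sector

       initial begin
          upcoming_sector = 2'd0;       // starts at zero by design
          second_pulse    = 1'b0;
       end

       always @(posedge clk) begin
          if (index_pulse) begin
             upcoming_sector <= 2'd0;   // index marks the start of sector 0
             second_pulse    <= 1'b0;
          end else if (sector_pulse) begin
             if (second_pulse)
                upcoming_sector <= upcoming_sector + 2'd1;
             second_pulse <= ~second_pulse;  // advance on every second pulse
          end
       end
    endmodule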

On the actual disk controller logic, it requires one full rotation of the disk before the sector counter becomes accurate, having seen the index pulse. My logic starts with the upcoming sector count set to zero by design, but that will give a wrong sector number if the index pulse I generate in the testbench is not synchronized to happen with the proper sector pulse.

METHODS TO SHORTEN DELAY BEFORE WE START READING A SECTOR

One method of shortening the delay would be to start up with faster sector and index pulses, stepping the upcoming sector count around so that it was back to 0 before my logic starts watching for the next proper speed sector pulse. I worked out a set of pulses at 100x speed, thus shortening the wall clock time for a disk rotation to three minutes and 20 seconds. That would allow a reasonable wait for the disk to come around to where sector 0 starts while my logic is watching to begin the read.

An even better method is to set the sector and index pulse generators so that the index pulse is assumed to have happened before the RAM controller finishes initialization, with the sector pulses beginning shortly thereafter. If I simulate the pushbutton to start the archiving just before that sector pulse, the logic will see sector 0 arriving and start the read with minimal delay. The sector and index pulses occur at normal speeds, with the first index pulse at around 40,720 us into the simulation. 
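
In the testbench, assuming the same 1 picosecond timescale and using a repeat loop to keep each delay within 32-bit range (see the January 22 entry below), the index generator amounts to something like this sketch with illustrative names:

    initial begin
       BUS_INDEX_DRIVE_L <= 1'b1;    // inactive (inverted logic)
       repeat (40720) #1000000;      // wait ~40,720 us in 1 us steps
       BUS_INDEX_DRIVE_L <= 1'b0;    // index pulse begins
       #165000000;                   // pulse width, assumed 165 us like
       BUS_INDEX_DRIVE_L <= 1'b1;    //   the sector pulses
    end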

COMPLICATIONS ON FIRST FEW TRIES

I had to get the logic to trigger on the first sector pulse, but the design starts in a state that expects a particular sequence of changes before it begins the read operation. First I drove a very short index pulse to set the upcoming sector number to 0. However, it still didn't start.

Next I looked into the archiver module logic and saw that I have to detect a change of sector number, where the upcoming sector number matches the target sector for our read, before we begin the read operation. The starting state of my logic has the prior sector number set to 0, which is also the starting state for the target sector, thus no new sector signal. 

After one complete rotation of the disk platter, the prior sector number would have been 3 and the change to 0 would trigger my code. This would work in actual practice, allowing the logic to truly sync with the sector coming under the read/write head, having seen an index pulse and a chain of eight sector pulses. In simulation that process would cost more than five hours of wall clock time, a penalty I don't want to incur.

I modified the code temporarily to set the prior sector number at startup to 3, thus triggering the new sector state and the start of the read as soon as the first sector pulse completes. That was my intent: to watch how a sector read behaves. 
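
The temporary change is essentially a one-liner (again with simplified names):

    // Simulation-only tweak: start the remembered sector at 3 so the first
    // pulse for sector 0 registers as a change of sector.
    reg [1:0] prior_sector = 2'd3;   // normally initialized to 2'd0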

GOOD START TIME FOR SECTOR READ ALLOWING ME TO STUDY LOGIC BEHAVIOR

I watched an entire sector read take place, expecting it to take about an hour and a half of wall clock time. Apparently the simulation ratio depends on how rapidly signals change and on the number of signals being monitored, as this instead took more than three hours to complete.

The simulation run proved the correct operation of all the logic involved in writing the detected words from the read head into the DRAM. It let me examine the addresses used with the DRAM to be certain the data was stored where I expected, watch the data flowing into the DDR3 chip signal lines, and verify the behavior of all the other signals during this time. 

When I was satisfied that the first sector was read properly, I temporarily modified the archiver to begin by requesting sector 1 instead of zero. That required me to wait a few hours to see the read begin at the start of the second sector, but it was a one time test I could do near the end of my simulation testing campaign. 

Wednesday, January 22, 2025

A few hours invested to get the disk simulated signals working on testbench

BIZARRE BEHAVIOR OF THE VIVADO SIMULATOR REQUIRED EXPERIMENTATION

In Verilog, you specify delays between signal changes using #nnnnn, where nnnnn is how many units of time to delay before taking the next action. The sector pulses occur 5 milliseconds apart and last for a 165 microsecond duration, as an inverted logic signal. The repeating process in the testbench to generate this was written as:

    forever begin
       BUS_SECTOR_DRIVE_L <= 1'b0;   // sector pulse active (inverted logic)
       #165000000                    // 165 us pulse width, in 1 ps units
       BUS_SECTOR_DRIVE_L <= 1'b1;   // pulse over
       #4835000000                   // intended 4.835 ms to fill the 5 ms period
       BUS_SECTOR_DRIVE_L <= 1'b0;
    end

What I was finding was 165 us for the 0 portion but only about 500 us for the 1 portion, far from what I intended. It acted as if it simply lopped off the 483 from the beginning of the delay value. This is consistent with the delay literal being truncated to 32 bits: 4,835,000,000 exceeds the 32-bit maximum of 4,294,967,295, and the wrapped-around remainder of 540,032,704 ps is about 540 us, close to what I measured. 

I resolved this by repeating a smaller delay value and the same signal value to add up to the desired total delay:

    forever begin
       BUS_SECTOR_DRIVE_L <= 1'b0;
       #165000000
       BUS_SECTOR_DRIVE_L <= 1'b1;
       #500000000
       BUS_SECTOR_DRIVE_L <= 1'b1;
       #500000000
       BUS_SECTOR_DRIVE_L <= 1'b1;
       #500000000
         . . .

This worked properly. I would expect that if a value is too large for the simulator to handle, it would issue some kind of error message to warn me, rather than failing silently in a way that produces what appears to be a random value. 
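
A tidier version of the same workaround, assuming the 1 picosecond timescale, keeps each individual delay within 32-bit range and lets a repeat loop accumulate the total:

    forever begin
       BUS_SECTOR_DRIVE_L <= 1'b0;
       #165000000;                 // 165 us low pulse
       BUS_SECTOR_DRIVE_L <= 1'b1;
       repeat (967) #5000000;      // 967 x 5 us = 4.835 ms high
    end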

EXTENSIVE WAIT TIMES FOR SIMULATION TO MODEL ONE ROTATION OF THE DISK

The drive spins at 1500 RPM, taking 40 milliseconds to turn once. With the roughly 500,000 to 1 ratio of wall clock time to simulated time, one rotation would chew away on my laptop for five and a half hours. Just a single sector, the minimum needed to watch the logic archive content to RAM, requires almost an hour and a half. 

Each time I see one mistake or make one refinement, the cycle time is about two hours. Very tedious. 

Sunday, January 19, 2025

Slow debugging of entire Diablo Archiver under simulation

SLOW TEST CYCLE DUE TO SERIAL LINK SPEED

The full design for the archiver writes out on the serial USB link at 115,200 baud, which imposes additional delays waiting for the simulation to reach some logic point in an upload. Each character emitted takes approximately 87 us (ten bit times per character at 115,200 baud). 

Thus, the initial header message (21 characters) ties up 1.83 milliseconds during which the uploader state machine waits before it reads the first word of the first sector. The trailer message for the sector doesn't complete until after 168 ms, at which point we move to the next sector. 

Completing one cylinder involves eight sectors. We need to wait through four of them to see that the head and sector advance properly, and wait through another four to see that cylinder, head and sector are all set correctly. That takes over 1.347 seconds of simulated time to complete. 

Since it requires a couple of minutes to start a simulation and then time passes at 2 us per real second, 500,000 times slower, a complete cylinder of upload would take 7.8 days to complete. The full cartridge in simulation would chew up 4.34 years of actual time. 

I can afford to wait to see a couple of words from the first sector be uploaded, but even waiting for a single sector to finish would require almost an entire day of wall clock time. Luckily I had simulated just the uploader module previously which ran fast enough to test key points where sector, head or cylinder numbers are incremented and the proper end of the function when we wrapped up the 203rd cylinder. 

This is because the effort to simulate the entire memory interface, FIFO, clock and my user logic involves much more computation than a single module of the design such as the uploader. Still, I would rather simulate as much as I can before I am trying to debug with the actual disk running, spotting errors in real time. I don't want to subject the heads or any cartridge to excess time at risk; I want to fly the heads on each pack for the minimum necessary to archive the contents, then shut down the drive. 

UPLOADER DOUBLE CHECKED AS FAR AS BEGINNING TWO WORDS OF SECTOR 0

I could tolerate the time necessary to start up and watch the design read from RAM and upload words 0 and 1 of sector 0, head 0, cylinder 0. This matched the experience of the partial simulation I did yesterday and reassured me. However, I did find a race hazard in the logic during the simulation, which I corrected before signing off on the uploader. 

Ironically, the fix slightly elongates the real time needed for simulating the uploader, with each character on the serial link taking 91 us versus 87. 

SETTING UP TO SIMULATE ARCHIVING

I have a test bench that simulates the disk drive operation so that I can test out the archiver logic. While the serial link speed is not a factor here, the physical disk drive timing does impose its own painfully slow simulation rate. 

One spin of the disk platter to get to sector 0 will take 40 ms of simulated time, or 5.6 hours of wall clock time before we read a sector. The sector itself takes 10ms to pass under the head, or 1.39 hours of wall clock to complete. 

If I set up the test bench to position the simulated disk platter at the ideal starting position, I have the chance to see a sector read and written to RAM in under 1.5 hours of real time. Still quite painful but feasible. An entire cartridge being archived will take well over a year on the simulator, which I will not attempt.  

Thursday, January 16, 2025

Uploader function of Diablo Archiver fully debugged

FINISHED CAREFUL SCRUTINY AND CLEANUP OF UPLOAD MODULE

I made extensive runs with the integrated logic analyzer (ILA), ensuring that the logic all behaved as I wanted. Based on the results, I made a few optimizations and changes to the design, until it looked as good as I wished. 

I verified that it was correctly stepping through all 321 words of the sector, fetching the dummy 322nd word which provides feedback on any sync or data word ECC errors encountered, and that it advanced properly through every sector, head, and cylinder to the end of the cartridge. 

Using a log from the PuTTY terminal program, I captured the output and counted characters as well as lines to cross check that I received all the sectors and status lines. 

POSSIBLE FUTURE VERIFICATION

One step I could take to further verify that all the addressing and RAM operations are correct would be to populate RAM with known contents prior to the start of the upload, with each word having its cylinder, head, sector and word number as its data value. That way I could examine the output captured by PuTTY and ensure that we are storing and reading RAM correctly. 

Wednesday, January 15, 2025

Working with integrated logic analyzer to debug Diablo Archiver; found the issue blocking memory controller

MINOR BLIP WITH TOOLCHAIN AGAIN

I had generated an integrated logic analyzer (ILA) with 32 signals to watch. After a first run, I assigned a few more of the 32 and regenerated, but whenever I launched the ILA on the hardware, it showed only the original set of signals. This is yet another of the well known misbehaviors of the toolchain. Fortunately there was a way to force the update.

SPOTTED WHY MY MEMORY INTERFACE WAS SHUTTING DOWN

Since the example runs well but my logic does not, even though we use the exact same memory interface and clock wizard settings, I began to look very carefully for any tiny difference. All at once, I spotted the reason my memory was stalling, both in simulation and real life. 

The user can ask the memory interface to accomplish three special tasks - refresh, ZQ calibration and self-refresh requests. The designer can ask the memory interface to do a calibration if they believe that errors are occurring which could be minimized by another round of calibration. The designer can also control the memory refresh timing. 

Normally the memory interface will generate refresh cycles to the DRAM chips to keep the cells charged, accessing each row regularly enough that the capacitors that form the cells don't discharge. A designer who wants complete control over this can turn off the automatic refresh and instead call on the memory interface to do a refresh at specific times. This can position the refresh activity at a time when it won't delay time critical read or write requests. 

These are not normally used by designers, thus I thought I had those requests turned off. However, I thought these were active low requests, where a 0 value would request them. As such, I instantiated the memory interface with constants of binary 1 for these three request lines. I was wrong. Arrrgh. They needed to be set to binary 0. 
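
If I recall the MIG 7-series user interface port names correctly, the corrected piece of the instantiation is simply the following, with the three request inputs tied inactive:

    .app_ref_req (1'b0),   // no manual refresh requests
    .app_zq_req  (1'b0),   // no manual ZQ calibration requests
    .app_sr_req  (1'b0),   // no manual self-refresh requests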

I reran a simulation now that the requests are set to 0, and the problem with the memory interface disappeared. I can now debug my logic that calls the memory interface, since it can be trusted to provide memory access. 

WHY UPLOADER NO LONGER RUNS TO COMPLETION

The logic for uploading the disk cartridge contents from RAM writes out the first header, for cylinder 0, head 0, sector 0, then tries to read word 0 of the sector. Nothing is printed and the word address sits at 1 after that point. It was apparently working to completion of the cartridge when it was running at 921,600 baud, but after my conversion of the output to 115,200 baud, it stopped. 

I moved the ILA from the dram controller module to the uploader module, which is where the failure appears to occur. After wiring in all the relevant signals, I generated the bitstream and started up the board. The problem was indeed relative timing. My dram controller module raises the signal dataready, but only for one cycle. If my state machine in the uploader is still busy pumping out the header message, by the time it looks for the dataready flag, it has gone back to 0. 

The solution was to interlock the request for data from dram controller. The caller, uploader module, will keep echoword high until it sees dataready come back. The dram controller won't drop dataready until it sees the echoword request turn off. 
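
In sketch form, the uploader side of that handshake looks like this, with simplified names and word width; the dram controller, for its part, holds dataready high until it sees echoword drop:

    module uploader_handshake (
       input             clk,
       input             dataready,   // from the dram controller
       input      [15:0] dram_word,   // word returned from RAM
       output reg        echoword,    // request to the dram controller
       output reg [15:0] word         // captured word for the serial link
    );
       localparam REQUEST = 2'd0, CAPTURE = 2'd1, RELEASE = 2'd2, DONE = 2'd3;
       reg [1:0] state;

       initial begin
          echoword = 1'b0;
          state    = REQUEST;
       end

       always @(posedge clk) begin
          case (state)
             REQUEST: begin
                echoword <= 1'b1;                 // ask for a word
                if (dataready) state <= CAPTURE;  // wait as long as needed
             end
             CAPTURE: begin
                word     <= dram_word;            // take the returned word
                echoword <= 1'b0;                 // drop the request
                state    <= RELEASE;
             end
             RELEASE: if (!dataready) state <= DONE; // controller acknowledged
             DONE: ;   // proceed with the rest of the upload
          endcase
       end
    endmodule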

I have a bit of cleanup and optimization to do in the uploader module now that it is basically working. After this, I imagine I will need a working Diablo drive to debug further. 

Tuesday, January 14, 2025

Working with real hardware to test DRAM controller operation

DUE TO SIMULATION ACCURACY DOUBTS, WORKING WITH REAL HARDWARE

I decided to avoid simulation for a while and concentrate on determining whether the memory access works on the actual Arty A7 board. This is cumbersome due to the time it takes to generate and load the FPGA bitstream, as well as the contortions required to bring signals from interior modules up to the top level for observation. 

DEBUGGING THE UPLOADER WHICH APPEARS TO STALL TRYING TO READ FIRST SECTOR

The Uploader function is transmitting properly at 115,200 baud now, but it stalls after writing the first header message. I routed a couple of signals to the LEDs to see which step of the state machine it has stopped in. Since the uploader appeared to work properly when it was running with 921,600 baud serial communications, it is possible that something related to the logic writing over the link is the cause, but it equally could be failure of the DRAM controller due to the same issue I am seeing in simulation.

My first run showed that the issue is with the DRAM controller, as the uploader function raised the echoword signal to request a word from RAM but we are stalled waiting for the confirmation. This is consistent with the simulation, showing the memory interface shutting down about 900 us after it initializes. 

I made a few tweaks and tried again, this time displaying LEDs to show if it was the first or second read that had the stall. In the simulation, one read is reported back as complete and the second never gets the data valid signal from the memory interface. This run showed me that we didn't get past the first request. 

GOING BACK TO RUNNING THE EXAMPLE PRODUCED BY THE IP GENERATOR

When I first set up the memory interface, it produced an example that ran a traffic generator into and out of the memory, validating its behavior. I installed that on the board and ran it, believing that it worked properly because of the visual indications. The example used the four LEDs to show: 

  • the clock generator locked onto the right frequency
  • the memory interface completed initial calibration
  • a flash per second as a cue that it was operating
  • a light to indicate that data didn't match when read back

I saw the first two light, the third blink steadily and the fourth stay dark, which I took as evidence that it was performing its writes and reads with no errors. HOWEVER, now that I had seen the memory interface shutting down while leaving init_calibration_complete up, I had my doubts. 

I installed an integrated logic analyzer into the example program and ran it again. It would show me the status of various signals between the traffic generator and the memory interface, which should be actively changing if it is indeed operational. 

The ILA captures showed it alternating reads and writes without any errors - it always came back with the data that had been written. Each time I took a snapshot with the integrated logic analyzer, I saw a different pattern of healthy activity from the memory. 

Since this was generated with the memory interface details and clock speeds that I chose for my project, it satisfied me that I should have a usable memory. Now to shift back to my own design, but with an ILA included so that I can debug better. 

Continuing to test logic of the Diablo 2315 Archiver - RAM initializing but still battling with Vivado toolchain

INSTANTIATED IN HARDWARE IN ORDER TO SEE CALIBRATION LED LIGHT

As evidence that I have everything working on the DDR3 memory, even though I haven't simulated successfully, I instantiated my entire project and loaded it onto the Arty A7 board. It is set up to light one of the four LEDs when the init_calibration_done signal is asserted, which will validate that all the clocking and reset and other basics are correct. 

The light went on! I have working RAM now. The logic won't move on from that initialization because the design depends on reasonable inputs from an actual Diablo disk drive, which is not yet connected, but I did gain a bit of confidence from this simple test. When I set the switch to the upload direction and pushed the button, I saw the USB activity LED flashing. 

I connected to it via a terminal program to see what comes out when I push the button. The link is almost reliable, but every once in a while the output looks a bit wonky. I will drop the data rate down from the 921,600 baud of the module I borrowed to 460,800, 230,400 or 115,200 baud, which increases the upload time to as much as 15 1/3 minutes at the slowest speed. 

CHANGED TIMING VALUES FOR THE UART BUT STILL COMING OUT AT 921,600 BAUD

I changed the VHDL for the modules involved but the data continues to stream out at the same old rate. This ultimately was due to the tendency of Vivado to work with cached versions of things. I was able to get it working at 115,200 which allowed me to move forward to check out the upload results. 

BACK TO SIMULATION TO WORK ON DRAM CONTROLLER LOGIC

Grrrr. The FIFO simulations are completely absurd. I have two FIFOs, one with a data count on the write clock side and one with a data count on the read clock side. The write side count goes up when I have only pulsed the other FIFO and not this one. The second time I pulse the r-FIFO, the read count does go up, but the first word was lost with the spurious w-FIFO write count increment. Most absurdly, a single pulse should not make the FIFO jump from 0 to 2 words. 

Instead of two types of FIFOs, one with read count and one with write count, I created a common type that had both types of counts. My logic only looks at the read count from w-FIFO and the write count from r-FIFO, but both are there and under different signal names. 

It still behaves bizarrely. I dug through the documentation for the FIFO and discovered that for the "accurate count" option I selected, if the FIFO is empty when the first word is pushed in, the counts won't be accurate. Other conditions produce inaccurate counts as well. 

Since my FIFOs are fall-through, where the word pushed in becomes visible to the reading side before a read is issued, I have a different way to check. The valid flag should rise when a valid word is present, due to fall-through. That is what I will use to drive my logic. 
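
In sketch form, with illustrative signal names, the read side simply watches valid:

    // First-word-fall-through read sketch: dout is already valid whenever
    // the valid flag is high, so capture the word and pulse rd_en for one
    // cycle to let the next word fall through.
    reg        rd_en = 1'b0;
    reg [15:0] captured;

    always @(posedge rd_clk) begin
       rd_en <= 1'b0;                   // default: no pop this cycle
       if (fifo_valid && !rd_en) begin
          captured <= fifo_dout;        // the word is already on dout
          rd_en    <= 1'b1;             // pop to advance the FIFO
       end
    end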

The simulation of FIFOs built on distributed RAM failed to work properly, but when I switched to the built-in FIFO hardware primitives, everything worked as expected. After the simulation, I had all my read and write signals optimized. 

Synthesizing for real hardware showed me that the damned toolkit insisted on using cached versions of the FIFOs even though I had changed parameters when I created them with the IP tool. I kept changing names to stop this, since no commands would actually force the idiot software to do an actual generation again. Even that was foiled as the toolchain saw the similarity. Eventually I had to delete the cache file manually and restart the toolchain. 

MEMORY INTERFACE SHUTTING DOWN RAM CHIPS DURING SIMULATION

One interesting thing I observed with the earlier simulations was that at some point, the memory interface turned off the RAM chips using signal ddr3_cke, which I believed was a response to something being wrong with the memory configuration. However, when testing the write logic I noticed that if I did multiple writes without a long gap between them, the ddr3_cke line seemed to stay high. I tested with more activity and it still stopped about 900 us after the initial calibration completes. 

To see if this occurs in real hardware, I set up my test on the physical Arty A7 board, wanting to use the ddr3_cke signal to drive one of the LEDs. If it winks out while I am reading data, then there is an issue I have to resolve; otherwise it is just a limitation of the simulation. Unfortunately, this signal comes from an output buffer deep inside the memory interface IP, and thus I can't read or sense it from within my logic. Nothing in the defined user interface shows me whether this signal went low due to some error state. In fact, there is no signal to indicate an error state at all. 

I dug through the smallest nits in the messages during creation of the IP, as well as searching the paucity of examples of user interface access. Most people who use the MIG are using the memory with a soft processor that they instantiate, thus they go with the more cumbersome and complex AXI interface that the MicroBlaze processor requires. I made changes, trying different clocks, as well as switching off the power-saving logic that was designed to throttle down power usage while the RAM wasn't being actively used. 

From the detailed simulation capture the last activity I see before the memory interface shuts down is when it is attempting a refresh of the DRAM. The charge on the capacitors in DRAM decays rapidly thus it must be rewritten in the background to retain memory contents. The memory interface is responsible for this, interleaving actual user access to memory with these refreshes. 

However, it was the handling of the first read request that appears defective. Internally the memory is 2Gb, organized into 16 bit groups accessed by a row and a column address. Further, the interface is designed to grab eight consecutive words in each access (128 bits), thus we should see eight groups read out for an access. I scrutinized the waveform capture of the first read access.

The subsequent read access receives only four groups, not consistent with the "Burst Length 8" design of the interface. One of the signals also shows invalid values, meaning it is either driven by more than one source or not driven at all, not even high impedance (Z). 

Something is screwy with the behavior right from the first access, leading to the interface shutting down. This is a lovely world for logic designers using these toolchains, where you have to wade through miles of debugging of other people's work before you can debug an inch of your own. 

Saturday, January 11, 2025

Finally have the memory controller IP simulating in Vivado - trying to verify my dram controller again

WANDERED TWISTY PASSAGES FOR DOZENS OF HOURS UNTIL SIMULATION WORKED

I could find nothing in the documentation for the memory interface generator IP, nor online, that told me what was needed to get simulation support generated into the logic of the IP. Vivado showed me the simulation modules for the DDR3 chips and the overall simulation top level logic, but as long as the generated IP was built without simulation support it would not work. 

The secret was finally discovered through experimentation. When you use the IP Catalog to generate the IP you want, for example the memory interface logic for DDR3 memory, it does not present any option for simulation. It generates the various simulation modules I mentioned above but has no way to set the IP itself to make use of simulation.

I discovered a Verilog file, one level down from the top Verilog file generated for the MIG IP, which has the parameters for simulation and for fast calibration support. Since these were all produced when I created the IP, I had assumed they would be rebuilt with the same choices, but eventually I decided to manually update that second level file.
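
If I recall the MIG naming correctly, the two parameters in that second level file that need to change from their defaults are:

    parameter SIMULATION          = "TRUE",   // default is "FALSE"
    parameter SIM_BYPASS_INIT_CAL = "FAST"    // default is "OFF"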

Next up was the search for a way to regenerate the IP logic without also recreating the second level file. I eventually discovered that resetting the generated products for the IP and then generating anew would retain my modified second level file. Eureka. The IP logic now simulated, toggling the signal lines to the DDR3 chips and the simulation modules appropriately toggling back responses. 

STARTUP TIME ADDS PAIN TO EACH SIMULATION RUN

The memory controller logic interacts with the DDR3 RAM chips performing a calibration, involving many accesses, until it asserts the Init_Calibration_Done signal and begins outputting the 83 MHz user interface clock that is central to my interface between the memory controller IP and my design logic.

This requires simulation of 123 microseconds even with the "Fast" calibration setting for the IP. In wall clock time, it takes more than two minutes before the simulation reaches that point. The complexity of the memory controller IP and other logic in my design also imposes about a two minute startup delay before the first femtosecond of simulation time. In all, it is more than four minutes from a click until I see my part of the design begin executing. 

Iterative debugging, while I vary inputs to the design to test corner cases as well as its core functioning, imposes that four minute burden on each cycle. 

I do still have an error in the simulation - the toggling of some of the DDR3 lines by the memory controller IP is running 1000x faster than it should. I am not sure whether this impacts the DDR3 simulation module, but I can look at the code to determine whether I need to find the source of this defect and correct it. 

The unexpected good news is that the clock wizard IP was also working properly, not requiring my simulated version, and was correctly producing the 200, 167 and 100 MHz clocks. I don't know why this would have changed but I welcome it. 

CLEANED UP RESET TIMING OF VARIOUS PARTS OF MY DESIGN

The memory interface logic was dropping the user interface reset (ui_reset) well before init_calibration_done was raised. I needed to hold all my logic until the memory was ready for use, so I combined the two signals so that reset for my design lasts while user interface reset is active as well as until calibration is done. 
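
The combination is a single gate, using my signal names:

    // Hold my design in reset while the user interface reset is active
    // or while calibration has not yet completed.
    wire design_reset = ui_reset | ~init_calibration_done;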

ADJUSTED THE PARAMETERS OF THE MEMORY INTERFACE 

I have the memory interface logic operating with a clock of around 325 MHz, the user interface to the memory interface operating at 81.2 MHz (a 4:1 ratio), and my general logic operating at 100 MHz. This produced the proper timing of signals to the DDR3 chips as well as proper behavior when I drove the interface. 

OBSERVING MY LOGIC READING RAM AND WRITING RAM 

The testbench I had set up when I simulated my dram controller module, using my own simulated version of the memory interface, was used to drive this new simulation. It sets up addresses for cylinder, head, sector and word before requesting either to store away a word we extracted from the Diablo drive or to fetch a previously written word from RAM when uploading the archived cartridge at the end of the run. 

The logic simulated well using my homebrew version of the memory interface, but I needed to see it work properly with the actual IP. First up was watching a request for a memory read, which I scrutinized on the simulation output. Second was a write request, which was subjected to the same level of diligence. 

Initially I saw my design completing only one read from RAM. Digging in, I found that the memory interface was dropping the clock enable to the memory chips, disabling them from accepting my subsequent read requests. I never did figure out why the simulation of the memory controller was complaining and shutting down during the first read process. 

Another issue became apparent. My design depends on the far side of a FIFO seeing the empty flag turn off, indicating that data has arrived from the near side. During the simulation, I saw the write enable turn on to push something into the near side but the empty flag never turned off. Upon closer inspection of the FIFO wizard, the empty flag seems to stay on with up to four words pushed in, which makes my design fail. 

To fix this, my logic looks at the count of data in the FIFO instead of the empty flag. When it is non-zero, I kick off the rd_en signal to pull the word out of the FIFO. Once the FIFO interaction with my main logic is working, according to the simulation, I can move forward.
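
In sketch form, the check amounts to:

    // Pop a word whenever the FIFO count shows data present, pulsing
    // rd_en so the count has a cycle to update between pops.
    always @(posedge clk)
       rd_en <= (rd_data_count != 0) && !rd_en;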


Sunday, January 5, 2025

Setting up new display stations for Cape Canaveral Space Force Museum

CCSFM DISPLAYS HAD GRADUALLY FAILED

I volunteer with the Cape Canaveral Space Force Museum, which has multiple sites inside the secure area of the Space Force Station and one outside the perimeter - the Sands History Center. Videos play at spots around the wall of the history center detailing activities at particular launch complexes, but the player devices gradually degraded as cumulative power failures corrupted their data.

DEVELOPED PLAN FOR NEW DISPLAY STATIONS

We came up with a design for a new display station that employs Raspberry Pi computers, HDMI based monitors, the open source Pi Presents software, and modifications I came up with to make these bulletproof. We have a range of volunteers with varying skills, plus random power outages, requiring that these new systems work automatically and are impervious to handling errors. 

I developed a printed circuit board that plugs onto the top of the Raspberry Pi (RPi), termed a 'hat' when attached to an RPi. It carries a battery backed real time clock and a terminal strip to attach up to five optional switches. Pushbuttons or other sensors can be used in the future to let visitors choose content or to activate content upon motion nearby. 

The Linux system for the RPi can be set up to use an overlay file system, making the microSD card (which holds the operating system and application software) read-only. No more corruption if the RPi is not shut down properly. Pi Presents is configured to run automatically at boot. The software runs presentations based on scheduled times we configure, hence the need for the battery protected clock. The RPi has wifi, allowing us to connect to the various displays remotely to kick off special events. 

The content is stored on a USB thumb drive, tailored to each station. I set up the RPi software to detect when the drive is plugged in or removed. Pi Presents is restarted on those events, and we run a default show indicating the lack of content if the thumb drive is removed. 

I used an earlier prototype of my board to test out the display station, one that has additional circuitry that is no longer needed. Before I decided to use the overlay file system, I was going to use supercapacitors to provide power for an automatic graceful shutdown of the RPi Linux when input power was lost. That prototype has the charging circuits as well as the power fail detection and triggers into the RPi to run the shutdown command. The new PCBs are here and the remaining components for all the stations will arrive later this week. I expect we can have them all operational within a week.