Wednesday, February 22, 2023

Evolving design as I learn more about SoC communications

ADDRESSING DETAILS ACROSS THE HPS-FPGA BRIDGES

One might naively think that addresses are a straightforward and universal feature of the SoC system, but there are in fact very many distinct address mappings that you will deal with in this type of device. Some can be modified to a degree while others are fixed aspects of the hardware design of the SoC. 

The Hard Processor System (HPS) is a dual core ARM system with the ability to talk to SDRAM and various on chip memories on the SoC. Its level 3 interconnect has interfaces to the bridges that allow communication with the Field Programmable Gate Array (FPGA) side of the chip and all the devices that may be accessible through the FPGA. 

To begin with, none of the hardware addresses you see from the FPGA or the bridges into the HPS side are virtual addresses. Thus, even if you have a mechanism to read some location in the memory of the Linux image running on the HPS, you would need to translate to the real SDRAM address in order to access it. 

When looking from the FPGA side over to the HPS, there are multiple address mappings you will see depending on which mechanism or bridge you use for the access. There is an MPU view, a level 3 (L3) view and an SDRAM view. Within the L3 or MPU address mapping, locations are reserved for various hardware devices on the HPS side, including the bridges that communicate with the FPGA. Further, one can have a mapping within the subset assigned to a particular bridge, which would be recognized on the FPGA side as addressing some particular device or logical signal.

The most straightforward is the SDRAM mechanism, which sees all 4GB of possible connected SDRAM and therefore has a simple address map. The addresses are not virtual, but if you know the real RAM address that the processor has associated with a location under Linux, you can read or write to it. This must be coordinated with Linux on the HPS, which is also reading and writing SDRAM while you are doing so from the FPGA.

Boot ROM and onboard RAM on the HPS side do take up some of the MPU and L3 address mapping, but these can be, to a limited degree, remapped in the address ranges. A bit less than 3GB of the possible SDRAM space is visible in the L3 mapping. The MPU mapping sees 1GB of the SDRAM where it maintains cache coherency - thus updates from either side are viewed appropriately by the other (ACP window). This can be changed to some degree to pick the 1GB range which achieves coherency. 

In the MPU view, another subset of SDRAM, less than 2GB, is visible but not cache coherent, so one has to be careful about coordinating access between Linux and the bridge. No SDRAM access through the L3 map is cache coherent, but some accesses from the MPU can be, so set the ACP window to 'protect' the coherency of the part of SDRAM that needs it.

The hardware for the HPS side has assigned ranges of memory that are mapped to control hardware devices. These include ethernet, SPI, SD card, SDRAM hardware signals, and the three bridges. Only the processor running Linux sees the MPU view of memory.

H2F is a high speed bridge with the HPS side determining when transactions take place, used to access logic and devices attached to the FPGA side. The H2F bridge has a block of the L3 address range assigned to it, where any read or write to that range will result in read or write transactions over the H2F bridge. 

A lightweight H2F bridge (H2FLW) was also implemented, intended for rapid control signaling between the sides, so it has its own address range assigned in the L3 mapping. Reads or writes to this address range become reads or writes over the H2FLW bridge. The hardware does convert the address used: if we write to the first word of the H2FLW bridge range in the MPU space - xFF200000 - that appears on the FPGA side's H2FLW bridge interface as address 0. Thus there is a lot of mapping one has to do on top of the range complexity I mentioned.
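The address conversion on the lightweight bridge can be sketched as a simple subtraction. This is a minimal illustration, not the hardware's actual logic: the base xFF200000 comes from the text above, while the 2MB window size and the names `H2FLW_BASE`, `H2FLW_SPAN` and `h2flw_offset` are assumptions made for the example.

```python
# Base of the lightweight bridge window in the MPU view (from the text above).
H2FLW_BASE = 0xFF200000
# ASSUMPTION: size of the window; check the device's address map before relying on it.
H2FLW_SPAN = 0x00200000


def h2flw_offset(mpu_address: int) -> int:
    """Return the address the FPGA side would see for an MPU address in the window."""
    offset = mpu_address - H2FLW_BASE
    if not 0 <= offset < H2FLW_SPAN:
        raise ValueError(f"{mpu_address:#010x} is outside the H2FLW window")
    return offset


print(hex(h2flw_offset(0xFF200000)))  # 0x0
print(hex(h2flw_offset(0xFF200010)))  # 0x10
```

The same subtraction, run in the other direction, tells the Linux program which MPU address to touch in order to reach a given register offset in the FPGA logic.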

The F2H bridge has the FPGA determining when transactions take place. This bridge has a range of addresses from x00000000 to x3FFFFFFF that corresponds to the L3 address mapping. If the addresses are where HPS side devices, such as ethernet, are mapped, then the FPGA can directly control those devices. If the address used is within the SDRAM visible to L3 mapping, then the FPGA can read or write to the SDRAM there. It can also read or write to the ROM and other boot time memories that are visible in the L3 range.

My original design concept was to have the virtual 2315 cartridge file from the SD card be memory mapped by Linux and send the start address over the H2FLW bridge to my logic in the FPGA. I would then read and write from the SDRAM addresses visible in the L3 map view over the F2H bridge. 

The twist here is that all I see are physical SDRAM addresses, but the memory mapped file is in the virtual address space of Linux. I would need to force Linux to assign the block of memory for the memory mapped file to contiguous blocks of physical memory whereas normally the contiguous range of virtual addresses is strewn around in different physical blocks (pages). Even worse, due to demand paging these can change over time, with the same virtual address being held in different physical addresses. 
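To make the virtual versus physical problem concrete, here is a hedged sketch of how a Linux process can look up the physical address behind one of its own virtual addresses through `/proc/self/pagemap`. It assumes a Linux kernel that exposes pagemap and a process with enough privilege to see page frame numbers (recent kernels report zero to unprivileged readers). The function names are made up for the example and are not part of my design.

```python
import mmap
import struct

PAGE_SIZE = mmap.PAGESIZE
PFN_MASK = (1 << 55) - 1   # bits 0-54 of an entry hold the page frame number
PRESENT_BIT = 1 << 63      # bit 63 is set when the page is resident in RAM


def decode_pagemap_entry(entry, virtual_address, page_size=PAGE_SIZE):
    """Return the physical address for one 64-bit pagemap entry, or None."""
    if not entry & PRESENT_BIT:
        return None  # page is not in RAM right now (never touched, or paged out)
    pfn = entry & PFN_MASK
    if pfn == 0:
        return None  # frame number hidden from an unprivileged reader
    return pfn * page_size + (virtual_address % page_size)


def virtual_to_physical(virtual_address, pid="self"):
    """Look up one virtual address in /proc/<pid>/pagemap (Linux only)."""
    index = virtual_address // PAGE_SIZE
    with open(f"/proc/{pid}/pagemap", "rb") as pagemap:
        pagemap.seek(index * 8)   # one 8-byte entry per virtual page
        raw = pagemap.read(8)
    if len(raw) != 8:
        return None
    (entry,) = struct.unpack("=Q", raw)
    return decode_pagemap_entry(entry, virtual_address)
```

Each page of a memory mapped file needs its own lookup, and the answer is only valid until the kernel moves or evicts that page, which is exactly why handing a single start address to the FPGA does not work.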

It would take quite a bit of Linux wizardry to pull off the feat of having my contiguous virtual address range for the memory mapped file correspond to a contiguous physical address range that would never be paged out. Different aspects of this are possible but the effort is not straightforward.

EVOLVED DESIGN CONCEPT

An alternative approach which I settled upon is to reserve the last 1MB of the 1GB SDRAM for the sole use of my FPGA logic. Boot time parameters tell Linux to not use that last megabyte, so that the HPS side never reads or writes to those addresses. My F2H bridge can issue reads or writes to that range and merrily make use of it.
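The arithmetic for that reserved region is short enough to write out. This sketch assumes the SDRAM starts at physical address 0, and it uses the kernel's `mem=` boot parameter as one common way of limiting the memory Linux will use; the text above does not say which boot parameter is actually involved, so treat that part as an illustration.

```python
MIB = 1 << 20

SDRAM_BASE = 0x00000000     # ASSUMPTION: SDRAM begins at physical address 0
SDRAM_SIZE = 1024 * MIB     # 1GB of SDRAM, per the text
RESERVED_SIZE = 1 * MIB     # last megabyte, kept for the FPGA logic

reserved_base = SDRAM_BASE + SDRAM_SIZE - RESERVED_SIZE
reserved_last = reserved_base + RESERVED_SIZE - 1

# ASSUMPTION: limiting Linux with mem= is one possible mechanism, not a confirmed one.
linux_mem_arg = f"mem={(SDRAM_SIZE - RESERVED_SIZE) // MIB}M"

print(f"{reserved_base:#010x}")  # 0x3ff00000
print(f"{reserved_last:#010x}")  # 0x3fffffff
print(linux_mem_arg)             # mem=1023M
```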

The open issue is how I will get the file from the SD Card to this reserved memory in SDRAM and how updated parts of the file can be written back to the SD Card. Normally the SD Card is controlled by Linux on the HPS side, but then there is no method for Linux to write to the reserved 1MB of RAM.

Two possibilities exist. First, I can loop data through the FPGA to move a file from Linux over to the reserved 1MB. Second, I can implement full control over the SD card from VHDL in the FPGA, which means that Linux no longer can use that card.

I planned a rich user interface running under Linux on the HPS side, able to display and select various virtual 2315 cartridge files on the SD card. This requires that Linux control the SD card. That makes the second method, direct access from FPGA, undesirable. 

I plan to make use of several bridges and links between the HPS and FPGA sides. I can use the F2SDRAM connection that gives me straightforward access to the SDRAM controller on the HPS side so that I can read or write any SDRAM address. I can use the H2F bridge so that Linux can read and write the virtual file data for me then exchange it with my logic in the FPGA. Finally, the H2FLW bridge lets me trigger a load of the cartridge file to my reserved SDRAM, or trigger a fetch of the updated file from SDRAM. 

SETTING UP THE BRIDGE MODULES

The bridges between HPS and FPGA sides are somewhat complex interfaces. Here is the list of signals one must interact with to use just one of those native memory mapped interfaces:

h2f_awid                              : out   std_logic_vector(11 downto 0);
h2f_awaddr                            : out   std_logic_vector(29 downto 0);
h2f_awlen                             : out   std_logic_vector(3 downto 0);
h2f_awsize                            : out   std_logic_vector(2 downto 0);
h2f_awburst                           : out   std_logic_vector(1 downto 0);
h2f_awlock                            : out   std_logic_vector(1 downto 0);
h2f_awcache                           : out   std_logic_vector(3 downto 0);
h2f_awprot                            : out   std_logic_vector(2 downto 0);
h2f_awvalid                           : out   std_logic;
h2f_awready                           : in    std_logic                     := 'X';
h2f_wid                               : out   std_logic_vector(11 downto 0);
h2f_wdata                             : out   std_logic_vector(63 downto 0);
h2f_wstrb                             : out   std_logic_vector(7 downto 0);
h2f_wlast                             : out   std_logic;
h2f_wvalid                            : out   std_logic;
h2f_wready                            : in    std_logic                     := 'X';
h2f_bid                               : in    std_logic_vector(11 downto 0) := (others => 'X');
h2f_bresp                             : in    std_logic_vector(1 downto 0)  := (others => 'X');
h2f_bvalid                            : in    std_logic                     := 'X';
h2f_bready                            : out   std_logic;
h2f_arid                              : out   std_logic_vector(11 downto 0);
h2f_araddr                            : out   std_logic_vector(29 downto 0);
h2f_arlen                             : out   std_logic_vector(3 downto 0);
h2f_arsize                            : out   std_logic_vector(2 downto 0);
h2f_arburst                           : out   std_logic_vector(1 downto 0);
h2f_arlock                            : out   std_logic_vector(1 downto 0);
h2f_arcache                           : out   std_logic_vector(3 downto 0);
h2f_arprot                            : out   std_logic_vector(2 downto 0);
h2f_arvalid                           : out   std_logic;
h2f_arready                           : in    std_logic                     := 'X';
h2f_rid                               : in    std_logic_vector(11 downto 0) := (others => 'X');
h2f_rdata                             : in    std_logic_vector(63 downto 0) := (others => 'X');
h2f_rresp                             : in    std_logic_vector(1 downto 0)  := (others => 'X');
h2f_rlast                             : in    std_logic                     := 'X';
h2f_rvalid                            : in    std_logic                     := 'X';
h2f_rready                            : out   std_logic;

The above 36 signals are used on five distinct protocol channels. Signals that begin with h2f_ar are the address to be used in a read. Those starting h2f_r are the data used in a read. Analogously, h2f_aw and h2f_w are for the write address and write data interchanges respectively. Finally, there is a h2f_b set of signals which control the channel where status responses are exchanged. Within each channel there are details like caching, strobes, locking and protection that would have to be handled. 

Quartus provides a memory mapped pipeline bridge module which handles the complexity of interacting with the native interface, exposing a simpler set of signals and required protocol that I can make use of:

f2sdram_address                       : in    std_logic_vector(28 downto 0) := (others => 'X');
f2sdram_burstcount                    : in    std_logic_vector(7 downto 0)  := (others => 'X');
f2sdram_waitrequest                   : out   std_logic;
f2sdram_readdata                      : out   std_logic_vector(63 downto 0);
f2sdram_readdatavalid                 : out   std_logic;
f2sdram_read                          : in    std_logic                     := 'X';
f2sdram_writedata                     : in    std_logic_vector(63 downto 0) := (others => 'X');
f2sdram_byteenable                    : in    std_logic_vector(7 downto 0)  := (others => 'X');
f2sdram_write                         : in    std_logic                     := 'X';

The interface immediately above, with its 9 signals, is much simpler and more straightforward. Supply an address, set a signal to ask for a read, and grab the data when the readdatavalid signal is received. Qsys documents this simple read and write protocol as a guide to the logic designer.

My FPGA logic sees the MM pipeline bridges to both the H2F and the H2FLW bridges. The F2SDRAM interface is itself set up to the simpler protocol of the pipeline bridge, thus I don't need to add a bridge for that channel of communication and still get the simple nine signal interface. 
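One detail worth working through is how a byte address in SDRAM relates to the 29-bit f2sdram_address. The sketch below assumes the interface is word addressed, with each address selecting one 64-bit word, which is consistent with 2^29 words of 8 bytes covering the full 4GB. It also assumes byteenable bit 0 covers the lowest byte of the data bus. The helper names are hypothetical.

```python
ADDRESS_BITS = 29     # width of f2sdram_address
BYTES_PER_WORD = 8    # f2sdram_writedata and f2sdram_readdata are 64 bits wide


def f2sdram_word_address(byte_address: int) -> int:
    """Convert a physical SDRAM byte address to a 64-bit word address."""
    if byte_address % BYTES_PER_WORD:
        raise ValueError("byte address must be aligned to 8 bytes")
    word_address = byte_address // BYTES_PER_WORD
    if not 0 <= word_address < (1 << ADDRESS_BITS):
        raise ValueError("address does not fit in f2sdram_address")
    return word_address


def byteenable_for_halfword(index: int) -> int:
    """Byte enable mask selecting one of the four 16-bit words in a 64-bit word.

    ASSUMPTION: index 0 occupies the two lowest byte lanes.
    """
    if not 0 <= index < 4:
        raise ValueError("index must be 0..3")
    return 0b11 << (2 * index)


# Last megabyte of a 1GB SDRAM that is assumed to start at physical address 0.
print(hex(f2sdram_word_address(0x3FF00000)))  # 0x7fe0000
print(bin(byteenable_for_halfword(3)))        # 0b11000000
```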

MY TASK IS TO READ FROM ONE BRIDGE AND WRITE TO THE OTHER

The logic behind loading a virtual 2315 cartridge file into the reserved 1MB of SDRAM is pretty straightforward. I begin receiving words from the Linux program over the H2F bridge. I take each word received and write it to the SDRAM area over the F2SDRAM bridge, converting to the proper address of the word within that last megabyte of SDRAM. The virtual cartridge consists of roughly 500,000 16-bit words, but since my bridges read and write 64 bits at a time, each transfer carries four of them. That means I have to read from H2F and write to F2SDRAM about 125,000 times.
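The transfer count above follows from packing four 16-bit words into each 64-bit transfer. A minimal sketch of that packing, assuming the first cartridge word lands in the lowest 16 bits (the real lane order is a design choice the text does not specify), with hypothetical function names:

```python
WORDS_PER_BEAT = 4  # four 16-bit cartridge words fit in one 64-bit transfer


def beats_needed(word_count: int) -> int:
    """Number of 64-bit transfers needed for word_count 16-bit words."""
    return -(-word_count // WORDS_PER_BEAT)  # ceiling division


def pack_beat(words):
    """Pack up to four 16-bit words into one 64-bit value, first word lowest."""
    if len(words) > WORDS_PER_BEAT:
        raise ValueError("at most four words per beat")
    value = 0
    for lane, word in enumerate(words):
        if not 0 <= word <= 0xFFFF:
            raise ValueError("each word must fit in 16 bits")
        value |= word << (16 * lane)
    return value


def unpack_beat(value: int):
    """Split one 64-bit value back into four 16-bit words."""
    return [(value >> (16 * lane)) & 0xFFFF for lane in range(WORDS_PER_BEAT)]


print(beats_needed(500_000))           # 125000
print(hex(pack_beat([1, 2, 3, 4])))    # 0x4000300020001
```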

When the Linux program implementing the user interface has a new file selected to load as the virtual cartridge, it sends a control command over the H2FLW bridge which triggers my logic to iterate over reads from the H2F bridge and writes to SDRAM.

Once the drive goes not ready, the user interface program sends a command over the H2FLW bridge to unload the presumably updated virtual cartridge contents. This triggers my logic to read the SDRAM words over F2SDRAM and write them to the Linux program using the H2F bridge. The Linux program updates the memory mapped file which causes the changes to be written back to the SD card file. 
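The load and unload sequences can be modeled in software to check the addressing before committing to VHDL. This is only a toy model under stated assumptions: the class and function names are invented, the SDRAM is a dictionary keyed by 64-bit word address, and none of the bridge handshaking is represented.

```python
class ReservedSdramModel:
    """Toy stand-in for the reserved SDRAM region, indexed by 64-bit word address."""

    def __init__(self, base_word_address: int, size_words: int):
        self.base = base_word_address
        self.size = size_words
        self.cells = {}

    def _check(self, word_address: int) -> None:
        if not self.base <= word_address < self.base + self.size:
            raise IndexError("access outside the reserved region")

    def write(self, word_address: int, value: int) -> None:
        self._check(word_address)
        self.cells[word_address] = value & 0xFFFFFFFFFFFFFFFF

    def read(self, word_address: int) -> int:
        self._check(word_address)
        return self.cells.get(word_address, 0)


def load_cartridge(h2f_beats, sdram: ReservedSdramModel) -> int:
    """Copy each 64-bit value arriving over H2F to consecutive SDRAM words."""
    count = 0
    for beat in h2f_beats:
        sdram.write(sdram.base + count, beat)
        count += 1
    return count


def unload_cartridge(sdram: ReservedSdramModel, count: int):
    """Read back count words from SDRAM, in order, to hand to the Linux program."""
    return [sdram.read(sdram.base + i) for i in range(count)]


# Word address 0x07FE0000 is byte address 0x3FF00000 divided by 8 (an assumed layout);
# 1MB holds 131072 words of 8 bytes.
model = ReservedSdramModel(base_word_address=0x07FE0000, size_words=131072)
written = load_cartridge([0x1111, 0x2222, 0x3333], model)
print(written)                                              # 3
print(unload_cartridge(model, written) == [0x1111, 0x2222, 0x3333])  # True
```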
