Wednesday, September 4, 2024

Wiring up the header blocks for my Cycle Steal Memory Loader

LATEST VERSION OF MY LOADER WHICH PUTS FILES FROM A PC/MAC INTO CORE

Earlier versions virtually toggled switches and pushed buttons, mimicking the actions of a human operator using the Load mode of the 1130. That approach was slow, mainly because of the debouncing interval for the buttons. 

I redesigned the loader to use cycle steal, which will trigger a core memory cycle to read or write a specific address. This will be much faster, allowing even the largest programs to be loaded into memory in almost no time compared to the hour or more with the prior loader. 

CYCLE STEAL - DMA BY ANY OTHER NAME

Modern computers allow faster peripherals to transfer data to memory without requiring the processor to explicitly read the data and write it to memory. Instead, a mode called Direct Memory Access (DMA) allows a peripheral to request access to memory, transferring data without passing through the CPU. This means the CPU does not need to take interrupts or waste instructions moving the data. 

IBM called this technique Cycle Steal when they developed it for computers in the fifties and sixties. Since mainframes of those eras are built around core memory, the heart of their structure involves memory access cycles. I will use the 1130 as a concrete example.

The processor takes a memory cycle called I1 to fetch the contents of a memory location, which becomes the next instruction it will execute. If the bit indicating long format is on, the instruction is two words long and the machine takes another memory cycle, called I2, to retrieve the second word. 

Instructions that make use of an index register must take another memory cycle called IX to read and sometimes write the memory location for the index register (addresses 0001, 0002 and 0003 are the three index registers). 

Indirect addressing is indicated by another bit in the first instruction word. In that case the calculated target address of the instruction is just a memory location that holds the true intended address, so the processor must fetch the contents of that location with an IA cycle. 

Performing the purpose of the instruction can involve additional memory cycles, up to three, which are the E1, E2 and E3 cycles. Thus, a long, indexed, indirect instruction could take up to seven memory cycles to complete. A memory cycle is therefore the atomic operation of the 1130. 
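To make the cycle counting concrete, here is a minimal Python sketch that lists the memory cycles for one instruction based only on the description above. The function name and its parameters are my own invention for illustration, and it is a simplified model that does not check which combinations the real 1130 instruction set allows.

```python
def memory_cycles(long_format, indexed, indirect, execute_cycles):
    """Return the memory cycles one instruction takes, in order.

    Simplified model of the description in this post; it does not
    validate which flag combinations a real 1130 instruction permits.
    """
    if not 0 <= execute_cycles <= 3:
        raise ValueError("execute_cycles must be between 0 and 3")
    cycles = ["I1"]                 # always fetch the first instruction word
    if long_format:
        cycles.append("I2")         # second word of a long instruction
    if indexed:
        cycles.append("IX")         # access the index register in core
    if indirect:
        cycles.append("IA")         # fetch the true target address
    cycles += [f"E{n}" for n in range(1, execute_cycles + 1)]
    return cycles


# Worst case from the text: long, indexed, indirect, three execute cycles.
print(memory_cycles(True, True, True, 3))
# ['I1', 'I2', 'IX', 'IA', 'E1', 'E2', 'E3']
```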

Cycle stealing blocks the processor from taking a memory cycle, instead allowing a peripheral to have one memory access cycle. A steal can occur between any two other cycles. An instruction might have completed its I1, but then a cycle steal uses the memory before the processor resumes with its IX cycle. The processor and software are not aware that cycles are stolen; the only evidence is that software takes longer to execute, because each stolen memory cycle adds 3.6 microseconds. 

Fast peripherals raise a flag requesting a cycle steal. As soon as the current memory cycle completes, the requester is given the memory to read and rewrite one address. For example, the disk drive uses cycle steal to transfer 320 words between disk and memory, slipping its memory cycles in between instruction-oriented cycles as needed. 
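As a rough illustration of how stolen cycles stretch execution time, the sketch below adds the stolen cycles to the cycles the program itself needs. The 1000-cycle program is a made-up figure; the 3.6 microsecond cycle and the 320-word disk transfer come from this post. Times are kept as whole nanoseconds to avoid floating point rounding.

```python
CYCLE_NS = 3600  # one core memory cycle: 3.6 microseconds


def elapsed_ns(cpu_cycles, stolen_cycles):
    """Wall-clock time when every memory cycle, stolen or not, costs CYCLE_NS."""
    return (cpu_cycles + stolen_cycles) * CYCLE_NS


# Hypothetical program needing 1000 memory cycles while a 320-word
# disk transfer steals one cycle per word.
alone = elapsed_ns(1000, 0)
with_disk = elapsed_ns(1000, 320)
print((with_disk - alone) // 1000, "microseconds added")  # 1152 microseconds added
```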

There are several levels of cycle steal, with higher priority levels given access to memory before lower priority requesters. When no cycle steal request is present, the CPU resumes executing instructions. 
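The arbitration rule can be sketched in a few lines. This is only a model: the function name is hypothetical, and it assumes that a lower level number means higher priority, which this post does not state.

```python
def next_owner(pending_levels):
    """Decide who gets the next memory cycle.

    ASSUMPTION: a lower level number means higher priority.
    pending_levels is any collection of cycle steal levels now requesting.
    """
    if pending_levels:
        return f"CS level {min(pending_levels)}"
    return "CPU"  # no requests pending, so the processor continues


print(next_owner({2, 0, 1}))  # CS level 0
print(next_owner(set()))      # CPU
```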

My new loader has some circuitry attached to an Arduino. Using the USB cable to open a serial connection with a PC, terminal or Mac, the software in the Arduino requests cycle steals. The loader requires that the processor be stopped before it will perform cycle steals, which eliminates the need to worry about priorities among cycle steal requesting peripherals. 

The loader raises the request for a cycle steal on level 0. When it sees that the processor has given control of memory to cycle stealing (the CS Level 0 signal becomes true), it sets up the address and data on the same buses that other peripherals use. At the appropriate steps during the memory cycle it activates the File Entry Gate signal so that the memory will write the injected data. As the cycle nears completion, it drops the signals and waits for CS Level 0 to end. 
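The sequence above can be walked through against a fake machine. This is a host-side Python sketch, not the Arduino code: the FakeBackplane class, its attribute names and the store_word function are all hypothetical, the fake grants every request on the next tick, and the real timing within the memory cycle is not modeled.

```python
class FakeBackplane:
    """Hypothetical stand-in for the 1130 signals; grants every request."""

    def __init__(self):
        self.core = {}           # address -> word, our pretend core memory
        self.cs_request = False  # CS Level 0 Request, driven by the loader
        self.cs_active = False   # CS Level 0, driven by the machine
        self.address = None
        self.data = None

    def tick(self):
        """Advance one step: grant a pending request or end a finished one."""
        if self.cs_request and not self.cs_active:
            self.cs_active = True
        elif not self.cs_request and self.cs_active:
            self.cs_active = False

    def pulse_file_entry_gate(self):
        """Memory stores the bus data, but only during a granted cycle."""
        if self.cs_active:
            self.core[self.address] = self.data


def store_word(bp, address, data, max_ticks=100):
    bp.cs_request = True                 # 1. raise the level 0 request
    for _ in range(max_ticks):           # 2. wait for CS Level 0 to go true
        bp.tick()
        if bp.cs_active:
            break
    else:
        raise TimeoutError("cycle steal was never granted")
    bp.address, bp.data = address, data  # 3. drive the address and data buses
    bp.pulse_file_entry_gate()           # 4. have memory write our data
    bp.cs_request = False                # 5. drop our signals
    bp.address = bp.data = None
    for _ in range(max_ticks):           # 6. wait for CS Level 0 to end
        bp.tick()
        if not bp.cs_active:
            return
    raise TimeoutError("cycle steal never ended")


bp = FakeBackplane()
store_word(bp, 0x0100, 0x1234)
print(hex(bp.core[0x0100]))  # 0x1234
```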

The memory cycle time is 3.6 microseconds, which is how long it takes to load one word of core memory. The Arduino then has to accept the next line of input over the serial connection to request the next transfer. The Arduino processing and serial line time is much longer than the cycle steal time, so that becomes the pacing element in how fast we can load core memory using this technique.
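A back-of-the-envelope estimate shows how lopsided the two times are. Everything about the serial link here is an assumption for illustration: 115200 baud, 10 bits on the wire per character, and 5 characters per word (four hex digits plus a newline). The 8192-word load is also a hypothetical size, and Arduino processing time is ignored.

```python
CYCLE_NS = 3600  # one core memory cycle: 3.6 microseconds


def load_time_s(words, baud, chars_per_word, bits_per_char=10):
    """Return (serial seconds, cycle steal seconds) for loading `words` words.

    ASSUMPTIONS: one text line per word, a fixed number of characters per
    line, and no allowance for processing time in the Arduino.
    """
    serial_s = words * chars_per_word * bits_per_char / baud
    steal_s = words * CYCLE_NS / 1e9
    return serial_s, steal_s


serial_s, steal_s = load_time_s(words=8192, baud=115200, chars_per_word=5)
print(f"serial: {serial_s:.2f} s, cycle steals: {steal_s:.3f} s")
```

Under these assumptions the serial transfer takes a few seconds while the cycle steals themselves add up to a few hundredths of a second.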

HEADER BLOCKS INTERFACE ARDUINO TO BACKPLANE PINS

I created small header blocks that are fastened above the compartments A1 and B1 of the B gate. Wire wrap is used to connect a pin from the backplane to a pin on the header block. For compartment B1, we connect address bits 1 to 15, data bits 0 to 15, and the File Entry Gate signal to the header block. For compartment A1, a small number of control signals such as CS Level 0 Request and CS Level 0 Active are connected to the smaller header block above the backplane. 
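For bookkeeping, the B1 signals can be enumerated to see how many wires that header block needs. The signal labels below are my own shorthand, not names from the IBM logic diagrams, and no pin assignments are implied.

```python
# Signals wired to the B1 header block, per the description above.
B1_SIGNALS = (
    [f"ADDR {n}" for n in range(1, 16)]    # address bits 1 to 15
    + [f"DATA {n}" for n in range(0, 16)]  # data bits 0 to 15
    + ["FILE ENTRY GATE"]
)

data_wires = [s for s in B1_SIGNALS if s.startswith("DATA")]
print(len(B1_SIGNALS), "wires in total,", len(data_wires), "of them data bits")
# 32 wires in total, 16 of them data bits
```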


These header blocks also attach ribbon cables which connect those signals to the shield that is fitted atop the Arduino. The Arduino stack is mounted inside the machine as well, with the USB cable routed out of the back of the machine for attachment when loading contents into memory. 

WIRING PARTLY DONE TO THE HEADER BLOCKS

I used my wire-wrap tool and appropriate wire to connect each pin of the header block to its assigned pin on the backplane. I wanted this as neat as possible, which means fairly close to the shortest distance but without any strain on the wire. 

I completed the wiring to the A1 header block and did half of the B1 header block. The sixteen data bits are still to be wired, then it will be done. 



