Sunday, January 31, 2016

Design work for virtual 1403, virtual 1442, real paper tape and real plotter in SAC Interface Unit

The 31st is my birthday which means family time and not 1130 time, but I did some design work the night before and will jump back into testing on Monday the 1st.


Virtual 1442 design

The restructured design for the 1442 card reader/punch is esthetically satisfying. The device moves cards through a sequence of stations, from the input hopper to pre-read, through the read station to pre-punch, then out to one of the two stackers. Cards advance one step on each read, punch or feed operation.

The programmer issues an XIO Control with a code to drive a feed cycle with or without a read or punch operation. Any of them will move the cards one step. If it is a read, the card moving through the read station generates 80 interrupts on IL0, one per card column. If it is a punch, the card moving through the punch station generates up to 80 interrupts on IL0, one per column, until it sees a column with bit 12 on, which terminates punching for the remaining columns.

For reading, the programmer must respond to each interrupt on level 0, issuing an XIO Read to pick up a word with a bit for each of the 12 rows that has a hole punched in it. Once the card exits the read station, the reader raises an interrupt on level 4 to signal end of the read operation.

For writing, the programmer responds to each interrupt on level 0 with an XIO Write to send a word that determines which punches are fired for the current column. Once the card exits the punch station, the device raises IL4 to signal end of the operation.

A pure feed operation receives an interrupt on IL4 once the cards have moved their one step ahead. No interrupts on level 0. A read or write could be considered to have a feed operation at the end of it.

To model this, I will have two eighty word buffers, one for the pre-read station and one for the pre-punch station. Records in the PC input file are the 'hopper' and on each feed operation the PC will push down the 80 column card from the file and put it into the pre-read buffer. As each feed occurs, the card in the pre-punch buffer (updated if a punch took place) is pulled up by the PC program and added to the output file.

Thus, the PC side is going to see the read, punch or feed request as an XIO Control and will push the next input card down to the pre-read station. If no file is open but a punch is requested, then blank cards will be delivered. Once the feed operation occurs, if there was a read or punch the PC side sees a false XIO - I chose XIO Init Read - which tells it to pull the card from the pre-punch buffer and stick it into the output file.

If the stacker select bit is on for the XIO Control that starts a read, punch or feed, it is delivered as word 12 with the poll transaction. The PC program can optionally open two output files, using stacker select to determine which gets the card image, otherwise the stacker select is ignored and they all go into the one file. If no output file is opened, then the data is not fetched from the pre-punch buffer and any punching operations are ignored.
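The station-by-station movement above can be sketched in a few lines of Python. This is purely illustrative - the class and method names are my own, not the actual PC program:

```python
# Illustrative model of the 1442 card path: hopper -> pre-read -> pre-punch -> stacker.
# Names and file handling here are hypothetical, not the real PC-side code.
class Virtual1442:
    COLUMNS = 80

    def __init__(self, input_cards, punch_output=None):
        self.hopper = list(input_cards)      # 80-word card images from the PC file
        self.pre_read = None                 # buffer ahead of the read station
        self.pre_punch = None                # buffer ahead of the punch station
        self.punch_output = punch_output     # list standing in for the output file

    def feed(self):
        """One feed cycle: every card in the path advances one station."""
        # the card leaving the pre-punch buffer is stacked (written to the file)
        if self.pre_punch is not None and self.punch_output is not None:
            self.punch_output.append(self.pre_punch)
        # the pre-read card moves on toward the punch station
        self.pre_punch = self.pre_read
        # the next card drops from the hopper; blanks if punching with no input file
        self.pre_read = self.hopper.pop(0) if self.hopper else [0] * self.COLUMNS

    def punch_column(self, col, word):
        """Response to one column interrupt during a punch cycle."""
        if self.pre_punch is not None:
            self.pre_punch[col] |= word      # fire the punches for this column
```

A read or punch operation would wrap a feed() with the 80 per-column interrupts; the stacker select decision would simply pick which of two output lists receives the card in feed().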

Virtual 1403 printer design

The design of the virtual 1403 printer is simpler, as it does not have input and output in the same device. The 1403 uses cycle steal (XIO Init Write prints the line) to fetch the 132 columns of data, thus this will work much the same as the 2310 and 2501. A virtual carriage control tape of 10 bits per line is created in the FPGA and contains a 1 wherever a hole would be punched in a real tape.

The FPGA will directly handle all the line space and tape skip operations, setting the DSW with any carriage control tape holes encountered. When any line is printed with the XIO Init Write, word 12 of the poll delivers the line number for the line. The PC program can insert blank lines to the output file to reflect any skip and space operations which had taken place.
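Roughly, a skip works like this against the virtual tape. The 66-line form length, channel bit positions and function names below are my assumptions for illustration only:

```python
# Toy model of the virtual carriage control tape; the real one lives in FPGA logic.
# Form length and channel assignments are assumed, not taken from the actual design.
FORM_LINES = 66
tape = [0] * FORM_LINES          # one 10-bit word per line, bit set = hole punched
tape[0]  = 1 << 0                # hole for "top of form" channel at line 0 (assumed)
tape[60] = 1 << 9                # hole for an "overflow" channel near the bottom (assumed)

def skip_to_channel(line, channel_bit):
    """Advance from the current line until the tape shows a hole in the channel.
    Returns (new_line, lines_moved) so the PC side knows how many blanks to insert."""
    moved = 0
    while True:
        line = (line + 1) % FORM_LINES
        moved += 1
        if tape[line] & channel_bit:
            return line, moved
        if moved > FORM_LINES:
            raise RuntimeError("no hole punched in that channel")
```

The lines_moved value is exactly what word 12 of the poll lets the PC program recover: it inserts that many blank lines into the output file to mirror the paper motion handled locally in the FPGA.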

Physical device interfaces

I have several physical devices to interface

  • paper tape reader and punch
  • plotter
  • two 9-track tape drives
  • three RK05 cartridge disk drives (similar to 2310)
  • three top-load cartridge disk drives
  • HP line printer (to act as 1403)
  • Documation 1000 cpm card reader (to act as 2501)
  • IO Selectric typewriter (to act as 2741 terminal for APL/1130)
The two I will work on are the paper tape and plotter devices, which are easiest to get working and already partially implemented in the FPGA. The rest will wait until later.

Saturday, January 30, 2016

Virtual 2310 disk now almost at parity with real drive performance, plus another 1130 on its way to restoration


After a few days, the carrier return is much more reliable now that the oil has worked out of the mechanism. I still get rare situations where the carrier jumps past the left margin (probably a slow unlock of the CR mechanism, but it could be other issues). Indexing also occasionally fails to fully step one line, but I can see a sticky lever where residual stale lubricant is causing a problem.

Oddly, when I press the button that should space one position, the 1053 will perform a tab instead. Some maladjustment to track down. Also have to keep exercising this, as it gets better with further use (but slow rate of improvement).


I have just heard that the LCM in Seattle has acquired an 1130 system with plenty of peripherals. Important to me is that this system has an 1133 multiplexer box and 2310 disk drives. There are no online ALDs for the 1133 but I can hope the books at LCM can be scanned and shared.

I assume they will restore this system as well and I may very well help out a bit if asked. Exciting to see yet another 1130 that will breathe again.


The validation error I experienced might be a race hazard or flaw in fetching words from core, even though I believe I found and fixed the flaw that was causing the problem - in the python code. However, it will be important to prove out the fetch reliability since this will be key to writing to virtual disk, printers or card punches.

Further testing shows that my fetch is not reliable, while store is rock solid. The fetch function involves latching up and storing words coming from cycle steal core fetches, but is spread across several FSMs and timing is very important in order for this to work reliably. Digging into the FPGA logic for this function.

I rewrote how the data words are latched during a set of 1 to 11 reads for a fetch transaction, as well as the logic that populates the outbound packet to return the data values to the Python program. Testing begins this afternoon.

My verification of memory still fails, but the write is perfect. Further, I did some disk writes and reads to the virtual 2310 cartridge and those results are perfect as well. I initialized a fresh virtual cartridge using the DCIP utility on the real 1130, then dumped its first sector to the console printer.

I took the virtual cartridge file I had initialized and attached it to the 1130 simulator, running the same DCIP utility there; it compared word for word identically to a virtual cartridge initialized by the utility on the simulator. Very happy with the accuracy.

More importantly - the speed of the virtual drive is immensely improved now. I initialized the virtual disk cartridge in just over six minutes. My estimate for the performance of a physical 2310 drive on my real 1130 was five minutes. The performance penalty will be barely noticeable as it is. I do have two further ways to improve performance:

  • Further decrease the short packet timeout of the USB driver, which is currently a bit longer than 5 microseconds but could safely go down to about 660 nanoseconds and still work with my packets since I write these as a sequence of 12 words in about 250 nanoseconds. 
  • Each disk read or write is doing a transaction to fetch the current cylinder for the virtual arm, but this information can be piggybacked on the transaction which returns the XIO function, modifier, reset bits, WCA and DSW. I have five spare words in the packet. 

I will make both changes since they are relatively simple. Meanwhile, I have to figure out why I am running into trouble with my core load verification function somewhere in the Python code. I keep looking at the Python code and can't find a problem. It leads me to suspect that my SAR auto-increment might be throwing in an extra increment on reads. This will take some instrumentation to check things out.

The python program opens with five reads of 11 words, pausing between each to display the SAR on the LEDs. The addresses and content were always correct. I went back to exhaustively reading the code when I FINALLY spotted the cause. My logic for extracting the words that were read is to step an index up from 0 to 10, extracting words in their low to high address sequential order, while decrementing the remaining count from 11 down to 0.

In my loop to check that zeroed locations are indeed 0x0000, I had written pcnt + 1 rather than pcnt = pcnt + 1, thus my increment was silently tossed away after executing, leaving the index unchanged. Changed it and verified that all was well now.
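For anyone unfamiliar with this Python pitfall, a minimal illustration (the values here are made up): a bare expression computes a result and silently discards it, so the index never moves. Python raises no error for this.

```python
# The bug in miniature: an expression statement whose result is thrown away.
vals = [0x0000, 0x0000, 0xC4F0]   # hypothetical buffer, last word non-zero

pcnt = 0
for _ in range(2):
    assert vals[pcnt] == 0x0000
    pcnt + 1            # BUG: computed and discarded; pcnt is still 0
assert pcnt == 0        # the index never advanced - both checks hit word 0

pcnt = 0
for _ in range(2):
    assert vals[pcnt] == 0x0000
    pcnt = pcnt + 1     # fix: the assignment actually advances the index
assert pcnt == 2
```

A linter such as pyflakes flags this class of statement as having no effect, which would have saved the exhaustive re-reading of the code.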

When I ran the utility to initialize the virtual disk cartridge again, it ran along up to cylinder 97, almost the halfway point, and then the cylinder never advanced and the code looped. I tried to restart but the device was virtually busy, which implies that my seek logic in the FPGA somehow got wedged, leaving the drive perpetually busy and in 'motion'.

I tried again and was successful on both fronts. The core load function successfully verifies now and the disk initialization proceeded through to completion. It ran in five minutes 30 seconds, a bit faster than last time due to the two minor improvements I added. I am satisfied with disk performance and with the functionality of the virtual disk drive.

I will turn my attention to debugging the virtual 2501 next, as it is substantially complete. The virtual 1403 and 1442 devices require some FPGA logic development and restructuring, which is why they will wait a bit.

Friday, January 29, 2016

Steadily working my way through the new SAC Interface transaction engine and functions that use it

Visited the Computer History Museum Wednesday at lunchtime - both 1401 systems mostly working. We had an intermittent core memory addressing problem that went away before we could chase it down fully, but we think it is the gate for Not 1 in the Units address position. Then we had reader checks that went away after reseating the brush connectors on the 1402 card reader.


I experimented with removing some oil and grease from around the torque limiter and CR clutch springs, hoping that the speed and energy of the return will increase. It seemed to go the wrong way initially, which suggests I moved some oil inside with my first treatment.


My core loading function is placing values in core, but skipping over some locations. It puts 4400 in location 0100 but then 0101 to 010B are skipped, before the next word 0F1F is placed at 010C instead of 0101. After handling a few cycle steal write transactions, when I tried to set the SAR to the next starting location for another group of words, what came back didn't match what I sent.

Suspects include my auto-increment function for the SAR and the logic to save incoming words and copy them to outgoing buffers. For the first problem, the approach to take is careful examination of the FSMs involved looking for race hazards or other possible flaws. The second is more puzzling since the first SAR setting transaction worked properly.

This continues to be hard to track down - I keep developing new diagnostic LED assignments and testing, hoping to discover the root causes. A big part of the problem is the poorly documented USB transfer mechanism from opencores atop a complex Cypress mechanism, coupled with the poorly documented pyUSB module. I really, really, really need a rock solid transactional transport to make this system work. Time for some research, fresh thinking and experimentation.

I decided on a restructuring of my code on the PC side to send all 12 words as a single write to the USB, just as the return of the 12 word results is a single read. I then redesigned the main read/write logic to the USB from the FPGA side to pull in all 12 words as a continuous sequence.

I make the assumption that the FIFO in the USB hardware is filled with all 12 words thus I won't have any pauses between words. I can then set the system up as twelve sequential cycles in the FSM, no interlocking or conditional state changes, which hopefully will make the transport more steady.

A similar change is made to the write side, sending all 12 output words in a continuous sequence of 12 cycles. Finally, I added a signal that tells me when we start reading words but have not finished fetching the last of the 12. It is useful for those FSMs that get triggered as soon as I see the command in word 1, but need information from the entire incoming packet.

To test this, I built a testing scaffold in the FPGA that will light diagnostic lamps to validate that I have the proper contents from a test transaction I will send in from the PC. This will have fixed values for the various data words, which I will check and use to light the diagnostic LEDS. I should immediately see if my inbound half of the transaction has worked properly.

The second half of the scaffolding emits a different fixed set of words as the return packet, which I will examine in the Python code to determine if my outbound side works properly. These two tests will let me narrow down my focus on further improvements.

I have completed all the myriad adjustments in the FPGA, except that I am only echoing back the incoming words rather than replacing them with new content. I can switch this function to a fixed set of output values when I am ready for those tests.

I did uncover a fascinating fact that helps explain the reason that my minimum transaction is in the range of 1.5 ms. The logic in the ezusb_IO module from Opencores is set up to send short packets (e.g. packets smaller than the block of 1024 that is the default configuration) after a timer expires from the last write from the FPGA.

This time is defined as the number of sets of 65,536 clocks that have occurred - with the smallest value you can set being 1 - thus about 1.3ms from when the packet arrives and I begin reading words until the outgoing short packet is flushed out, containing my 12 words of real payload. If I set this parameter to 0, the USB will wait forever to fill a 1024 byte packet.

If I can change this behavior to flush out the short packet much faster, I should materially improve the performance of all the USB transactions. Very important discovery, and I quickly found the logic involved in the Verilog code for the ezusb_io module. It checks for the top half of a 32 bit counter to match the timeout parameter, thus the logic is in units of 65,536 clock ticks. I just need to change the scope of the comparison to shorten the timeout substantially.

I decided that 256 clock cycles or roughly 5.3 microseconds is long enough. Change made and included with other changes in my testing regime. The echo transaction did not return the expected values, so working with LED diagnostics to figure out what is captured on the read.
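Assuming the 48 MHz interface clock on the Cypress EZ-USB FX2 side (my assumption; the figures match the observed timings), the arithmetic behind those two numbers checks out:

```python
# Sanity check of the short-packet timeout values, assuming a 48 MHz interface clock.
CLK_HZ = 48_000_000

old_timeout = 65_536 / CLK_HZ   # one unit of the original 65,536-clock counter
new_timeout = 256 / CLK_HZ      # after narrowing the comparison to 256 clocks

print(f"original minimum timeout: {old_timeout * 1e3:.2f} ms")  # ~1.37 ms
print(f"shortened timeout:        {new_timeout * 1e6:.2f} us")  # ~5.33 us
```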

After a few runs to check the words seven bits at a time, I confirmed that what was sent back was the same data we had latched on the inbound side. Not what I thought I was sending, but that could be multiple problems. It might be defects in how I latch incoming words on the link, but also it could be due to the changes I made to try to write all 12 words from Python as a single USB write. I put in some diagnostics on the Python side to show what is going outbound.

I corrected a flaw in my Python code to send out the 12 word group and began to get back nearly the right data. It seemed I was one word off, due to a cycle delay on startup of reading messages. Now the test message came back perfectly and my machine status transactions are working properly.

I moved back to debugging the core load function. Things were looking up. Core loaded blazingly fast, but I ran into a verification problem for the sections where I used the "Z" code in the PC file, which asks to write nnn words of zero beginning at the current SAR. I have some flaw in the Python code either writing or verifying those zeroes - as a first hypothesis - and will look at the code and then build some diagnostics if necessary.

I believe my problem is in the logic to verify the contents. I set the SAR with a call and then prefetch a block of 11 words from core, pulling from them as needed. The problem is when I encounter a "@" directive in the PC file, which says to jump to a new block of core. I need to fetch a new block of 11 words from this new location.

Another problem occurred when I encountered a Z directive and fetched the next block of 11, when I should just have continued unpacking from the buffer and checking for zeroes. That resulted in my skipping ahead of the intended location in core, ensuring mismatches after that point.
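A rough sketch of the corrected verification loop, with entirely hypothetical names: 'W' stands for an ordinary expected data word, '@' for an address jump, and 'Z' for a run of zeroes. The key point is that the 11-word prefetch buffer is refilled only when it runs dry or when '@' jumps elsewhere - never because of a 'Z'.

```python
# Hypothetical sketch of the verify pass over core, not the actual Python code.
BLOCK = 11   # words fetched per cycle-steal read transaction

def verify(directives, fetch_block):
    """fetch_block(addr) stands in for the 11-word cycle-steal fetch transaction."""
    addr, buf, idx = 0, [], 0
    mismatches = 0

    def next_word():
        nonlocal buf, idx, addr
        if idx >= len(buf):                 # buffer drained: fetch the next block
            buf, idx = fetch_block(addr), 0
        w = buf[idx]
        idx += 1
        addr += 1
        return w

    for kind, arg in directives:
        if kind == '@':                     # jump: discard buffer, refetch there
            addr, buf, idx = arg, [], 0
        elif kind == 'Z':                   # arg words must all read back 0x0000
            for _ in range(arg):
                if next_word() != 0x0000:
                    mismatches += 1
        else:                               # 'W': expect this exact word
            if next_word() != arg:
                mismatches += 1
    return mismatches
```

The earlier bug amounted to treating 'Z' like '@', forcing an unnecessary refetch and advancing past the intended location.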

Change made, testing again, this time trying to verify the contents of core manually. The first block of storage - 11 words - was perfect. The next words were a copy of the first block, not a new group - uhoh. I also had a failure where the code to test contents received different contents than were actually in memory.

The problems I now am shooting could be caused by either or both sides - PC program or fpga logic. Back to the diagnostics - emitting messages at key points in the core load code and turning on status lights to ensure that my cycle steal read, CS write, fetch/store and other FSMs are properly resetting after each transaction.

My first test mainly highlighted problems in my diagnostic writes. Once those were quickly fixed, I could see what was really occurring. The reason for my repeated sets of the same data seems to be that my FSM which walks through the eleven words, using the cycle steal read or write FSM repeatedly, is somehow taking a second pass, writing the same 11 words twice. Should be easy to spot now that I know what is happening.

Problem found and fixed by adding a last interlocking step in the FSM that drives the 1 to 11 word CS read or write cycles. I manually checked core and validated that the first half of the function, which writes core, had worked properly. The remaining flaw is in the verification process.

A quick look showed me that I was fetching the first 11 words at the correct address, but then fetching another time so that I really had words 12 to 22 not 1 to 11. This is a flaw in the Python code doing the verification - will find and fix it.

While doing this last testing, I had a troubling symptom. I flipped the power switch for the 1130 and it flickered but didn't come on. I had to turn the main breaker off then on to get it to come up. The first time it came up, it stayed in 'reset' mode but after a power cycle it was back to normal. I hope this isn't an emerging problem with the machine.

One last test proved out the functioning on my smaller test file, so I let it rip with the full DCIP file which will load all 16K of core. That is a good relative speed test for my code that will write 16,384 words then read them all back. At full bore 1130 speed that would take a bit more than a tenth of a second, at 3.6 microseconds memory access speed.

Since I made the change with the short packet timeout, lowering it to 5.3 microseconds, we are operating on a similar timescale to the 1130 itself. I am packing multiple words, close to 11 per transaction, or half a microsecond per word added to the memory access itself. When I loaded the core file that does the write then readback of 16,384 words, it seemed to take less than a second. The results were perfect, I hit the Int Req key and the utility began typing out the menu on the 1053.
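The speed comparison is easy to sanity check. The half-microsecond-per-word link overhead comes from packing close to 11 words per transaction, as noted above:

```python
# Back-of-envelope check of the core load/readback timing quoted above.
WORDS = 16_384          # full core written, then read back
ACCESS = 3.6e-6         # 1130 core memory cycle time, seconds

native = 2 * WORDS * ACCESS                  # write all of core, then read it back
overhead = 0.5e-6                            # ~half a microsecond per word over the link
over_link = 2 * WORDS * (ACCESS + overhead)

print(f"native 1130:  {native:.3f} s")       # ~0.118 s, "a bit more than a tenth"
print(f"via SAC link: {over_link:.3f} s")    # ~0.134 s, consistent with "under a second"
```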

From a speed standpoint, this is exceptionally good news. It tells me that my virtual disk should run acceptably fast - it may be necessary to add in the delays while the virtual disk platter rotates the requested sector under the head. I will do the testing tomorrow on the virtual disk function.

I do have a remaining flaw in my validation routine, where it seems to get out of sync when it is reading back sections of zeroes written by the "Z" directive. I will look at that as well tomorrow. For now, I am very pleased with the progress.

Monday, January 25, 2016

Working on core load function - fpga logic debugging


Cleaned up the code and kept testing. I was fairly sure that the Python side was packing up the right transactions and sending them in, but the FPGA was not storing them in core nor fetching the contents back. A flaw in my FSMs I have to hunt down.

Study of the code showed one weakness. My FSM that loops through 1 to 11 words of cycle stealing for a CS read or CS write transaction began as we were reading in word 2 of the incoming words, but it should wait until the full incoming payload is latched so that we store the correct data into core. I found a few other more minor issues as well, but stuck in some diagnostics to prove that I was seeing certain signals and states of FSMs.

Two steps forward but one step back. Somehow I am out of sync again on my output words. This will take a bit of headscratching to figure out.

Sunday, January 24, 2016

Now processing 12 word transactions, working on core load function


I finally diagnosed the problem and corrected it. I was using an opencores module for the use of the EZ USB link that comes on the Ztex board. The documentation is rudimentary, just comments on the instantiation template.

There was a comment "when DI_READY is 0, DI and DI_VALID are hold". Are hold? That means I should only switch the value to be written out when the ack signal DI_READY tells me the prior word was accepted to go up to the PC. I had been changing it when I de-asserted the DI_VALID output. A change to my FSM from a Moore to a Mealy type got me what I wanted.

I moved on to testing the core memory load from the PC file, which exercises my new Python code to pack transactions with up to 12 words, but flush them when a change of core address or other event requires me to write out the previous partial block. I discovered some minor issues but also a slightly bigger design mistake I made in the Python code.

When reading from core, I could accept a full 12 words by putting the first core value in word 1 of the response, while on most transactions the first word of the response echoes the command sent. However, for writing to core, the first word of the transaction MUST be the command, thus I only can send 11 words in the remainder of the transaction. Oops, clearly didn't think this through.

I modified all the code to handle 11 word transaction packages. Back to testing after dinner. Debugging the packing and flushing code, which ensures that I get the transactions down to the FPGA. I will then have to debug the logic that actually writes these transactions out to the 1130 core memory. Each step requires creation of some debugging instrumentation, thus it moves slowly.
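The packing and flushing logic can be sketched like this. The command code and the send_packet callback are placeholders of my own; the real transaction layout differs, but the flush-on-jump and flush-on-full behavior is the idea being debugged:

```python
# Hypothetical sketch of packing core writes: word 1 must carry the command,
# so each transaction holds at most 11 data words. A partial group is flushed
# whenever the target address jumps or the group fills.
CMD_CS_WRITE = 0x000B    # illustrative command code, not the real one

def pack_core_writes(words_with_addrs, send_packet):
    """words_with_addrs: iterable of (addr, word) pairs in load order.
    send_packet(packet, start_addr) stands in for the USB write transaction."""
    group, start = [], None

    def flush():
        nonlocal group, start
        if group:
            # one 12-word transaction: command word, then up to 11 data words
            send_packet([CMD_CS_WRITE] + group + [0] * (11 - len(group)), start)
            group, start = [], None

    for addr, word in words_with_addrs:
        if start is not None and addr != start + len(group):
            flush()                  # address jumped: write out the partial group
        if start is None:
            start = addr
        group.append(word)
        if len(group) == 11:
            flush()                  # group full: write it out
    flush()                          # final partial group
```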

Saturday, January 23, 2016

Logic to return values from FPGA not working perfectly

Today I have a wake for a good friend which will run until late tonight. I did get a bit done but not much free time. Tomorrow I have a board meeting for the Digital Game Museum, but that will only be a half day.


I resumed testing of my new 12 word transaction packet with the SAC Interface board. I found that my machine state poll got back data that didn't match my expectations - I seemed to inject an extra word and get the data packets out of sync. I am looking over FPGA logic and inserting some diagnostic printing to help me chase this down.

I made some changes to the fpga but got the same results. The inbound transaction is 0xA54C which is command 12 (poll state of machine), device 20, and the standard pattern 101010 with the next 11 words set to 0x0000. What I should see is the command word 1 echoed as the first returned word, then status words with a prefix of 1, 2, 3, 4, 5, 7, 8, 9, 10, 11 and 12. Instead I see the command word in the first two return words, then the status words offset by one word.

Friday, January 22, 2016

Solve problem with 12 word transactions to SAC Interface unit


Since I had devoted some of the diagnostic pins from the fpga board to new functions - interrupt levels 0 and 1 request and state, plus the command to issue a program load sequence - my old connectors to the bank of LEDs won't fit. I had to make up a new set of connections to LEDs.

I also have a dim memory that certain of the diagnostic pins are dead - but not which ones. I set up a process to alternate through all the LEDs one by one at intervals of a tenth of a second. That will allow me to know which pins are really available before I assign them to diagnostic state for my transactional engine logic.

The first time it ran a bit too fast, but with it slowed to once per second I could easily check out all the pins. I then found LED 5 of the first block of 8 was not working. Since I only had seven signals to play with, I just set them up as 1, 2, 3, 4, 6, 7, 8 and was ready to begin debugging.

My first set of diagnostics was intended to display in binary the step in the PC link FSM where the machine had stalled. I can encode all of the 50 states in just six LEDs, so I used the remaining one to show me that the main transaction FSM had moved off the idle position. The results were unexpected. I saw the lights flash through an entire transaction and go back to the idle state for both the PC link and the transactional FSMs.

The fpga fired off the transaction, wrote out 12 words and returned to wait for the next transaction from the Python program in the PC. However, the Python code timed out trying to read the first inbound words. This is puzzling. I guess I need to put some diagnostic code in the Python program to figure out what is happening.

I wrote the 12 words out with pauses at strategic points, allowing me to see if the FSM for the link is at an appropriate place. Similarly, I pull the returned words in batches with pauses. I should have an idea where the failure is occurring when I run this code.

It occurred to me that this may not be a timeout, even though my error message reports it as that. I have the USB read in a 'try' block and assume any exception is a timeout. There may be a subtle error being thrown which I am masking because of my assumption. I corrected the code so that it prints the exception itself, allowing me to understand what is occurring.

When I ran the test, I confirmed that this was not a timeout due to the fpga getting out of sync. I had an 'Overflow' error, which is because the fpga wrote 12 words but I was reading only the first 2. The PyUSB module throws an exception if the packet is shorter or longer than the quantity in the read statement. I restructured the code to read all 12 words at once and was back in business. The monitoring function was polling and receiving twelve words per transaction.
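The fix amounts to sizing the read for the whole packet. A sketch of the idea follows; the endpoint number, the ZTEX device IDs and the byte order are placeholders, not the actual values:

```python
# Sketch of reading the full 12-word reply in one PyUSB call.
# Endpoint, vendor/product IDs and byte order below are assumptions.
import struct

WORDS = 12

def unpack_words(raw):
    """Unpack 24 bytes into 12 big-endian 16-bit words (byte order assumed)."""
    return list(struct.unpack(f">{WORDS}H", raw))

def read_reply(endpoint=0x81, timeout_ms=1000):
    import usb.core                                      # pip install pyusb
    dev = usb.core.find(idVendor=0x221A, idProduct=0x0100)  # ZTEX IDs: assumed
    # one read sized for the entire packet; asking PyUSB for fewer bytes than
    # the device sends is what raised the 'Overflow' error described above
    raw = dev.read(endpoint, WORDS * 2, timeout_ms)
    return unpack_words(bytes(raw))
```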

It is late at night and my testing will end for today. Once I verify that the returned data is correct, matching the 1130 machine status, my next two tests are loading core from a PC file and virtual 2310 disk IO. These both depend on my logic to buffer up words in groups of up to 12, flush a group when needed and pass those down to the FPGA for storing in core.

Read cards for 1401 music producing program, debugging my 12 word transaction engine on the SAC Interface


I fired up the system for testing and found that my fpga and PC transactional links are no longer in synchronization. The Python program writes 12 words out and pulls 12 words back, but times out during this activity before the first monitoring transaction has completed.

The error is almost certainly on the FPGA side - time to scrutinize the logic for the link and the transaction. I will also look at ways to use some LEDs as diagnostic signals from the FPGA. I found some places that looked suspect in the basic FSMs for transactions.

My first tests show me that I wrote the 12 words out but when I tried to read back the 12 inbound words, I timed out trying to fetch the first pair. Definitely time to hook up the LEDs to the remaining few signals on the FPGA board and clue me in to where I am stalling.

Once the 12 word transactions are reliably swapping for the background status monitoring task on the PC, the one that displays the registers, clocks and so forth, I can debug the slightly more complicated code to load core with a PC file, stuffing up to 12 words per transaction. After those work, it should be straightforward to get the mirror 1053 and virtual 2310 functions working.


One of my teammates at the 1401 restoration group, Stan, found several music playing decks for the 1401 (picked up by an AM radio held near the system while the programs are running). He asked me to scan these in so that another member, Van, can examine the code and work out the best version for demonstrations at the museum.

I used my Documation card reader to read these as both ASCII and raw card files. The ASCII files use a translate table developed for most IBM 1130 Fortran and Assembler decks, selecting which hollerith patterns (holes on the card) become certain ASCII characters.
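For the curious, a toy version of such a translate table. Columns are held as 12-bit masks for rows 12, 11, 0, 1 through 9; only the standard alphanumeric punch codes appear here, and the bit ordering is my own convention for illustration:

```python
# Toy hollerith-to-ASCII translate table; row 12 is the high bit, row 9 the low bit.
# Only the standard alphanumeric codes are shown; real decks need the full table.
R12, R11, R0 = 0x800, 0x400, 0x200
DIGIT = {d: 0x100 >> (d - 1) for d in range(1, 10)}   # rows 1..9
DIGIT[0] = R0                                         # digit 0 is a lone row-0 punch

TABLE = {0x000: ' '}                                  # no punches = blank
TABLE.update({DIGIT[d]: str(d) for d in range(10)})
TABLE.update({R12 | DIGIT[i + 1]: chr(ord('A') + i) for i in range(9)})  # A-I: 12 + 1..9
TABLE.update({R11 | DIGIT[i + 1]: chr(ord('J') + i) for i in range(9)})  # J-R: 11 + 1..9
TABLE.update({R0  | DIGIT[i + 2]: chr(ord('S') + i) for i in range(8)})  # S-Z: 0 + 2..9

def card_to_ascii(columns):
    # untranslatable punch combinations come out as '?' in this sketch
    return ''.join(TABLE.get(c, '?') for c in columns)
```

The 026 versus 029 differences mentioned below all live in the special-character entries that this sketch omits.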

I had some read checks on one deck, because blank cards had been inserted to separate what is probably different songs within the deck. To make them visibly distinct, someone picked red card stock. Since they are totally blank, the orientation shouldn't matter except for one weakness that remains with the Documation readers.

They were notorious for throwing read checks if cards had the diagonal notch on the right upper corner of the card. I had developed and installed a hardware fix that eliminates this error, by ignoring the top row of the card when the machine is doing its error check at the column 81 time. In the blue music deck, one red card was placed with the diagonal notch at the lower right corner, where it tripped a read error. I could have added row 9 to the hardware modification, but this is a very unlikely situation which was easily rectified by flipping the blank red card around.

The ASCII mode is not a complete translation, both because some 029 keypunch characters don't exist in ASCII, and because the 1401 used 026 keypunches with different assignments of hollerith to BCD and therefore to ASCII. Finally, the mapping of card columns to text files is inexact.

This is why I included a raw card image file that ensures zero data loss from the cards - Van can use the combination of the two formats to be sure he has workable programs and data. The work was done quickly (after I found the errant position of the red card) and the files emailed back to Van and Stan.

Wednesday, January 20, 2016

Work at CHM, new lift cart and continued work on SAC Interface logic

I had ordered a cart with hydraulic lift, having seen the lifts used by some fellow enthusiasts, to handle the disk and tape drives that typically weigh between 50 and perhaps 100 pounds each. I found a low price version at Harbor Freight, gambling that the sometimes shoddy quality of their products wouldn't be an issue.

The box arrived all twisted up because the packaging was totally inadequate to ship a heavy metal cart like this, with the rolling wheels poking out of torn holes in the bottom of the carton. I found it missing two bolts which had obviously fallen out of the breached cardboard, but they are standard bolts I can pick up. Otherwise, it works fine.


I put in more time on two tasks - carefully thinking about the new flow of the 12 word transactional engine, and filtering some of the remaining 'false alarm' warnings from the Xilinx software. Still not ready to test on the machine until I have a decent level of comfort that the basic code should work right.

The other task I worked on was completing the local (fpga) handling of skip, space, accurate time simulation and all that goes with it, leaving the PC side with only the need to fetch print lines, translate them to ASCII and write to a PC file. It doesn't have to handle anything beyond XIO Init Write (to print a line) and receiving the number of blank lines to insert as a consequence of any space or skip commands that were handled locally in the fpga.
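A sketch of how thin the PC side becomes under this split; all of the function and method names here are hypothetical placeholders, not my real interface:

```python
# Hedged sketch of the slimmed-down PC side for the virtual 1403: the fpga
# handles carriage control and timing, so the PC only fetches a print line,
# translates it, writes it out, and inserts however many blank lines the
# fpga reports for the space/skip it already executed locally.
# fetch_print_line, fetch_blank_line_count and the translate callable are
# invented placeholders.

def service_printer(link, outfile, translate):
    words = link.fetch_print_line()           # cycle-stolen line buffer
    chars = []
    for w in words:                           # two columns packed per word
        chars.append(translate(w >> 8))
        chars.append(translate(w & 0xFF))
    outfile.write(''.join(chars).rstrip() + '\n')
    # the fpga modeled the space/skip timing; it just tells the PC how
    # many blank lines that produced in the output image
    outfile.write('\n' * link.fetch_blank_line_count())
```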


I stopped by the Computer History Museum at midday to fix a power supply that was giving others a problem. It was one of a pair of units that each provide 20A of -12V filtered power. The challenge was that this unit was the earlier version while we only have schematics for the successor version. Quite a bit different - 10 power transistors versus 6 on the newer unit, major restructuring of the overvoltage (crowbar) circuit, and other changes.

After I sorted out some of the differences I came to see that the voltage regulator circuit needs a reliable -6V reference voltage in order to set the output voltage to the target of -12. It also needed a good load to test. We had some power sources which gave me the -6V reference, but sinking 20A of power was a challenge.

We have a resistor box that has four 50 ohm and four 3 ohm resistors that can be individually switched in and out of the circuit. With all eight switched in, the box has a resistance of .675 ohms and will draw 17.777A at the -12V output of the power supply. Not the full 20A but not too far from it, particularly since the real application in the 1401 computer doesn't deliver exactly 20.0 amps.
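Sanity-checking the load numbers: the exact series/parallel wiring of the switched resistors isn't spelled out here, so this sketch simply takes the stated net resistance of .675 ohms and works out the draw at -12V:

```python
# Quick check of the load-box arithmetic, using the stated net
# resistance rather than reconstructing the box's internal wiring.

def load_current(volts, ohms):
    return volts / ohms

i = load_current(12.0, 0.675)
# about 17.78 A -- close to, but short of, the full 20 A rating
```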

After figuring this out and some work, I had it adjusted to deliver -12V under load and the output voltage didn't vary more than a few hundredths of a volt between 4 and 17.777 A of load. I put the oscilloscope on the load to test the ripple and noise from the power supply. Setting the scope down to its most sensitive setting, 5 mV per division, I saw essentially zero ripple. The line wasn't ruler flat, but it appeared that the fluorescent lights above were coupling into the probe cable to cause the microvolts of noise, not any AC supply related ripple.

Tuesday, January 19, 2016

Improving fpga efficiency and improving the 1403 function

I enjoyed a dinner with a fellow restorer/collector of old tech last night, nice to meet some of the people I interact with online. Guy and I have only a small overlap in what we work on - my P390 and his MP3000 - but similar approaches and interests overall.


A fellow enthusiast taught me how to filter the messages from the Xilinx toolchain, since I get over 700 'warning' messages from my logic, with very few that really matter. Finding them among the many false alarms has been a headache, but Marc showed me how to filter individual messages.

It was painstaking work as I couldn't select groups of messages, having to select then mark each individual message. However, by the end of the day I had filtered out more than 500 messages that are completely unneeded. I left some messages that reflect intentional choices but will be problems once I fill in some portions of the logic. I can scan through 200 messages to look for real problems I need to address.

I also worked on the restructuring of both fpga and PC logic to make use of the 12 word transaction size. At the same time, I decided on a change to my 1403 printer functionality, putting the virtual carriage control tape and all the logic to respond to it down inside the fpga.

I will begin with a fixed carriage tape, having holes in channels 1 to 8, 10 and 11 on the first 'line' and a hole in channels 9 and 12 on line 66. The PC will still need to know about skips, but I can return that easily so that the PC output file can mirror what a real printer would do. The fpga will model the time that a 1403 takes to accomplish the skips and line spacing. All the PC has to do is cycle steal to fetch the 66 words or fewer that contain one print line, two columns per word.
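One possible representation of that fixed tape, purely illustrative rather than the fpga encoding:

```python
# Sketch of the fixed virtual carriage tape: 'lines' are rows of the tape,
# channels are hole positions. Holes in channels 1-8, 10 and 11 on line 1;
# channels 9 and 12 on line 66.

FORM_LENGTH = 66
TAPE = {1: {1, 2, 3, 4, 5, 6, 7, 8, 10, 11},
        66: {9, 12}}

def lines_to_channel(current_line, channel):
    """How many line advances a 'skip to channel' needs from current_line."""
    for step in range(1, FORM_LENGTH + 1):
        line = (current_line + step - 1) % FORM_LENGTH + 1
        if channel in TAPE.get(line, set()):
            return step
    raise ValueError("channel never punched on this tape")
```

The fpga would walk this table to count line advances, then model the 1403's skip timing for that many lines and hand the count up to the PC so the output file can carry matching blank lines.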

Monday, January 18, 2016

Modifying SAC Interface to use larger USB payload


I devised a way to minimize the scope of changes as I switch my basic USB transactional protocol from a 2 word based swap to a 12 word protocol (to give me 10 to 11 words of data payload). Many FSMs are interlocked together but originally based around the concept that word one is the command and device number while word two is the data. Thus, machines wait until the transaction gets to the point where data should be latched or output data should be set up to write over the link.

My idea was to add in the additional 10 words both incoming and outgoing, but make the FSMs which are synced with this wait for the last word to come in before they take action. They won't know that there are ten other words that arrived, they will only look at the first two words just as they did yesterday. If I can get this change working properly, with a corresponding set of changes to the Python code on the other side of the link, then I can take additional steps to exploit the larger packet.

In addition to the neutral change (hopefully) that allows existing functions to work believing that transactions are still two words each way, I built out the data to optimize both device polling and status polling, making use of the twelve word block to send all relevant data in a single transaction. What will be more complex is the logic to pack multiple words of cycle steal data transfer in one transaction, so that should wait until this first set of changes is debugged.

I went ahead and made the cycle steal changes too. This required logic changes in the Python code, which in many places is structured around a one word at a time transfer. I need to pack up data 12 words at a time, while flushing incomplete blocks when necessary.
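A minimal sketch of the packing helper, with the padding convention assumed rather than taken from my actual code:

```python
# Hedged sketch of packing words 12 at a time on the Python side, flushing
# a short final block with zero padding. The pad value and framing are
# illustrative, not the real protocol.

BLOCK = 12

def pack_blocks(words, pad=0):
    """Yield fixed-size 12-word blocks, zero-padding the last one."""
    for i in range(0, len(words), BLOCK):
        chunk = words[i:i + BLOCK]
        yield chunk + [pad] * (BLOCK - len(chunk))

blocks = list(pack_blocks(list(range(15))))
# two blocks: the second carries 3 real words plus 9 pad words
```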

Sunday, January 17, 2016

Performance analysis of virtual disk drive and improvement plans


On the testing front, I tried the new DCIP core load and my virtual 2310 logic to dump some sectors and test out initializing a virtual blank cartridge. Once the card reader is debugged I will boot a virtual DMS boot card from the reader and then continue the IPL on the virtual 2310 disk drive.

First test, dumping a sector, worked perfectly now that I have a core load of the DCIP utility that did not falsely discover a 1403 printer, as had occurred previously on the IBM 1130 simulator until I fixed a bug in its printer support.

I then attempted to initialize a virtual disk cartridge using the utility. I had not previously fully debugged the write functionality and bumped into a Python code bug. With that repaired, I tested again. I ran an initialize, which is faster than an analyze. The init code writes and then reads a pattern three times for each of the 1600 sectors, which is 9,600 I/O operations, whereas the analyze will do 16 for each sector, or 25,600. It will only take 3/8 the time to do an initialize as to do an analyze of a virtual cartridge. I still expect this to be very slow, probably most of a day, because of the large negative performance ratio of the virtual disk to a real 2310 drive.

I ran the initialize for ten minutes and then copied the virtual disk over to check how far it had gotten on the simulator. I discovered that I had forgotten to 'seek' the virtual disk cartridge file on the PC thus was always writing over the first sector. Corrections made, I got back to the testing.

I discovered there was a silent exception taking place during my attempt to seek or write the new sector back into the file. I did a bit of research on the file IO of Python and intended to stick in some instrumentation to narrow down the issue. However, it was horrifyingly clear right away that I hadn't opened the file for read-write (rb+), only for reading. With that fixed, the program began initializing.
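Both file-handling bugs are easy to fall into, so here is a minimal sketch of the corrected pattern: open with 'rb+' so writes are allowed without truncating the image, and seek to the sector's byte offset before every transfer. The 321-word sector matches the 2310; the flat file layout is my assumption about the cartridge image format.

```python
# The fixes in a nutshell: 'rb+' (read-write, no truncation) instead of
# 'rb', and an explicit seek so writes stop landing on the first sector.

WORDS_PER_SECTOR = 321
BYTES_PER_SECTOR = WORDS_PER_SECTOR * 2     # 16-bit words on disk

def write_sector(path, sector_number, data):
    with open(path, 'rb+') as f:            # read-write; 'rb' would raise
        f.seek(sector_number * BYTES_PER_SECTOR)
        f.write(data)

def read_sector(path, sector_number):
    with open(path, 'rb') as f:
        f.seek(sector_number * BYTES_PER_SECTOR)
        return f.read(BYTES_PER_SECTOR)
```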

The utility initializes disks by writing each sector of a cylinder with three fixed patterns. It first writes 0xAAAA to all eight sectors, then reads those sectors back. It writes 0x5555 to all eight sectors and then reads back the eight. Finally, it writes 0x0000 to the eight and reads them back. That gives us 48 I/O operations per cylinder.

I timed a sequence of writing eight sectors, somewhat slower than it will ultimately be because the Python program wrote a diagnostic message to my debugging console for each write. It took just under 7 seconds to complete the sequence. That gives me a sense of the speed ratio, which is not good. My virtual disk is more than 25 times slower than a real 2310.

On a real disk, if each write required a complete rotation (we missed the next sector and had to rotate once) it would take 40ms per rotation or 1/3 second for the batch of 8 to complete. If we had only two misses, getting all four on each side of the platter in one burst but having to miss once to start and then once for the head switch delay, it would have taken less than half the time. I will split the difference and say 1/4 second as an outside value.

Seeking one cylinder at a time takes about 60ms. Thus, a full pack initialize is 12 seconds of seeks and 300 seconds for writing and reading each cylinder three times. My seek will be just as fast as a real drive, but the read/write time will balloon to over two hours versus the 5 minutes this task should take on a real 2310.
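The estimate above can be reproduced from the figures in these posts (200 cylinders, 60 ms single-cylinder seeks, 48 I/O operations per cylinder for an initialize, and roughly 7 seconds per 8 virtual writes):

```python
# Back-of-envelope version of the initialize-time estimate, using the
# numbers quoted in the text.

CYLINDERS = 200
SEEK_S = 0.060              # one-cylinder seek
OPS_PER_CYL = 48            # 8 sectors x (3 writes + 3 reads)
VIRTUAL_IO_S = 7.0 / 8      # observed ~0.875 s per virtual I/O

seek_time = CYLINDERS * SEEK_S                      # 12 s of seeking
virtual_io = CYLINDERS * OPS_PER_CYL * VIRTUAL_IO_S
# virtual_io is 8400 s -- over two hours, versus roughly 5 minutes real
```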

With the same timing, an analyze, which reads each sector 16 times, or 128 I/O per cylinder compared to the 48 per cylinder of the initialize, will run 9+ hours on the virtual drive but only about 20 minutes on a real device. A disk compare or disk copy would be a faster operation, requiring only two IOs per sector or 2+ minutes total time on a pair of real drives. With my current slowdown ratio, it would be about an hour long event, tolerable but not fun.

If I can slash the slowdown to perhaps 5X instead of 25+X, I could live with most of these times. I would never do an analyze of a pack, because that would still take on the order of 100 minutes to accomplish. Since there will never be an error on the virtual cartridge, it is a pointless operation to run anyhow. Initializing for 25 minutes is bearable, especially since I could init the pack on the IBM 1130 simulator on the PC side as easily.

The main operations where the utility really makes sense are disk to disk copies and compares, where my existing implementation requires a painful near-hour but a 5X speedup gets this to a tolerable 10 or so minutes. The bottom line, however, is that even at current speeds this is good enough to be usable, even if it would be better to improve the speed.

Time to figure out exactly where the slowdown comes from:

  • Python interpreter
  • Inefficient code in Python
  • locking and queue management overhead in my Python code
  • delays in USB IO and transit time
I am spending some time finding instrumentation to record these various factors. If there is one major cause that is reasonable to attack, it becomes the next task. My first measurement was to take the core load for the DCIP utility, which involves more than 4000 transactions across the USB link, and measure the time it took to execute. I saw roughly 9.7 seconds from start until it finished, which works out to less than 2.5 ms per transaction, and that includes reading from a PC file and processing the text as well as the actual transactions over USB. A memory cycle on the 1130 is 3.6 µs, so the real machine is vastly faster than my link.

A disk I/O will involve under 330 transactions in total, cumulatively 0.8 seconds, while a real 2310 read or write takes .01 to .04 seconds depending on rotational delay. This is the root cause of the speed differential. My core load function runs at a mere 0.15% of the rate of the real machine, with a disk I/O requiring 321 core load transactions.

I have to determine what part of this is due to transactional limits, so I created a python thread that does nothing but issue transactions to the fpga, saving the time when we start the loop and the final time when we finish a given number. I can do this submitting these to the thread safe queue but also I can do this directly at startup to time the pure speed of the link out of Python.

My basic round trip transactions took about 1.6 ms each, it didn't matter if it was a very simple one that just fetches the last value from an XIO or if it involved a complete cycle steal memory cycle to read or write to core memory - the answer was identical. With a 480Mbps data link over USB, the delay is almost completely unrelated to the number of bytes transferred but instead reflects the pure overhead of a transaction.
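As a sketch, the measurement loop amounts to something like this, where the link object is a stand-in for my actual USB interface class:

```python
# Sketch of the round-trip timing loop: hammer the link with N transactions
# and derive the per-transaction cost. link.transact() is a placeholder
# for one two-word round trip over USB.

import time

def time_transactions(link, n=1000):
    start = time.perf_counter()
    for _ in range(n):
        link.transact()
    return (time.perf_counter() - start) / n   # seconds per transaction
```

Running this both through the thread-safe queue and directly at startup separates the queueing overhead from the raw link speed.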

I am planning out a change to the link where I will send 12 word transactions rather than 2 word ones. This will give me up to a tenfold increase in performance for core access and other data transfers over the link. It also allows me to consolidate all the device oriented information in one transaction, where previously I had separate calls to ask about XIO function, WCA, modifiers, etc.

There are quite a few FSMs in the fpga logic that are tightly synced to the transactional flow, thus I have to be extremely careful in the change to avoid breaking any of those aspects. This will take some time to get right but the performance advantage will be significant for the virtual 2310 and other high speed peripherals.

Meanwhile, I wanted to start some testing on the virtual 2501 card reader logic, since this was in decent shape when I last worked on it. I wanted to test out the 2501 reader, then the 1403 printer, while I am plotting out all the changes to make for the 12 word transaction approach.

Unfortunately, my Acer laptop that I use for the Python side of the link began exhibiting bizarre behavior, with the cursor flying all around the screen like it was possessed, even after a reboot. This suggests the hardware is taking a dive, which is unfortunate (and poorly timed).

Saturday, January 16, 2016

Working on other virtual peripheral functions of the SAC Interface Box


My changes to the 1130 simulator required me to split the logic to maintain two separate DSWs, since the way the utility works (and other utilities) is to issue certain XIO commands to both 1132 and 1403 printer addresses in order to determine the fastest printing device that is ready to print. With the split, the DCIP utility works perfectly in the simulator. I created a load file to use with the real 1130 system.

I next turned to the Python side code for the virtual 2501 card reader and virtual 1442 card reader/card punch devices. Armed with the lessons I have learned debugging the virtual disk, I scanned to see if the division of labor between PC and fpga needs to be changed for either of these functions.

I saw the same flaw with interrupt causing status bits in the DSW which need to be reset in the fpga, immediately upon an XIO Sense DSW with Reset, rather than up in the PC code. Both the 2501 and 1442 functions need this change.

In addition, the current 1442 function sends every level 0 interrupt for a card column up to the PC where a single column of data is sent back. This will be very slow and inefficient. I will rewrite the code to use 80 column buffers that are written or read by the PC as an entire virtual card, allowing all the detailed column by column stuff to be done fully in the fpga.

The way this will work is that the PC sees the XIO Control that does a start read or start punch or line feed, all things associated with the PC file which stands in for card decks. The PC will push down the card image to the fpga and trigger it to begin emitting the interrupts on level 0 that cause the program to issue XIO Read (or Write if punching) to move one column from the buffer.

The end of the read of a designated card image will be reflected in a pseudo XIO code to the PC program which is polling for this completion and pushes an op complete DSW, after which the fpga side handles the interrupt and sense DSW functions. For a punch, the fpga will not trigger op complete, instead waiting for the PC to extract the newly written card image from the buffer and then pushing down a DSW that fires off the op complete. A bit of logic to write in the fpga but should be fully interlocked.
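A rough Python-side sketch of that polling flow, with the pseudo-XIO codes and link methods being invented placeholders rather than the real interface:

```python
# Hedged sketch of the PC-side polling loop for the 1442 handshake.
# READ_DONE/PUNCH_DONE and the link calls are hypothetical; only the
# ordering follows the design: for a punch, pull the card image up
# before pushing the DSW that fires op complete.

READ_DONE, PUNCH_DONE = 0x01, 0x02          # hypothetical completion codes
OP_COMPLETE_DSW = 0x8000                    # hypothetical DSW bit

def poll_1442(link, deck_in, deck_out):
    event = link.poll_completion()
    if event == READ_DONE:
        # fpga finished clocking out 80 columns; report op complete
        link.push_dsw(OP_COMPLETE_DSW)
    elif event == PUNCH_DONE:
        # extract the newly punched image first, then fire op complete
        deck_out.append(link.read_punch_buffer())
        link.push_dsw(OP_COMPLETE_DSW)
```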

Friday, January 15, 2016

Fighting Visual Studio and other toolchains - hours frittered away attempting to compile a few minutes of changed code


I was chasing down the flaws in the IBM 1130 simulator, wherein when I boot the DCIP utility and try to dump a sector, it incorrectly believes the 1403 printer is there but not ready. This is due to a bit that is returned to a Sense DSW for the 1403 (0x0080) that should be off.

The utility does strange things to the program if it thinks it has the 1403, which blocks an 1132 from working. In order to create a core load for DCIP that I can use properly, I need to boot it into the simulator where it thinks there are neither 1403 nor 1132 ready. I have to fix the bug.

Compiling the simulator was coming along nicely, once I fixed some makefile absolute disk path references that were residual from the system where the distribution was built. That is, until I had to link in a static version of the libgd library, which means I have to first download and create that library. Since that brought its own clunkiness, and the only function in the 1130 that uses this library is the 1627 plotter emulation, I disabled the support.

The code for the printer support combines both 1132 and 1403 printers, but shares some fields such as the Device Status Word. This causes some problems as activities on one of the printers can flip on sense bits that are delivered for the other printer! I patched around this behavior with the goal of getting a good memory load I can use for the real 1130.

Wednesday, January 13, 2016

Faster core access working, 2310 virtual disk working, continuing to debug functionality


I began by debugging my new version of the cycle steal engines that slashes the number of USB transactions needed to load or fetch from a large block of core. Beginning with my function to load core with a PC file such as the DCIP utility, then on to see what speedup I have gained with the virtual 2310 disk capability.

I had technical challenges debugging the memory load function since the GUI update was synchronous with the load function, blocking me from viewing the SAR value until the entire load process was done. I did play around with moving the load into a separate thread, but various Python functions that are not thread safe would cause a problem and more importantly, the diagnostic print statements won't appear until the entire program stops.

I looked at what was in memory and suspect that I have a race hazard somewhere in my fpga logic that is affected by a rapid fire string of cycle steal writes, such that sometimes data is stored in the wrong location. I looked at the code that was loaded by my function and found that about every 10th word had the wrong value in it. It is not lost writes, because the remaining data is correct and in its place.

More time needed to think and to look over the various FSMs in my design. I could build a special diagnostic core load that consists of the word addresses as their contents - allowing me to look for patterns in what is written and where. However, I did spot a potential race condition and repaired it, so that was worth a new test without changing the instrumentation.

I loaded core with all zero values then ran a few lines from the core load file and found it was skipping every other word. This tells me my logic is double-incrementing the SAR when I do a read, something I should be able to find and fix rapidly.

There was a race hazard where my FSM to bump the SAR would go back to idle while the trigger signal was still active, thus cycling twice. This was fixed and I now load core and move data as a virtual 2310 disk drive. The core load process is quicker but the virtual disk activity wasn't dramatically faster. The big delay appears to be over in the Python program.

When I have time, I will go on the 1130 simulator and build core load files for the 2310 disk diagnostic (and others such as the 1132 diagnostics) which will be handy to run. I also will create a boot card image so that I can attempt to boot DMS 2 from my virtual 2310 drive. 

Monday, January 11, 2016

Disk function appears to be working but slow - working on efficiency changes


When I ran the DCIP utility and asked it to dump a sector to the console printer, the typewriter tabbed back and forth a few times, typed RRR? and then went back to its main menu. I thought it might be a problem with the 1053 itself, so during one run I turned on my mirror driver to capture what was sent by the program to the console printer. Lo and behold, it was the same junk as I was seeing physically.

DCIP run with garbage looking dump output

I tested with my new instrumentation that would tell me:

  • If I ever triggered on an XIO Control for the disk (seek)
  • If I latched up the values for the seek
  • The relative disk seek amount that was latched

When I ran the test it was clear that I was not triggering on the seek at all. Very mysterious. After some testing with my XIO Control module, I discovered that it was not working properly, bailing out before it had a valid data word (which is the seek amount in the case of a disk drive). I updated the logic and retested. It worked this time!

When I requested that the DCIP program dump sector x0123 of 2310 disk drive D (my virtual drive), it did a seek to cylinder 36 and then read sector 3 of that cylinder. The contents were an exact match for the virtual sector on the PC disk cartridge file.

Sector x0123 is 256 + 32 + 3, or 291 decimal. There are eight sectors per cylinder, so dividing 291 by 8 gives cylinder 36 with remainder 3, which is sector 03. Sector numbers range from 00 to 07, with the first four on one head and the other four using the other head to read from the same cylinder, as information is recorded on both sides of the platter.
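The address arithmetic can be captured in a couple of lines:

```python
# The sector-address arithmetic above: 8 sectors per cylinder,
# four on each surface of the platter.

def decode_sector_address(addr):
    cylinder, sector = divmod(addr, 8)
    head = sector // 4          # sectors 0-3 on one head, 4-7 on the other
    return cylinder, sector, head

# 0x0123 = 291 decimal -> cylinder 36, sector 3, head 0
```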

My diagnostic indicators were 0xE4 - the top two bits show that a Seek command was recognized and that the value was latched up. The remaining six bits are the seek amount, in this case 0x24, which is 32 + 4 or . . . 36. I will change the diagnostic to drop my special bits and show me the active cylinder, not the seek amount. That would be the same value for a single sector dump, but would increment if multi sector dumps or other processing issued more than one seek.

I planned for some other testing:

  • Analyze of the virtual disk pack which should do IR Check for all 1600 sectors. Not sure if it tries to read cylinders 200, 201, or 202, which are the alternatives that would be used if there were a defect discovered on up to three of the cylinders in the range 0 to 199. We shall see.
  • Initialize a virtual disk drive, which will test the write logic. Should write only a few sectors.
  • Format a virtual disk drive, which will write all 1600 sectors and test for errors that might warrant assigning an alternate cylinder.
  • Hook up the 1132 printer and dump some sectors to the printer, since my typewriter is too flaky to type out 321 words per sector. 
Two issues arose which I need to deal with before I do the Initialize and the Format tests. First, even with the 1132 hooked up and ready, the DCIP program did not try to print. The same thing happens on the 1130 simulator with this program - it only works with a 1403 printer set up for the output, not with 1132 configured. I have to sort this out so that I can print sectors to my line printer. The other issue is speed - it is over 50 times slower than a real drive.

The DCIP program reads the cylinder 16 times, thus it is 128 read operations. On a real drive, even if there were a complete rotational miss on every read try, it would take a bit more than  5/100 second per read, thus I should be finishing each cylinder in way under a minute. Instead, it seems to take somewhere between 3 and 6 minutes per cylinder (I didn't time it but that is my sense of the glacial speed).

I know that my logic for cycle steal read/write can be optimized to be 2-3 times faster than it is currently. When I load and then verify the contents of core from a PC file, it uses the same process and I can see that it is kinda slow. This is definitely one factor, since the analyze function is doing a real read into memory, not a read check that skips the transfer. There is a straightforward way I can juice up the speed.

Some of this is speed of the Python program, but it is also the introduction of the async polling transactions from the 2310 code which slow things down quite a bit. I will see how I can speed up both sides a bit. On a real machine, you might have to wait 15-20 minutes to analyze a cartridge, but that is way better than waiting almost an entire 24 hour day at the current performance.

My simple fix is to use a second kind of cycle steal read/write command which reads from the current address and bumps it by one. Thus, to access a sequence of locations I would only have to set the address at the beginning with one transaction and then just write words over and over until done. Now, a transfer of a 321 word disk sector requires 642 transactions but could be just 322 with the new scheme.
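The transaction arithmetic, for the record:

```python
# Transaction counts for a 321-word sector transfer: the old scheme spends
# one transaction to set each address and another to move each word; the
# auto-increment scheme sets the address once and then streams the words.

SECTOR_WORDS = 321

old_scheme = 2 * SECTOR_WORDS        # set-address + transfer, per word
new_scheme = 1 + SECTOR_WORDS        # one set-address, then word after word
# 642 transactions drop to 322
```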

I will also ponder restructuring the link between PC and fpga. Since each transaction is initiated by the PC side and is self contained, two words in and two words out, I suspect that there is a lot of overhead for the relatively few bytes of payload transferred. If I increase the size of a transaction, I amortize that although the space is wasted on the simple polling transactions. Writing or reading blocks of data from core and similar transfers would be faster.

The current process for providing results to transactions is tied to the short fixed size, which would therefore require a lot of redesign to become more generalized. I don't want to take this on until I am convinced that the predominance of the delay is in the protocol itself and not in the Python interpreter. First up, the simple change to turn 642 transactions into 322.

Looking at the DCIP code, it looks like the utility should be selecting, in order, 1403, then 1132, then 1053 for output. I imagine that the IBM simulator and my fpga are both returning something that the program misinterprets as availability of the 1403. I needed to pore over the code for DCIP to figure this out.

The utility will do a XIO Sense DSW to the 1403 and look for the low bit (Not Ready) to be on. If no answer or the bit is off, it then issues an XIO to write a line, then does another XIO Sense DSW. If the DSW is all zeroes, the 1403 is not there, otherwise it tries to check the 1132.

That test issues an XIO Sense DSW to the 1132 printer, looking for a forms check (bit 5 on) as a sign that the printer is not ready. If no answer or the bit is off, it then tries to print a line and does another XIO Sense DSW. If all bits are still off, then there is no printer there and it defaults to the console typewriter. I believe that my core load which I am using to run the utility has already set up for use of one of the printers. I will check the memory location of the switch to see what is loaded there.

When I inspected the memory locations, they were zero as they should be if no line printer is ready. Therefore, I have some problem that is causing the printout of the sector to be invalid. I need to study this more to see why I am having problems with the utility, but the disk buffer content is exactly right  for every dumped sector and the analyze function was working well too. My working assumption is that I have a (slow) correctly operating virtual disk drive.

Sunday, January 10, 2016

Solid progress on virtual 2310 functionality - reads well, but seeks still have a problem


The typewriter remains inconsistent in many ways. More often than not, the carrier returns are very slow, but sometimes they are jet fast. CRs don't usually latch but sometimes they do, but that is unrelated to the hi versus lo speed behavior. Sometimes it jumps over the left margin and sticks there. Once in a great while, index sticks on. Time, patience and adjustments. That is my mantra to help me deal with the flakiness.


After some careful testing I discovered a logic flaw - a timing error - where the Op Complete status bit was reset before the XIO Sense DSW had finished strobing the DSW into the 1130. Thus, the XIO Sense DSW with reset bit on would never see the condition because it had reset it too early.

The flaw was in the flipflop that holds Op Complete. It is turned off if we see an XIO Sense DSW executing with bit 15 of its modifier field turned on. The way I had coded this was to see if the XIO Sense DSW was occurring (its busy bit was on), then immediately test the modifier bit and react. What I needed to do was add a stage to the FSM for this flip flop: when I see the XIO Sense go busy, step to the next state and wait until busy drops (the Sense is done) before looking at the modifier bit and either turning the FF off or leaving it on.
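A little Python model of the corrected behavior, with invented state and signal names, shows why the extra wait state matters:

```python
# Model of the flip-flop fix: instead of reacting the moment the Sense DSW
# goes busy, the FSM steps to a wait state and only samples the
# reset-modifier bit once busy drops, so Op Complete survives for the
# whole DSW strobe. States and signal names are illustrative, not my HDL.

IDLE, SENSE_ACTIVE = 0, 1

class OpCompleteFF:
    def __init__(self):
        self.state = IDLE
        self.op_complete = False

    def clock(self, sense_busy, reset_modifier_bit):
        if self.state == IDLE:
            if sense_busy:
                self.state = SENSE_ACTIVE   # wait out the strobe
        elif self.state == SENSE_ACTIVE:
            if not sense_busy:              # Sense DSW has finished
                if reset_modifier_bit:
                    self.op_complete = False
                self.state = IDLE
```

The original (buggy) version reset `op_complete` in the first clock, while the DSW was still being strobed into the 1130, which is exactly the too-early reset described above.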

With my fpga logic recoded, I went back out to test and verified that I had a reliable DSW stored with the operation complete indicated. The next problem I had was some hangup in the disk routine inside the DCIP utility, where it should be issuing a seek based on the cylinder I requested via bit switches. My guess is that my XIO Control routine (how a seek is requested) is not working properly thus not setting the operation complete for the seek.

I looked over the seek logic in the fpga to figure out the problem. This is an issue that arises from handling XIO Seek and XIO Sense locally (in the fpga) but the Init Read and Init Write up in the PC. The PC pushes the op complete status down to the fpga for the IR and IW, but it is necessary that my fpga logic does this for a seek.

I discovered that I was directly turning off the Busy status but not setting the operation complete flip flop, thus not triggering the interrupt. I adjusted the logic such that I now set the flipflop, whose reset by a Sense DSW is indirectly how the busy is reset. I am still concerned that I didn't see the proper cylinder reflected in my GUI, but that will be the next item for testing.

Much better now, but even when I enter a sector address of 0x0003 which is on the same cylinder as my current position, the results don't match. I suspect this is a flaw in my logic to 'seek' through the PC file and return contents, as I see data in the buffer, just nothing matching what is on the virtual disk at that location.

By dinner time, I had corrected the problems with the virtual seek on the PC side and was able to read any of the eight sectors of a disk correctly. I still had the problem where my seek did not appear to be working, which left the cylinder at 0 even when I entered a sector number larger than 7 on the bit switches. I am poring over the logic for triggering on a seek, modeling the timing, capturing the seek amounts and updating the cylinder number.

When I was handling the seek in the Python program, I never intercepted any of them. It makes me wonder if there is some subtle issue that is affecting my handling of XIO Control commands, either generally or for this specific virtual device. Part of my instrumentation will be a flag to tell me if XIO Control was ever detected, helping steer me towards the cause of the problem. Also plenty of eyeball time looking over all the logic to see what might possibly be going wrong. 

Saturday, January 9, 2016

Continuing to debug 2310 function, plus news from Finland


Johannes had a failed SLT board that had to be repaired. He was successful in transplanting the needed modules from spare boards to bring the repaired board back to life. The level of integration and technology in the 1130, 360 and 1800 is just at the edge of where they can be repaired even if no exact spare parts exist. Later generations of mainframes have embodied many of the logic circuits inside IBM assemblies such as the Thermal Conduction Module (TCM) or proprietary IBM integrated circuits for which no independent source exists.

Repair of the 360 era machines can be through transplant of IBM components, as Johannes has proved. This could even include creation of substitute circuitry consisting of germanium transistors, diodes and resistors to replace an SLT module (ceramic can that sits on the boards). Some of the parts on the SLT boards are discrete transistors and resistors or capacitors either singly or in multi-part components, all of which have replacements that can be sourced from other manufacturers.


The Python program was modified to pause after it has completed a disk read or write, allowing me to put the 1130 in single step mode, check data areas and then step through the interrupt handler once I allow the PC side to continue. This will be a big help while debugging the new interaction between the PC and FPGA sides.
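The pause hook itself is tiny; here is a sketch of the idea, where `push_dsw` and the prompt wording are hypothetical stand-ins for the real link transaction.

```python
# Hold the PC side at the completion point so the 1130 can be single-stepped
# before the Op Complete interrupt fires.
def complete_transfer(push_dsw, pause=input):
    pause("Transfer done - single step the 1130 now, then press Enter: ")
    push_dsw()  # posting the pseudo DSW lets the interrupt handler proceed
```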

Testing pointed me at some issues in the FPGA logic which I will work on a bit. They involve the handling of the operation complete bit, which does not seem to be returned properly when the 1130 issues the XIO Sense Device. Also, the program is not getting far enough into its code to have issued the seek to a new cylinder.

I made some headway and updated some logic that I think was causing my problems. It was good to know that I was triggering the interrupt and then resetting it when the Sense DSW executed, which shows the mechanism is generally sound.

Friday, January 8, 2016

Working on virtual 2310 disk function in SAC Interface Unit


I took what precious time I had free during the last couple of days and carefully drew out my FSMs and other logic in the FPGA that I recently updated with new functions for the virtual disk drive adapter. I had to remove some functions from the PC side Python program because they needed tighter synchronization than I could provide over the USB link with a PC program.

It is essential that every XIO and completion of the disk read or write data transfer be properly implemented in my logic. Further, I checked that no FSMs would stall in inappropriate states.

When XIO Control (seek), XIO Sense Device, XIO Sense Device with Reset, and XIO Sense ILSW occur, my logic must handle them wholly inside the FPGA. When XIO Initiate Read or XIO Initiate Write is issued by the 1130, I must set it in the "last function" field of the UCW, wait until the PC program has picked up that function via a poll transaction, and then wait until I see Op Complete.

The PC has to accomplish the read or write from the PC file that is the virtual disk cartridge, use the 1130 cycle steal functions to do the actual transfer to/from core memory, and push a "DSW" down to the FPGA that indicates the operation is complete, setting the Op Complete status. Any time Op Complete is on in the FPGA, interrupt level 2 is triggered. When a Sense Device with Reset is issued, it turns off Op Complete and the interrupt request.

This should interlock the operation of the PC and FPGA sides - while a read or write disk operation is underway, it is waiting for the disk hardware to make it happen. In our case, the disk hardware is the PC side, which does something and then lets us know via a push of the DSW bit assigned to Op Complete.
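The PC side of that interlock boils down to a poll-and-service loop. A sketch follows; every transaction name, function code and DSW bit here is a placeholder for the real link protocol, not the actual values.

```python
# Hypothetical XIO function codes and DSW bit - the real ones live in the
# FPGA/Python link protocol.
XIO_INIT_READ, XIO_INIT_WRITE = 0b110, 0b101
OP_COMPLETE = 0x8000

def service_disk(link, read_sector, write_sector):
    func = link.poll_last_function()       # picks up the UCW "last function"
    if func == XIO_INIT_READ:
        link.cycle_steal_write(read_sector(link.current_sector()))
        link.push_dsw(OP_COMPLETE)         # FPGA sets Op Complete, raises IL2
    elif func == XIO_INIT_WRITE:
        write_sector(link.current_sector(), link.cycle_steal_read())
        link.push_dsw(OP_COMPLETE)
```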

I saw a race hazard in the way that the op complete is set and reset in the fpga. Since the PC pushes a pseudo DSW with a certain bit turned on to signal op complete, that bit stays on until the PC pushes a second DSW with the bit off. During that uncertain interval, if the code on the 1130 gets the interrupt and resets the condition with XIO Sense Device, the fpga will go right back into the interrupt because the op complete condition is still turned on.

My solution is to have a flipflop for setting operation complete, which is triggered when the PC pushes the pseudo DSW and the flipflop gets immediately reset during the XIO Sense DSW. I ignore the DSW sent by the PC other than to flip on the operation complete flipflop, thus the PC does not have to push a second DSW. This resolves the timing hazard.
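The fix can be modeled in a few lines of Python (a toy model of the VHDL, not the VHDL itself): the pushed DSW bit only *sets* the flipflop on its rising edge, so the Sense DSW reset sticks even while the pushed bit remains high.

```python
class OpCompleteFF:
    def __init__(self):
        self.ff = False        # the operation complete flipflop
        self.prev = False      # last sampled value of the pushed DSW bit
    def clock(self, pushed_bit):
        if pushed_bit and not self.prev:   # rising edge sets the flipflop
            self.ff = True
        self.prev = pushed_bit
    def sense_with_reset(self):
        was_set, self.ff = self.ff, False  # XIO Sense DSW clears it
        return was_set
    def interrupt_requested(self):
        return self.ff
```

With a level-sensitive version, the second `clock(True)` after a reset would immediately re-request the interrupt; edge-triggering is what closes the race.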

Monday, January 4, 2016

Seller shipped the wrong sized paper for the console printer

Work was quite heavy and I didn't get any time with the system.


The box of paper I ordered came. Well, a box of paper came. Rather than the 13 7/8 paper that I need for the console printer, they shipped 14 7/8 paper, which is useless to me; I already have plenty of that size.

Sunday, January 3, 2016

Flaw in 2310 function organization between PC and FPGA, changed and now under testing


The division of labor between the Python code in the PC and the VHDL logic in the FPGA may leave me open to problems such as those I am experiencing while debugging the virtual 2310 disk drive function. There is no way to 'hold off' the 1130 processor from executing instructions, so I need the PC to be fast enough to handle events.

The core of the partnership between PC and FPGA is that the PC is the master of the link, sending a series of transactions from PC down to the hardware box and receiving an immediate answer. These include both action transactions, such as turn on interrupt request, and polling transactions, such as return the last XIO function code issued for this device.

The vulnerability is that a program might issue a second XIO instruction before I have successfully retrieved the function code from the first, thus losing any recognition of the earlier XIO. At this point, nothing will stop the 1130 from chewing through instructions including XIO whether or not I have retrieved the prior instruction's values.

If all the logic for the device adapter were pulled into the FPGA, I could guarantee that I handle the XIO before a subsequent one can be issued. The current split between Python and VHDL is a natural one, with more sequential and programmatic actions implemented in the programming language while more parallel and signal specific actions are handled in hardware (VHDL).

In the case of the 2310 disk drive, I am polling for the XIO Sense commands from the PC. These are used to provide the current sector rotating under the heads, the status of the device and operations that were begun on it, but also to set and reset conditions that trigger interrupts. Once the interrupt condition is set down in the FPGA, the 1130 will begin executing code that will very quickly get to an XIO Sense instruction or two. If that sense instruction is followed soon after by a different XIO, such as a read, write or seek, I might not see the Sense function code at all.

Not all of the work of the virtual drive can be handled in the FPGA - reading the PC file, translating its data from x86 format bytes to 1130 words, and similar activities are still best handled up in the PC. However, I am reaching the conclusion that the balance of effort might need to be tilted much more towards the FPGA.

I will rethink the disk drive design such that everything to do with status, interrupts, and XIO Sense is handled locally. The PC will feed data to the FPGA or accept data on a write, otherwise it will not be involved in the processing.

My current implementation has a PC thread that mimics the rotation of the disk platter, so that XIO Sense DSW will pick up the current rotary position that simulates a real disk rotating at 1500 RPM. This is easy to move to FPGA, but was only done on the PC because control of the DSW - device status word - retrieved by XIO Sense DSW was accomplished on the PC. This is a poor choice. Similarly, the PC was simulating the time it takes an XIO seek command to move the disk arm and settle, but I can model that better in FPGA.
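The rotational model is simple enough to express in a few lines, shown here in Python for illustration even though it now moves into VHDL; the four sector marks per revolution are my assumption.

```python
REV_MS = 60_000 / 1500      # 1500 RPM -> 40 ms per revolution
SECTORS_PER_REV = 4         # assumed sector marks per revolution

def current_sector(elapsed_ms):
    # which sector is passing under the heads at this instant
    return int((elapsed_ms % REV_MS) // (REV_MS / SECTORS_PER_REV))
```

In the FPGA the same thing is just a free-running counter that wraps every 40 ms, with the sector number taken from its high bits.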

I will need interlocking between the PC and FPGA sides of the adapter, but it will be reduced to picking up the XIO read or write by polling, and then polling for the rest of the data it needs to do the IO including the current cylinder (arm position) and other information from the XIO. It will never see the Sense DSW or be involved in interrupts.

The read or write operation requested will not complete until the PC side sends a 'done' message. The 1130 can continue to issue XIO Sense commands but won't enter the interrupt routine until the PC has finished fetching or storing the disk data using the cycle steal hardware.

This reallocation of responsibilities will be even more important for timing-critical devices like the 1132 printer. Here, interrupts are generated based on the spinning of a timing wheel and the processor has to respond by issuing XIO read, then setting up a fixed set of memory words with a bit pattern to cause specific hammers to fire. The right division with interlocking will make this bulletproof, while the wrong allocation is sure to fail.

By noon I had my modeling of the platter rotation and of seek timing completed and had moved on to restructuring how XIO Sense and interrupts are handled. Following that, I coded up the process to turn operation complete on and off. The next logic handled a seek completely inside the FPGA.

I still needed to track the current cylinder number, as it will be requested by the PC program when a read or write is attempted. That plus the interlock for when a read or write was completed by the PC will be the last changes needed in the FPGA. All this subject to testing, debugging and correction, of course.
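The cylinder tracking itself is a small piece of arithmetic; a sketch follows, where the direction-bit convention and the 203-cylinder limit of a 2310 pack are my assumptions.

```python
MAX_CYL = 202   # a 2310 pack has 203 cylinders, numbered 0-202

def apply_seek(cylinder, amount, toward_zero):
    # the XIO Control word carries a seek amount and a direction bit
    new = cylinder - amount if toward_zero else cylinder + amount
    return max(0, min(MAX_CYL, new))   # clamp the arm at the stops
```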

After lunch I tackled the Python program, stripping out all the seek and DSW logic, concentrating it on its task of doing cycle steal to read or write a sector that was requested by the 1130 via the FPGA. By mid-afternoon, everything was ready for the first round of tests. I expect to go through several iterations of failed test, new instrumentation, new tests, and repairs.

After the first test, reading cylinder 0, I did validate the memory contents of the disk buffer in core. When I then attempted to read sectors from other cylinders, the contents didn't match. Not sure if this is a flaw in my 'seek' logic inside the PC file or a problem internally. I need to instrument the system to see if it has moved the virtual arm to the proper cylinder.

I also saw an anomalous read of one word before the full sector read, which could be the diagnostic program trying to verify the seek was correct by reading the sector number that is word 1 of each sector. I will add some instrumentation but also look over the relevant logic.

My tests weren't conclusive - when I tried to force a seek to sector 0x0123 I remained looping in interrupt level 2, which I think means I didn't reflect the operation complete properly back to the software. Resetting, I went back to a simple read of sector 0, no seeks required. I observed the one word read, then a 'read check' which attempts a read but doesn't transfer the data, followed by a read of the full 321 word sector.

Overnight, I will read over the DCIP utility code to see if it does in fact do a read of one word, a read check and then a read. If it is not doing this, then I am not properly latching, saving and returning the parts of the XIO that I need to.

Saturday, January 2, 2016

Chasing odd issue with virtual 2310 adapter in SAC Interface Box


Continued use of the typewriter while testing the virtual 2310 is loosening up more crud (residual lubricants) and improving type quality slightly. Won't fix everything wrong with the 1053 but it helps.


The flaw I was debugging was an apparent failure to reset the last XIO command code as it was read by the Python program, so that when it next requested the last XIO, it got XIO IR again instead of code 000. As much as I stared at the logic to handle this in the FPGA, I couldn't see how it would fail to do the right thing. The answer was some careful instrumentation to flag whether my process to handle this is stalling somehow.

I also tossed in some special instrumentation in Python to warn me if I do receive a second IR command code without having first received some other value, either 000 or a Sense DSW which is how the op complete interrupt condition is reset.

I do see signs of double delivery of the XIO IR command but the FPGA side looks good. I will look over the Python code to see if there is any way I can get this condition due to an overlooked path in the code. I suppose I may have a race hazard occurring but this happens far too often to be something like that. It is puzzling.

I decided to make several changes to the involved logic in both sides, just in case there was a defect I couldn't see. When I retest tomorrow I will see if things improve.

Friday, January 1, 2016

Debugging virtual 2310 disk adapter and working on real peripherals


The distributor from whom I bought the console printer continuous feed paper is located fairly close to me, I discovered when they shipped the box yesterday. It will arrive Monday even though sent by ground transportation. With a name like Global Industrial and no obvious mention of location on the web site, I had no idea it was in Sacramento, a couple of hours drive from here. I wasn't anticipating the arrival anytime soon.

The DCIP program will print out the disk sector on the 1053, but it expects that tabs are set every six or so positions in order to fit all 16 words on a typewritten line. I tried to set the tabs by pushing my manual space button on the typewriter, but found it tabbing rather than spacing, which is a definite defect. Since spaces work properly when printing under program control, it is a maladjustment in the manual button setup.

The typewriter is getting better at unlatching at the end of a carrier return, and more of the returns are occurring at reasonable speed, but I still have the other problems such as failure to latch the return reliably at the start of an operation.



On the other hand, my resupply of the Bondic material to repair the punch wheel is inexplicably delayed by Fedex. It was due yesterday and was in the hands of the nearby distribution center in Oakland, one hour away, but didn't arrive. The message from Fedex says "appears delayed should arrive no later than Jan 1". Huh? Then the doorbell rang in the morning and Fedex had come through.



I found my logic flaw that caused me to stuff memory incorrectly from a disk sector I read with my virtual 2310 adapter - I had been storing twice as many words as requested, with every other word consisting of incorrect values. It was indeed a defect in how I was handling indexing through a byte by byte array on one side and a 16 bit word array on the other.
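The class of bug is easy to illustrate: if the destination index advances once per *byte* rather than once per byte pair, twice as many words come out, alternating good and garbage values. A minimal correct byte-pair loop (little-endian byte order is my assumption about the file format):

```python
def bytes_to_words(raw):
    words = []
    for i in range(0, len(raw), 2):      # step two bytes per 16-bit word
        words.append(raw[i] | (raw[i + 1] << 8))
    return words
```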

The test on the 1130 worked much better, but there is still an issue. Although I step through the entire sector buffer and convert the values, I found that only the first half of the disk buffer was filled correctly - 160 words - while the second half bore no resemblance to the real disk contents.

The trace of the data I was processing showed that this defect was reflected in the data I converted from 1130 simulator file format to the array of words. That is, somehow the second half of the disk sector buffer is not valid. The problem again stemmed from the two sizes (321 and 641) used for operations. Fixed.

My next test showed that I was indeed reading the sector correctly into storage, then my Python code pushed an operation complete to the FPGA. It does not appear that the FPGA requested the interrupt to handle the op complete, however, so my attention will turn to that logic for a while.

I also attempted to do the same test but entering a known sector that would require the diagnostic program to do a seek before attempting the read. It is likely that the same flaw in interrupts occurred here, because it never tried the read. I will put in suitable instrumentation to watch what is happening in the seek code.

Running with diagnostic messages on, I tried to do a dump of sector 0x0123 which should force a seek before it reads. What I saw was:

  • XIO IR with check (meaning don't transfer data) issued
  • Op complete sent for the IR Check
  • DSW Reset issued, indicating it saw the interrupt
  • XIO IR issued (should have had a seek here)
  • Read of 321 words into memory
  • Op complete sent
  • XIO IR issued again (no DSW reset sent)
  • Read of 321 words again, not the correct data this time
  • Op complete sent for this IR
  • DSW reset issued.

I should not be processing the XIO IR a second time. My logic that fetches the last XIO command should zero out that field such that I get back 000 and ignore anything until the next real XIO occurs. Instead my transaction is given a second XIO IR in a row without the system having reset the first op complete.
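The intended read-and-clear behavior is simple to state, modeled here in Python even though the FPGA does it in VHDL: returning the stored code must also zero the field, so a second poll sees 000 until the next real XIO latches a new value.

```python
class LastXIO:
    def __init__(self):
        self.code = 0
    def latch(self, code):
        self.code = code                  # driven by a real XIO from the 1130
    def fetch(self):
        code, self.code = self.code, 0    # returning the code zeroes the field
        return code
```

Seeing the same nonzero code on two consecutive fetches, as the trace shows, means either the clear isn't happening or a second latch really did occur in between.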

While it is dimly possible that the diagnostic is issuing XIO IR from within the interrupt handler and is aware it hasn't responded to the previous Op Complete, I don't understand how it would check for the second read to have completed in this situation. Therefore, I believe this is an artifact of my logic in the FPGA or in my Python code in the PC. The answer is more instrumentation.