Saturday, July 23, 2016

Prep for VCF West, testing, boot up and running DMS2 from virtual 2310, virtual 1442 and real 1132 peripherals, bad memory card

DIGITAL GAME MUSEUM TIME

I am a board member of this museum and visited today to talk with another board member and restoration staff. Recently the interior was reorganized to make the collection more space efficient, using rolling shelving, and to establish a playable game space in the back. It looks great and should make for a more enjoyable time for visitors to the museum. I was back by mid-afternoon and working on my 1130 systems.

1053 CONSOLE PRINTER RESTORATION

I ran the keyboard/typewriter diagnostics a few times, which helps loosen up the machine and make it type more accurately. Already the carrier return is more dependable, although I still have erratic line advancement due to goopy adhesive in the mechanism. Typing itself looks pretty good now.

SAC INTERFACE FOR ADDING PERIPHERALS TO THE 1130

Restructuring the GUI

-- mirror 1053 console printer and 1131 keyboard --

First up was the fpga engineered with appropriate diagnostic LEDs for the state of the mirror 1053 driver. It exhibited the special USB bit rot, dropping bit 6 this time, rather than bit 4. I made some other changes to set up arming and disarming of the 1627, 1134 and 1055 real devices from the GUI. Unarmed devices can't trigger interrupts.

The changed code went through - no spurious interrupts and no dropped USB bits. My test, however, uncovered a flaw in my GUI code in the sequence in which I look to see if I should arm or disarm the device. I moved the code up to the right place. I also found a few places where the state machines for the mirror driver would take action, yet they should be inert if the driver isn't armed.

My next test resulted in a hung transaction engine, perhaps when I was firing off the arm and disarm codes. I needed to look over this part of the fpga carefully, then spotted that I wasn't ending the write special data transaction.

I fixed the problem and saw that the device armed properly. However, it didn't disarm when it should have. Further, when I tried to load core I had the bad bit problem over the USB so had to rebuild to clear up the spurious error. I did spot the problem with the disarming, which is only in the GUI so I have to make some change to the fpga just to force a good synthesis.

The mirror device worked pretty well. It missed about one character in 30 when the program was typing at full speed. I need to look for a flaw that will lose a character sporadically. Otherwise it looks good, except for an error check issue on the arm and disarm push transactions.

-- virtual 1403 printer --

The next device I debugged was the virtual line printer. It will be helpful if this is working for the exhibition, to give attendees a PC file with the printout of whatever they run on the real 1130. Also, it will free the 1130 from the heavy CPU load required to drive the 1132 printer.

I set up a print line and test program to print a line and space, over and over. It appears to be going to the next page, not skipping a single line. There is also a small flaw with the error check on DSW pushes. I will set up 1403-oriented diagnostics.

PREPARING FOR VCF WEST EXHIBITION AUG 6-7

I had listed work to do on both the real and replica 1130 systems in yesterday's post. Today I worked on the cable holder for the memory ribbon cables and replacing the front plate in the kickspace below the keyboard. The cable holder had been glued to the underside of the typewriter pan, above the logic gates, but had come loose. The pan is a very glossy and slick enamel, not very suitable for adhesion.

Instead, I cleaned, roughed up and prepared the inside of the kickplate then epoxied the holder there. The memory cables will hook into the holder there and be out of the way. Once it was very solid, about an hour after gluing, it was ready for the kickplate to be reattached. This plate goes in the footwell under the keyboard and typewriter, forming the visible cover when someone looks in from the front.

I couldn't find three of the four original bolts that held the kickplate on, so it was off to a hardware store to get what I needed. While I was there, I bought a 4 x 8 sheet and had it cut into the five covers that will go on my 1130 replica. These will be painted - four in medium-dark gray pebble finish and one in either garnet rose or classic blue, whichever of those colors looks more authentic.

The painting will take place tomorrow, first a sand texture coat and then the color coats. I will leave the backsides plain in order to glue mounting brackets onto them. I have a notion for how to mount the panels on my metal frame.

The kickplate is in place and the memory ribbon cables securely engaged in the holder that I epoxied to the plate. I used a couple of rubber bands to keep the cables from slipping out of the holder. The plate is solidly in place and cables routed properly.

The remaining two tasks on the cosmetics list are to secure the 1053 power connector to the SMS connection panel deep inside the 1130, and to repair the mounting screw for the power connector from the 1132 printer.  The SMS connector panel is now buttoned up nicely, so all that remains is to figure out what parts I need for the female power connection on the 1131, so that the male power connector from the 1132 will mate with it.

I powered up the 1132 in order to check out its operation, booting DMS2 from a virtual disk cartridge and running jobs from the virtual 1442. It booted fine, printed the welcome banner on the 1132 and I fed in a deck of cards with a job to compile some fortran code. The machine was in the midst of the fortran compilation when it stopped with a parity error. Up until that point, all was going well.

running a job from virtual 1442 on system booted from virtual 2310

This could be an issue I have to deal with concerning my virtual devices, although more likely this was residual junk in high storage and a flaw in some software, since the address being referenced was up above the 8K line. The DMS2 disk pack for the 1130 simulator was built for a 16K machine, which should mean no addresses beyond 0x3fff will be generated. I will clear memory so that it writes correct parity everywhere and try this again later.

To check out the memory, I made use of the CE Switches built into the 1131, specifically the Storage Load and the Storage Display switches. These will cycle continuously through all memory addresses, writing the console bit switch values into memory, or cycle continuously reading memory. Even when good data is supposedly written into memory, the Storage Display runs into a parity error when it reaches a memory address in the second 8K.

So this is good - I don't have a flaw in my SAC Interface or virtual devices that causes parity errors - but bad - because there is a hardware problem I need to fix. As a safety measure, I will generate a virtual DMS2 system with 8K of memory that can run at the show without touching the upper 8K. I will make sure that this 8K system boots up and runs several jobs to completion on my system.

The 8K system boots, confirms it is 8K, but something branches up to 0x3xxx and hits the parity error. At first I thought it was a problem with the Fortran compiler, so I set up a deck to execute DUP but that had the same failure. This needs investigation. See the section on core memory failures below.

I have a few advertisements for the 1130 and some textbooks, which I will bring to add to the flavor of the exhibit. I need to figure out how to display them, secure and protect them, and keep track of them during the show.

1131 CORE MEMORY FAILURES

I walked through memory addresses and determined that the parity errors happen for specific ranges of addresses in the second memory compartment (addresses 0x2000 to 0x3fff). Any address in that compartment whose address bits 9 to 11 are 0b011 will fail: 203x or 20Bx, 283x or 28Bx, 303x or 30Bx, and 383x or 38Bx. I have to find a card related to decoding 011 for drive lines and figure out what is wrong with it.
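As a rough aid for checking which addresses fall under this rule, here is a minimal Python sketch. It assumes IBM bit numbering (bit 0 is the most significant bit of the 16-bit address), which is consistent with the ranges listed above. The function names are hypothetical, and the sketch encodes only the bits 9 to 11 rule as stated, not any further restriction on the other address bits.

```python
# Sketch only: flag addresses matching the observed failure rule.
# Assumption: IBM bit numbering, bit 0 = MSB of the 16-bit address.

SECOND_COMPARTMENT = range(0x2000, 0x4000)  # addresses 0x2000 to 0x3FFF


def ibm_bits(value, first, last, width=16):
    """Return bits first..last of value, counting bit 0 as the MSB."""
    shift = width - 1 - last
    mask = (1 << (last - first + 1)) - 1
    return (value >> shift) & mask


def matches_failure_pattern(addr):
    """True if addr is in the second compartment and bits 9-11 are 0b011."""
    return addr in SECOND_COMPARTMENT and ibm_bits(addr, 9, 11) == 0b011


if __name__ == "__main__":
    for addr in (0x2030, 0x20B7, 0x383F, 0x2040, 0x1030):
        verdict = "suspect" if matches_failure_pattern(addr) else "ok"
        print(f"{addr:04X}: {verdict}")
```

A helper like this could be used to pick test addresses on either side of the pattern when walking memory from the console.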

Friday, July 22, 2016

Virtual 2310 disk drive working, plus preparation for VCF West exhibition of real and replica 1130 systems

SAC INTERFACE FOR ADDING PERIPHERALS TO THE 1130

Restructuring the GUI

-- virtual 2310 disk drive --

I had discovered a problem with the way I was modeling the busy bit for the DSW. It led to an error stop in the DMS Cold Start card program. I worked out a more faithful method and implemented it, ready for testing this morning. The virtual device will go busy as soon as it sees a seek, initiate read or initiate write command and will switch off when the operation complete status is triggered.
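The busy rule described here can be sketched as a tiny behavioral model. This is Python for illustration only, not the fpga logic itself; the class name and command strings are hypothetical.

```python
# Behavioral sketch of the revised busy bit, not the actual fpga code.
# The class name and command strings are made up for illustration.

class Virtual2310Busy:
    # Commands that should raise busy as soon as they are seen.
    BUSY_COMMANDS = {"seek", "initiate_read", "initiate_write"}

    def __init__(self):
        self.busy = False

    def xio(self, command):
        """Busy turns on when a seek or initiate read/write arrives."""
        if command in self.BUSY_COMMANDS:
            self.busy = True

    def operation_complete(self):
        """Busy turns off only when operation complete status is raised."""
        self.busy = False


if __name__ == "__main__":
    drive = Virtual2310Busy()
    drive.xio("initiate_read")
    print("busy after initiate read:", drive.busy)
    drive.operation_complete()
    print("busy after operation complete:", drive.busy)
```

The point of the model is that busy is tied to the life of the operation rather than to the XIO instruction that started it, so a sense issued shortly after the command still sees the drive as busy.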

I had also switched the diagnostic signals on the LEDs to the most useful ones related to the virtual 2310. On my first attempt I hit the sporadic bug where Vivado leaves me with bit 4 of the USB not working; a new pass of the same logic through the toolchain should clear it up (wasting 15 minutes). Alas, same problem, so I made a trivial change and ran it again. Still bad. I made a more substantive change and tried that.

Finally I was free of the phantom missing bit 4 problem and could resume testing. I made it quite a bit further past the earlier point in the boot card sequence, with the system stopping at location 0x01FE which I think is somewhere in the boot sequence. As well as this problem, I noticed that if I attempt an XIO to the disk drive when it is not ready, it still kicks off the state machines or tries to perform the action. I need to make it obey the not ready condition more fully.

With those changes in place, I went out in the afternoon for more testing. First, I wanted to verify that it was inert as far as XIO when the device is not ready. Then, I wanted to investigate the behavior of the boot sequence more fully.

I had closed up the SAC Interface box, in order to have it ready for transport and exhibition. I don't know if I dislodged some wiring or whether this is a random Vivado corruption of the bitstream, but when I turn on the 1130 the SAC box is commanding interrupts on IL3 and IL4. I attempted a change to the logic and a new bitstream generation just to rule this out. If it is not the toolchain, then I have a hardware problem in my interface box.

Another idea came to mind. I had a high speed link between an auxiliary fpga and the main one, in order to handle physical peripherals such as the 1627 plotter and 1134/1055 paper tape devices. I removed it, taped it over and put it inside the SAC unit.

However, if spurious signals on the line are interpreted as device signals, they could be generating interrupts. The devices just mentioned happen to interrupt on IL3 and IL4, which does match the symptoms. I blocked the interrupt requests for those devices and resynthesized.

I still have hot requests for interrupt levels 3 and 4. I decided to work on eliminating the devices in swathes until the IL level is gone, then reintroduce until I figure out what device is giving me the problem. At least the testing is quick - power up the fpga and the LED goes on right away if the problem persists.

With only eight of the 20 devices left, I still had a hot IL4 signal. I moved to the extreme, removing all of them so that the line is held low. Now I have seen everything. I have a signal, IntReqLvl4, which is set to '0' yet the diagnostic LED shows it '1'. This makes no sense. I am spreading the signals to other LEDs as well, for no better reason than to insist on corroboration.

With the signals spread over the LEDs, I could see that only the one I 'thought' was IL4 is still lit. I powered up the 1131 to verify and indeed there are no interrupt requests presented. I then went back and reset all the twenty devices back into the IL4 chain and resynthesized.

Every device back in, yet no interrupts being requested. It was one of two causes - either the high speed link issue for plotter and paper tape was bad but I fixed it earlier, or the toolchain is at it again. At this point, as long as it works properly I will get back to testing the disk after four hours of frittered time.

Something is not quite right yet, although it may have worked. I saw the boot program read from disk into 0x00d0, then read the next sector into 0x0122, and finally read sector 2 into 0x0004. This is when the loop sat at 0x01F4, possibly waiting to print to the 1132 line printer. I should do this test with the printer powered up and ready, just to see what happens.

I am not entirely sure if the disk image was set up for a 16K machine. If configured for a different memory size, it may not work properly even if my disk emulation is perfect.

I took the opportunity to load the keyboard/console diagnostic, which worked well. The 1053 types great, although its carrier return is still defective. It needs a helping hand to move it back to the left. The keyboard worked right, also.

I need to do some investigation of the boot sequence and of the disk image whose PC file I am using, to determine whether things are working okay or not. I may have some timing related problems in the virtual disk, particularly if it is too quick on the read.

The sectors being read are the correct ones for a boot. I need to do more testing of my disk device to ensure it is really working properly. After closely examining my GUI code, I think I see a flaw that might have caused some of the problems. It is changed and I am back out to test.

Success! DCIP began typing out the sector I chose on the console printer. It looked really nice, so time to bring up a test boot of the DMS disk once again. This time, it flew through quite a few fetches until the full DMS2 was up and waiting on device 6 - the 1132 printer - to become ready so that the DMS banner could be printed.

-- mirror 1053 console printer --

The next device to work on is the mirror 1053 function, which should capture a replica image of what is typing on the console, placing the contents in a PC file. This uses the FIFO, the remaining Xilinx IP unit that is giving me the obscure 'black box' error messages. I will begin debugging this tomorrow, while I work on the remaining cosmetic and testing items for the real 1130. When that is done, I will move on to the 1130 replica and its readying for the exhibition.

PREPARING FOR VCF WEST EXHIBITION

I have a few tasks to prepare the 1131 - the main unit of an 1130 system, consisting of the central processor, console printer, keyboard, internal disk drive, display pedestal and other controls:
  • Attach cable holder and route the cables from the processor gates (A and B) to the 'blister', the expansion frame to the left of the keyboard where the memory sits in larger configuration machines. 
  • Refasten the clamp holding the 1053 console printer power cable to the power connection point inside the 1131
  • Replace the front facing machine cover that sits under the desk
  • Work on replacement threaded screw for 1132 power cable where it attaches to the rear of the 1132

My replica 1130 needs several tasks to make it ready for exhibition:
  • hardboard panels cut and painted to fit on frame in place of metal covers
  • prime and paint metal top of replica to pebble gray IBM color and texture
  • reload fpga and reconnect cabling to pedestal, keyboard, other controls
  • shorten pedestal stand and anchor in place for show
  • decide on appropriate printer mechanism for show

The replica consists of a life size welded frame shaped like a small memory 1131 (no blister), a display pedestal, the formica desk top, keyboard and other controls. I have a roughly bent sheet metal top that fits on the frame and under the formica white desk slab.

Link to video of 1130 replica running (viewing display pedestal)

Keyboard and control panel before mounting on frame

The pedestal with the blinking lights and rotary control is temporarily mounted on wooden stands which need changing to set the unit firmly at the right height and position. I have an IBM Memory typewriter sitting on the metal to serve as a 1053 console printer, but I don't have the cover manufactured for it yet. Similarly, I have the console bit switches on a mounting plate but not the final plate that sits in front of the 1053.

Today I tried to buy the hardboard panels and get them cut to serve as the 1131 covers. The first Home Depot had the hardboard sheet but their panel saw was out of service. The second Home Depot a few miles away also had the hardboard sheet and also had an out of service panel saw. The Lowes a few miles further had a working panel saw, but zero hardboard panels in stock. At that point, I gave up for the day and went back to testing on the 1130.

I closed up my SAC Interface box for transport and use over the next few weeks. I have the temporary diagnostic LED outputs emerging from the front of the case along with the USB cable although eventually I intend to install a more secure USB connector and the LEDs will be replaced with links to various FPGA boards such as disk drive controllers and I/O fanout units. 

Thursday, July 21, 2016

virtual 1442 reader and punch working properly, moving on to other testing and work

SAC INTERFACE FOR ADDING PERIPHERALS TO THE 1130

Restructuring the GUI

-- virtual 1442 --

Eureka! I have the copyover (mostly) working now and the off-by-one problem is gone. Unfortunately, I have regressed to the problem where the first column of the pre-read buffer card is copied to every tenth position of the pre-punch buffer, replacing what belongs there. This was an error in my copyover logic that has returned now that I rewrote it completely.

I also have a minor error with the write special data transaction where I wasn't returning word 2 as it had been sent, but that was easily rectified by a snippet of code. Otherwise, this transaction has been working properly for some time.

Once I clean up the every-ten-overwrite problem, I can start debugging punching operations. I need to test with a blank deck of cards (input hopper file that is all blanks), punching a full card and then a short one. The 1442 will check each column being transferred for punching and when bit 12 is on, it ends the punch operation.

If blank cards work fine, the next test is loading a file with data already in the cards. The result of punching should be a merge or row-wise OR of both the original card and the new punched pattern. This will be checked for both full card and short card punching.

The tenth position error is the first word of each get special data transaction, which repeats the value in column 1 rather than starting over with the value in column 10, 20, etc. It is caused by a lack of adequate time for the pre-punch buffer address to set up from the value delivered in the second word of the incoming transaction, so that it still uses the default value, which is column 1.

I added another cycle between the startup and when I latch up values, which should give the buffer address time to settle on the actual start cycle. My next testing opportunity in the late morning showed that it was now working just as planned. Whether reading or feeding, the card images were faithfully copied over and then written to the stacker file on the PC. If reading, the columns were faithfully stored in memory.

Onward to the last items to test - punching cards and the stacker select function. My first try at punching cards did not produce any results. My guess is that the pemitter process didn't work properly, hanging up and keeping the main device process hung as well. I instrumented the LEDs to check various status signals and ran a test after lunch.

It confirmed what I suspected - the main device state machine got to the point where it triggered the punch emitter process, but that process hung. I did see that a punch operation was correctly recognized. Looking closely over the punch emitter process, I found two problems. Both were related to the timing that emulated the delay of a 1442 punch.

I had set up a counter to model the time of the card movement during a punch, but initially the times I used were the same as for reading. Punching is considerably slower than reading, however, so I eventually calculated the correct counts for the timers. What I did not do was increase the size of the counter.

The counter would run from 0 to just over 1 million, which could model a 20+ millisecond delay. In fact, punching incurs an initial delay of 40ms, so I changed the test in the state machine to 2000000 yet the counter only went up to 1000000. It wrapped and cycled in an infinite loop.

The secondary bug was that I didn't reset the timer back to zero before modeling the 6.25ms per column delay, so it would have run from 2000000 upwards. I didn't reach this point because of the earlier bug and infinite loop.

I changed the timer to run up over 2 million, to cover the initial 40ms delay, and made sure to reset the timer before each column delay and before the final delay for card movement out of the punch station. With this changed, I went back to testing on the 1130.
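The arithmetic behind this bug can be checked with a short Python sketch. The 50 MHz clock is an assumption inferred from the figures above (2,000,000 counts for 40 ms), and the 20-bit counter width is likewise inferred from "just over 1 million"; neither is stated outright. Function names are hypothetical.

```python
# Sketch of the timer sizing arithmetic. Assumptions (inferred, not stated):
# a 50 MHz counter clock and an original counter width of 20 bits.

CLOCK_HZ = 50_000_000


def cycles_for(delay_us, clock_hz=CLOCK_HZ):
    """Clock cycles needed to model a delay given in microseconds."""
    return delay_us * clock_hz // 1_000_000


def bits_needed(max_count):
    """Minimum counter width able to hold max_count."""
    return max_count.bit_length()


def reaches(target, counter_bits):
    """True if a free-running counter of this width can ever equal target."""
    return target < (1 << counter_bits)


if __name__ == "__main__":
    initial = cycles_for(40_000)    # 40 ms initial punch delay
    per_column = cycles_for(6_250)  # 6.25 ms per column
    print("initial delay cycles:", initial)
    print("per-column cycles:", per_column)
    print("bits needed for initial delay:", bits_needed(initial))
    print("20-bit counter reaches target:", reaches(initial, 20))
```

Under those assumptions the comparison against 2000000 can never succeed on a 20-bit counter, which matches the infinite loop seen in testing; one more bit is enough to cover the 40 ms delay.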

I got punching over the blank card images. My only problem is that the results were off by one, which means I have to focus on my logic for the XIO Write and pre-punch buffer accessing. This is very encouraging, as I realize that I am pretty close to done.

I ran a quick test with bit 12 set on in the 10th column, just to verify that it will punch a short card.  It did, off by one of course but otherwise correctly stopping and leaving the remaining 69 card columns blank. All that is left to verify is:

  • get it aligned to the right columns
  • verify the punch merge function
  • verify the stacker select function

Punch merge is a realism behavior I added which will combine the holes already existing in a card coming from the pre-read station with the rows that are punched during the punch operation. If a card deck were put in the hopper of a 1442 that had row 12 punched in all 80 columns, and the program were to punch row 3 in all columns, the resulting physical card would be a string of C characters (rows 12 and 3 punched in each column).
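The merge itself is just a column-by-column OR. Here is a minimal Python sketch of the example above. It assumes each card column is held as a 16-bit word with rows 12, 11, 0, 1 ... 9 in the twelve high-order bits, in that order; the helper names are hypothetical and this is not the GUI code.

```python
# Sketch of punch merge as a column-wise OR of two card images.
# Assumption: rows 12, 11, 0, 1..9 occupy the 12 high-order bits of a word.

ROWS = [12, 11, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ROW_MASK = {row: 0x8000 >> i for i, row in enumerate(ROWS)}


def column(*rows):
    """Build one card column word with the given rows punched."""
    word = 0
    for row in rows:
        word |= ROW_MASK[row]
    return word


def punch_merge(existing, punched):
    """OR the holes already in the card with the newly punched holes."""
    return [old | new for old, new in zip(existing, punched)]


if __name__ == "__main__":
    card_in = [column(12)] * 80  # hopper card: row 12 in every column
    punch = [column(3)] * 80     # program punches row 3 in every column
    merged = punch_merge(card_in, punch)
    print(f"column 1 after merge: {merged[0]:04X}")
```

Because holes can only be added to a card, never removed, OR is the natural model: punching over a blank card simply reproduces the punched pattern.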

Stacker Select is a function that a program can issue (XIO Control with bit 8 set) that will arm the stacker mechanism so that the next card coming out will be ejected into the alternate stacker, rather than the primary stacker. I open twin output files, one for the primary stacker and another for the alternate, so that if the stacker select function was issued, I will write the card image to the alternate instead of the primary file.

Card movement in 1442

The Stacker Select is not completely accurately modeled. It should only take place during the time that a read, feed or punch is active and before the card would have reached the stacker pathway. It is reset by the completion of the feed, read or punch cycle. My implementation, however, will arm the code to put the next card ejected by punch, feed or read into the alternate stacker, no matter how long the time from select issuance until the actual card movement occurs.
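The simplified behavior described here, as opposed to the real 1442 timing window, can be sketched in a few lines of Python. The class and method names are hypothetical, two lists stand in for the primary and alternate stacker files on the PC, and disarming after a single card is an assumption drawn from the phrase "the next card".

```python
# Sketch of the simplified stacker select model, not the real 1442 timing.
# Lists stand in for the two stacker files; names are hypothetical.

class StackerModel:
    def __init__(self):
        self.select_armed = False
        self.primary = []
        self.alternate = []

    def stacker_select(self):
        """Arm the select; it stays armed however long the next card takes."""
        self.select_armed = True

    def eject(self, card):
        """Route a card leaving a read, feed or punch cycle to a stacker."""
        if self.select_armed:
            self.alternate.append(card)
            self.select_armed = False  # assumption: applies to one card only
        else:
            self.primary.append(card)


if __name__ == "__main__":
    stackers = StackerModel()
    stackers.eject("card 1")
    stackers.stacker_select()
    stackers.eject("card 2")
    stackers.eject("card 3")
    print("primary:", stackers.primary)
    print("alternate:", stackers.alternate)
```

A more faithful model would also clear the armed flag at the end of the cycle in which the select was issued, which this sketch deliberately leaves out.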

The reason for my off by one problem is that I was triggering the write into the pre-punch buffer from within the punch emitter loop, which only determines when we set off IL0. It is the subsequent issuance of an XIO Write instruction by the software in the 1130 that determines the column contents, so I will trigger the write of the buffer as soon as I grab the data value from the E3 cycle of the XIO Write instruction.

Problem solved. Punching reproduces the core contents exactly into the cards and short punching works correctly also. Time to check over the logic for punch merge and for stacker select, then do the final testing. Everything looks ready for punch merge, but it needs testing.

I discovered that my logic in the GUI that checks which type of XIO Control was issued was not working properly; apparently I am not sending the proper data up from the FPGA. Inspecting the code makes it appear that I should see the proper modifier from the XIO Control, but the results differ in real life.

My late afternoon testing was to test the punch merge operation, which would take a hopper deck with known contents and a prepared punch area to see if it merged properly. In addition, I tried out the stacker select operation.

Everything worked perfectly. When I open a file for the input hopper that already has holes punched in the cards, they are properly merged with the holes being punched by the program. Stacker selection causes the next card to be fed, read or punched to go into the alternate stacker file on the PC, rather than the primary 'stacker'.

This wraps up the virtual 1442 capability. Later, after I get my system back home, I will work on the mirror 1442 capability, which will capture cards as they are read and/or punched on a real 1442, storing them in a PC file. I don't need this, but it will be quite useful for other 1130 operators such as the National Museum of Computing in the UK.

-- virtual 2310 --

I attempted to test a boot of the virtual 2310 disk drive, using a cold start card booted from the virtual 1442. When the cold start code began, it stopped with the keyboard select lamp illuminated and an interrupt on IL4, which is definitely not what should be happening. I created a memory load of the boot card, just for convenience, as I chase down this flaw.

One possibility was that I have a defect in my mirror device logic causing it to interfere with the real device operation. The boot card reads from the console switches to decide which drive is being booted, so if this is inducing an error while the XIO to the console switches takes place, it might explain the symptoms.

I decided to hand step the boot card logic to see when the problem arose. The logic of the boot loader is to check that the busy bit went on for the drive, otherwise go to the invalid boot device wait at 001E. Single stepping will cause this to happen, even if the disk is valid, because the XIO Sense happens after the I/O is complete.

I have to verify that my logic is setting busy correctly and holding it long enough to be seen by the boot card program. Inspection of the program shows that it is not accurate as far as a disk read or write is concerned. I will display busy during a seek, but when the operation is a read or write, I shut off busy as soon as the XIO IR/IW completes, not when the disk is done. Will have to modify the design a bit. This is what I will be testing starting tomorrow.

CLEAN UP MACHINE COVERS IN PREPARATION FOR MOVE AND EXHIBITION

I replaced all the internal cosmetic covers in the 1442 - these are plastic plates that cover various parts of the machinery. I then replaced the front skin that had been removed while I worked on the machine. Still to do for the physical 1442 - fix up the status lamps and adjust the card feed into the stackers.

The lamps suffer from the same age degradation where the leads on the lamps snap off with the slightest movement, making it nearly impossible to get a full set of working lamps back behind the status panel. I left them loose for the show, since they are hidden inside, and will work on this when the equipment is back home.

The physical cards move from hopper to pre-read, through the read station to pre-punch, through the punch to the cornering station, then out into the stacker mechanism. The cards make a right angle turn in their motion at the cornering station. Cards are not reliably turning and moving through the stackers, which is a matter of adjustments of a few parts at the station and stacker. This is something I want to adjust before the show. 

Wednesday, July 20, 2016

Still fighting 1442 copyover plus replacement 1403 printer chain created

1401 RESTORATION TEAM WORK

Wednesdays are the day I visit the CHM to meet with the rest of the team and repair anything that is not working on the two 1401 systems. We had two problems related to one of the 1403 printers - a driver card had a blown fuse, and one of the print solenoid coils that fires a hammer partially shorted and burned.

I repaired the driver card, removing the blown component and soldering in one of the new-old-stock fuses we have. Other members cut off the burned coil, soldered in a new-old-stock replacement and put the printer back together.

In addition, we received a 1403 type chain, manufactured by one of the team at the museum in Binghamton, NY. We will install this in a train case and try it out on one of our printers, as a test of the process he used. If it works, it will allow us to make multiple print chains for use on our two 1403 printers and the 1403 printer at the Binghamton facility.

We have been worrying about the fragility of the chains given the complexity of making replacement parts, so this is an exciting potential source to allow us to keep these systems running indefinitely. Once the chain passes tests, we will send a couple of people from the CHM restoration team out to NY to learn how to make more.


SAC INTERFACE FOR ADDING PERIPHERALS TO THE 1130

Restructuring the GUI

-- virtual 1442 card reader and punch --

I tested the feed operation, which will advance a card without emitting IL0 interrupts or causing reads to take place. My logic worked properly for this as well, so that both card reading and card feeding are solid.

I still have a problem with the copyover. The data I see coming out of the pre-punch memory for card column 2 is the value that was in column 1 in the pre-read memory. They use the exact same address and I see the correct value on the input lines to the memory when it is written a cycle or two before I sample it. Inexplicable.

At times like this, when I can't find any logical reason for the flaw, I try rewriting the section of logic, often with a slightly different approach, hoping to eliminate whatever obscure cause was generating my problems. I decided to change the way I locked in the buffer memory addresses, even though the exact same string is used for both memories.

I was setting up the address with a mux, selecting one leg when the copyover process has moved away from idle state. The leg has  std_logic_vector(to_unsigned((copyidx+1),7)) as the source of the address. This takes the integer counter copyidx, adds 1 to it, then converts it from an integer to a SLV with one signal per binary digit comprising the integer value. This was implemented at a time when I had a special use for buffer address 0, but it is no longer relevant.

It should take the value of copyidx, which is stable for several clocks around the time it matters, run it through an adder with a constant of one as the other operand, then route it as the memory address. I don't know how it could be getting this wrong, but I will change all my logic so that I use the index copyidx as it is, ranging from 0 to 79, eliminating the need for an adder in the address mux.
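To make the before and after addressing concrete, here is a small Python sketch of what the conversion produces. The helper is only a stand-in for the VHDL std_logic_vector(to_unsigned(n, 7)) expression, rendered as a bit string; the function names are hypothetical, and unlike the VHDL function this helper raises an error when the value does not fit.

```python
# Sketch: Python stand-in for std_logic_vector(to_unsigned(n, 7)).
# Function names are hypothetical; out-of-range values raise an error here.

def to_unsigned_bits(value, width):
    """Return value as a fixed-width binary string, MSB first."""
    if not 0 <= value < (1 << width):
        raise ValueError(f"{value} does not fit in {width} bits")
    return format(value, f"0{width}b")


def old_address(copyidx):
    """Original scheme: an adder in the address mux, addresses 1 to 80."""
    return to_unsigned_bits(copyidx + 1, 7)


def new_address(copyidx):
    """Revised scheme: copyidx used as is, addresses 0 to 79."""
    return to_unsigned_bits(copyidx, 7)


if __name__ == "__main__":
    for idx in (0, 1, 79):
        print(idx, old_address(idx), new_address(idx))
```

Either scheme fits comfortably in seven bits, so the change removes the adder from the address path without affecting the buffer size.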

I also put in some diagnostic LEDs to compare the value in the data in lines and the data out lines for the pre-punch buffer, at the time we are looking at card column 2 in the copyover and have completed the write. I am finding that I am unable to update the pre-punch buffer memory even though I trigger the write and set the write enable flag.

I did see a type mismatch - the replacement VHDL I wrote to infer block ram used a simple signal as write enable, but the black box IP I had been using had a std_logic_vector of length 1 instead. I cleaned up all the code to be consistent and retested.

I still see a mismatch between the pre-punch memory input and output - indicating that it is not storing the right value. It agrees if the input is 0, but if the input is a 1 then we see a 0 on output. Oddly, some conditions give me a 1 on the output, but not the cases where I expect it to happen.

For the time being, I moved to the get special data routine which is failing to pick up the data from the pre-punch buffer (or that data is always blanks). I continue to hit a brick wall on this.

With some lingering suspicion that the block ram may not be working for some obscure reason, I set up the block memory module to force it to use distributed RAM rather than block RAM. This means the memory will be implemented using lookup tables throughout the FPGA rather than the dedicated block ram units on the chip.

Sadly, the results were exactly the same. I have cleared the reputation of the block RAM and toolchain. Back to allowing it to be implemented as block RAM and back again to the struggle to figure this out. I keep trying to change things in the hope that it resolves the problem but no luck so far.

1442 reading perfect, but still copyover problems

SAC INTERFACE FOR ADDING PERIPHERALS TO THE 1130

Restructuring the GUI

-- virtual 1442 --

Made a few tweaks to the copyover logic, particularly the way the buffer memory addresses are created. Now they are registered to avoid passing glitches as various selection signals transition, particularly state variables which might pass through unwanted states momentarily.

This made zero difference in the results - still blank cards coming out of the pre-punch buffer when I fetch them to the GUI. I validated once again that the card image in the pre-read buffer is properly read into core during a 1442 read operation. Still unclear whether this failure stems from the copyover process that moves a card from pre-read to pre-punch, or whether it is a defect in the get special data transaction fetching the pre-punch buffer.

I went back to the diagnostic traps, trying to catch column 2 during a copyover, setting one lamp if the high bit of that column is 1, a different lamp if the high bit was 0. At least one of the lamps will light if I catch column 2 during a copyover.  This tests that the pre-read buffer still has the proper card image at the time that the copyover is running. As you will see, this is a time consuming, exhaustive search for the place where the problem starts.
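The behavior of such a lamp pair can be modeled in a few lines of Python. This is a sketch of the idea only; the class and attribute names are hypothetical and the real traps are FPGA logic driving LEDs.

```python
class BitTrap:
    """Pair of latching lamps that record one bit at a chosen moment."""

    def __init__(self):
        self.lamp_one = False   # latched if the watched bit was 1
        self.lamp_zero = False  # latched if the watched bit was 0

    def sample(self, condition, bit):
        # Latch only while the trigger condition holds, for example
        # "copyover active and column address is 2".
        if condition:
            if bit:
                self.lamp_one = True
            else:
                self.lamp_zero = True

    def triggered(self):
        # If neither lamp is lit, the trigger condition never occurred.
        return self.lamp_one or self.lamp_zero
```

The useful property is the third outcome: with two lamps, a dark pair means the trap itself never fired, which is different from seeing a 0.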

My first tests proved that the data was coming out of the pre-read buffer during the copyover process. The open questions are whether it makes it into the pre-punch, whether it is still there on get special data fetch, and whether something overwrites it before delivery.

Next, I moved the traps to see what the high bit of column 2 of the input to the pre-punch buffer is seeing at the time that the copyover is writing to that buffer. This will tell me whether the data is presented to the pre-punch buffer for writing.

This too worked properly. I have two decks to run into the 1442. One has the 12 row punched in column 2, while the other does not. The various lamps worked as expected, thus the data is set up as input to the pre-punch buffer during copyover.

Next, I set the traps to tell me what is coming out of the pre-punch buffer after it was written, for the high bit of column 2 during a copyover. This will confirm it was actually written into the buffer. The results highlighted where the problem lies. The value in column 2 of the pre-punch buffer output is the same as what was written in for column 1. Some interaction here is introducing the off by one problem. Clearly, that is separate from the other problem where only blanks are returned to the PC.

I changed quite a bit and still see the same outcome - no idea why this is not working, but it could be some flaw with how Vivado is implementing the three built in block ram instances. There is an obscure message about problems with black box instances, which I can't sort out even after reading all the documents and web pages related to the message.

I decided to infer the block ram with VHDL, coding up a module 'blockmem' based on the exact template from Xilinx documents. This way I will avoid the 'black box' confusion and should have discrete memories for all three buffers (two in the 1442 and one used for the carriage control tape of a virtual 1403).

My first try to test this resulted in the spurious lack of bit 4 on any USB transmissions, which I cured by making a trivial change and resynthesizing to replace the corrupted bitstream with a good one.

The bit 4 problem went away, but I am still not seeing the stored value from the pre-punch buffer that I see on its input. Time to look closely to make sure I am actually toggling the write enable - not sure what else could be the problem.

I did more tweaking and checking of the behavior when the hopper runs out, both with and without the last card box checked. Everything is working like a charm. At this point I have a good card reader, but it won't produce the stacker output file that will be essential for punching.

Time to rebuild the copy over process in its entirety. More tomorrow. 

Monday, July 18, 2016

Continuing to debug and improve 1442 device, also working on exhibit at VCF W

SAC INTERFACE FOR ADDING PERIPHERALS TO THE 1130


Restructuring the GUI

-- virtual 1442 --

I need to send the changed status for the 1442 down to the FPGA prior to the end of the current read, punch or feed cycle, in order to have it post properly with the DSW when the operation complete interrupt on IL4 is emitted. This is a small change. The Python code that sends the NR or last card&NR DSW has to be moved before I fetch the pre-punch buffer. That will set the conditions for the device. No change is needed in the fpga.

My first priority is to find and fix the bug that causes cards in the stacker file to be offset by one column. I spent another half hour walking through the logic for both copy-over and fetch, hoping to notice the problem. I rewrote the logic and added a few LED outputs, then went to test again.

Once again I hit the annoying problem of failing to get a valid poll response back. I rebooted the Windows laptop just in case this is a USB driver and Windows issue. When I restarted, I had the same problem. Interestingly, it is related to the spurious problem yesterday when core load wouldn't work.

I sent 0x5d48 to the FPGA as the command word, which is a poll on UCW 11 (device number as implemented in my logic, not the 1130 area code used to address it). However, the reflection back had it as 0x5548. That is, bit 4, numbering from the left starting at 0, was 0 when it should be 1. My core failure yesterday was the same thing: core was loaded but failed to verify because any word with bit 4 set was loaded with that bit as 0, not 1.
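The dropped bit can be picked out mechanically by comparing the word sent with the word reflected back. The helper below is a minimal sketch; dropped_bits is a name made up for this example, and it assumes 16-bit words with IBM-style numbering where bit 0 is the leftmost bit.

```python
def dropped_bits(sent, received, width=16):
    """Return bit numbers (0 = leftmost) that were 1 when sent but 0 when received."""
    lost = sent & ~received & ((1 << width) - 1)
    return [n for n in range(width) if lost & (1 << (width - 1 - n))]


print(dropped_bits(0x5D48, 0x5548))  # prints [4]
```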

The only thing in common is the USB path for transactions - the bit is dropped somewhere between where I load the words to send and when it arrives in the FPGA. That could be Python failures, USB driver software, the USB module on the fpga board or the signal line from the USB to the fpga chip for that bit position. USB protects its packets with CRC checks, so corruption on the USB link itself should be caught, which points to a broken chip or trace on my $200 FPGA board.

The problem arose yesterday morning and went away spontaneously by the afternoon. Let's see if it does the same today. Alternatively, I will:

  1. reload the same bitstream just in case it is a failure in the flash load
  2. make a trivial change to the fpga logic, resynthesize and load
  3. build a transaction in the GUI to echo to the fpga, allowing me to validate the state of the link

The first test delivered the same results. I will save the bitstream file before doing test two above, so that I can make sure any improvement is not a random hardware healing. After I changed a few LED connections but otherwise left the design alone, I loaded the resynthesized logic and the problem went away!
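The echo check in item 3 might look something like the following on the Python side. This is a sketch under assumptions: echo stands for a hypothetical callable that sends one 16-bit word to the fpga and returns the word reflected back, and faulty_echo is only a stand-in that imitates a link dropping bit 4.

```python
def check_link(echo, width=16):
    """Send walking-one patterns through echo() and list the bits that fail to return."""
    bad = []
    for n in range(width):
        pattern = 1 << (width - 1 - n)  # bit n, numbering from the left
        if echo(pattern) != pattern:
            bad.append(n)
    return bad


def faulty_echo(word):
    # Demonstration stand-in: a link that always clears bit 4 (mask 0x0800).
    return word & ~0x0800 & 0xFFFF
```

Calling check_link(faulty_echo) reports bit 4, while a clean link returns an empty list, so a single run at startup would say whether a given bitstream is trustworthy.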

I went back to the saved bitstream to confirm that the problem recurred. This is an artifact of the toolchain. Nothing in my logic should be timing sensitive concerning the input bit 4 from the USB module. Now I have to consider any particular run of the toolchain potentially corrupt, which means that I have to do two runs with minor changes and produce dual bitstreams, in order to know immediately if the flaw is in my logic or spurious Xilinx effects.

My last test yielded a different problem with the copy over process - now I get column 1 repeated three times before the rest of the card is copied. What????? Will need to test for Xilinx disease before I try to troubleshoot this. Made a trivial change and reran the toolchain.

Repeated with two versions of bitstream, so the problem is real. I am going to rethink my method of loading up the response to the get special data transaction. I build up a temporary array of 10 words as the get special logic loop fetches from the punch buffer, then when the loop signals its end I copy from the temporary array to the output words to send back to the GUI.

With the new bitstream loaded, I fired up the system, ran through some cards, and found that I was fetching nothing but blank columns. Time to look closer at my logic to capture the value of the punch buffer during the fetch loop.

Couldn't see why this isn't working, but rewrote it anyway. Still fetching nothing but zeroes. Perhaps there is a problem over in the copy-over process? In case there are glitches in the buffer address lines I registered those. No change, however.

I did some testing of the last card processing logic. It worked exactly as desired when the last card box is checked - on the last card, the DSW has last card set, op complete and not ready. When I turned off the last card checkbox, it didn't stop on the next to last card; it emptied the hopper entirely but did at least show op complete and not ready in the last DSW issued.

I updated my GUI logic to recognize when the reader has one card left in the hopper, if last card is not selected, and turn the device not ready at that point.  I ran out of test time today but will resume tomorrow.
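The rule being added can be summarized in a short Python sketch. The function name and arguments are hypothetical and this is not the actual GUI code; it only captures the decision described above.

```python
def reader_ready(cards_in_hopper, last_card_checked):
    """Return True if the virtual reader should stay ready for another feed."""
    if cards_in_hopper == 0:
        return False
    if cards_in_hopper == 1 and not last_card_checked:
        # Without the last card box checked, stop with one card still in the hopper.
        return False
    return True
```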

PLANNING FOR VCF WEST EXHIBITION

It is less than three weeks until the event. I need to prioritize my work on the SAC Interface box, as I only need the devices which will be part of my exhibition. That will free up time for other work prepping the 1130 and preparing exhibit signage. My wish list is virtual 1442 reader/punch, virtual 2310 disk drive and mirror 1053 console printer devices.

A 12 foot van with lift gate is waiting in my name. Will get it Friday, load up and bring everything to CHM that evening. The show runs Saturday and Sunday, with move-out the final evening. Much easier than my prior plan to use a horse trailer, which would have involved two trips each way, winches, ramps and sweat. Plus, I don't have to disassemble the 1130 into two parts.

 I need to script out a few demonstrations and walk through them, so that everything goes smoothly during the show. With that, I can figure out a schedule for performing them. I anticipate that I will also allow a few visitors to prepare card decks on the PC for execution on the 1130. To support this, I will have Brian Knittel's 1130 simulator available for them to test out everything in advance.

I hope to bring my 1130 replica as part of the exhibit, which will require some work to get it ready. I need to finish some cosmetics - painting the metal top cover for certain, but I hope to build quick side panels and paint them, to improve the visual appearance. I will look for some kind of hardboard to use for those panels. Essential, before I bring it, is to hook up the FPGA, load its firmware and make sure all my wiring is in place.

My 1130 has some problems with the lamps in the display pedestal. At a minimum I have to carefully close up the compartment but if I see a very safe alternative I will replace the missing few.  The lamps go into boards which are very finicky to handle and insert, while the lamps themselves have brittle leads that snap off with the slightest flexing. After some peering, I chose to close it up as-is.

I spent some time creating signage and printing it out. Some discuss the 1130 and the replica, others cover the demonstration times or label the physical boxes present. I then drew out the cutting diagram to turn a 4 x 8 foot hardboard panel into the missing covers on my 1130 replica. These will be sprayed with a sand texture paint as a first coat, to get the pebbling effect of IBM machines, then sprayed with as close a spray paint color as I can find to either IBM Classic Blue or IBM Garnet Rose. I also need some medium tone gray to match the top of the 1130. 

Sunday, July 17, 2016

Core reading is working in virtual 1442 device

SAC INTERFACE FOR ADDING PERIPHERALS TO THE 1130

Restructuring the GUI

-- virtual 1442 --

I made a change to the logic that simulates the emission of IL0 interrupts for each card column of the 1442. I had written the state machine to reflect the timing of a 1442 - 20+ ms wait from starting a read or punch until the first column is under the photocells and then a column arrives every 1.125 ms causing an interrupt. The remaining time, 40 ms as the card comes to a stop in the next station, is modeled before it signals completion and drops the busy status.

My prior logic waited for the program to issue the XIO Read to pick up the results of a card column before it moved on with the delay and emission of the next column interrupt. A real 1442 does not do this - it fires off column interrupts every 1.125ms whether the CPU is able to keep up with XIO Read or not. I now simulate this exact behavior, same for punch except the interrupts are once per 6.125ms in that case.
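The free-running read schedule is easy to tabulate. The sketch below assumes exactly 20 ms before the first column (the real figure is "20+"), 80 columns per card, and that the 40 ms settling time starts at the last column interrupt; the function names are made up for illustration.

```python
def column_interrupt_times(start_delay_ms=20.0, column_period_ms=1.125, columns=80):
    """Times in ms after the read starts at which each column interrupt fires."""
    return [start_delay_ms + i * column_period_ms for i in range(columns)]


def completion_time(times, stop_delay_ms=40.0):
    # Operation complete is signaled once the card settles in the next station.
    return times[-1] + stop_delay_ms


if __name__ == "__main__":
    times = column_interrupt_times()
    print(times[0], times[1], times[-1], completion_time(times))
```

The point of the model is that the times depend only on the clock, never on when the program issues its XIO Read.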

No difference in the behavior after this change. I am going to set up some logic to tell me if the buffer is loaded with the duplicated character or if it is a phenomenon of access during the XIO Read. When I tried to test, I had multiple failures that are hard to understand given the small changes I made. Some look to be hardware problems in the 1130 itself or the SAC interface.

For example, when I try to load core from a file, I get errors where bit 4 is not writing when it should be. It writes just fine when I toggle it in from the console, so it may be an issue with my SAC Interface driver for that bit. Why it would act up now is a mystery.

I attempted a test of the 1442 but encountered timeouts in the transactional engine. This sets me back as I now have to redo all the diagnostics to a very basic level until I find the cause of the stall, then go back to working on the 1442 device behavior.

Well, when I had it all set up with the new low level diagnostics, I powered up and the problems went away! Loading core works well, all 16 bits. The transactions don't time out at all. I attribute this to one of those Xilinx things, where the untrustworthy toolchain produces broken output and it all goes away with a re-synthesis even with trivial changes.

I can turn back to work on the off by one issues, in main core and as the data is copied over from pre-read to pre-punch buffers and then extracted to the stacker files.

My traps to record whether I had a 0 or 1 bit in row 12 of card column 2 failed - since I had a pair, one triggering for a value of 0 and one triggering for a value of 1, it means that the triggering conditions didn't occur. I looked at the case where the Remitter is running, putting out interrupts on IL0 and seeking XIO reads from the software, when the buffer address is 2.

I am not sure how this can fail, since I am loading the buffer even if it might be off by one. I did notice that the sensitivity list for the process had the wrong signals listed, so that might have given me a nonfunctional process for one of the traps. I changed things and tried again.

The traps are still not working, but my fixes have corrected the off-by-one problem for reading into core. I still have a problem with the data fetched from the pre-punch buffer, but that could be a failure in several places:

  • the copy function between pre-read and pre-punch buffers might be shifting the data
  • the get special data transaction may be sending the data incorrectly
  • the GUI program may be handling the returned data wrong before writing to the file

I will troubleshoot the problem by instrumenting the GUI to show me exactly what is returned from the transaction. This will isolate the problem to either the GUI or the FPGA. I set up the diagnostic messages and ran the system again.

The data is coming up shifted over by one - the first word is always a space, then card column 1 shows up in word 2, etc. This puts the problem down in the fpga, but it could be in one of two places - the copyover process or the get special data transaction process. I will look over both bits of logic to see if I spot anything, plus look at what diagnostic LEDs will help me track this down.
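A check like this could be automated in the GUI diagnostics by comparing the card image that went in with what came back. The helper below is a sketch only; find_shift is a hypothetical name and it assumes both card images are lists of 80 column words.

```python
def find_shift(expected, returned, max_shift=3):
    """Return how many columns returned is shifted right relative to expected, or None."""
    for shift in range(max_shift + 1):
        if returned[shift:] == expected[:len(expected) - shift]:
            return shift
    return None
```

A result of 1 corresponds to the symptom seen here, with a blank first word followed by columns 1 through 79, while None means the returned data is wrong in some other way.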

I tried a few changes and ran the tests again, but had the same issue. It is hard to see how the copyover process can get this wrong, since it uses a single index to address both buffers simultaneously and does the write operation to pre-punch several cycles after the address has stabilized. 

If it is the get special data transaction that is malfunctioning, I can't see where it is going wrong. The code that loads up the outgoing data words is a series of if statements driven by the counter which is stepped from 0 to 9 for each transaction and the address to read the buffer is stepped from its start value in concert with the counter. If the problem was here, I would expect to lose every tenth character, not to see the entire card image shifted over by 1. 
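To show why a fault in this loop would be expected to repeat every ten columns, here is a simplified Python model of the fetch. It assumes a full card is fetched as eight back-to-back transactions of ten words each; the function names are invented and the real logic lives in the fpga.

```python
WORDS_PER_TRANSACTION = 10  # from the text: the counter steps from 0 to 9


def get_special_data(buffer, start):
    """Model one transaction: return 10 words beginning at address start."""
    words = []
    for counter in range(WORDS_PER_TRANSACTION):
        words.append(buffer[start + counter])
    return words


def fetch_card(buffer, columns=80):
    # Assumption: one card takes columns / 10 transactions with consecutive start addresses.
    card = []
    for start in range(0, columns, WORDS_PER_TRANSACTION):
        card.extend(get_special_data(buffer, start))
    return card
```

Because each transaction restarts its counter, a mistake at one counter step would damage the same position in every group of ten rather than shifting the whole card.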

I need some inspiration for how to track this one down. It is late so I will stop testing for the day, which has been generally successful. Now that the off-by-one error is gone, reading works solidly. Remaining known issues:

  • data coming out of the stacker is shifted over, so that we have only cols 1-79 with an initial blank column
  • When we read the final card with last card selected, the status is not returned along with the operation complete. This needs a small change to both GUI and FPGA to make this happen

I also need to test:

  • feed only cycles - should copy card contents from hopper to the stacker
  • punch operations of a whole card
  • punch of a short card
  • punch to a non-blank card
  • hopper emptying without last card selected