Saturday, April 5, 2025

Running IBM disk diagnostic 309 against the Virtual 2315 Cartridge Facility - part 5

STILL INVESTIGATING FAILURE TO STAY SYNCED DURING SEEK OPERATIONS

During my testing of the Virtual 2315 Cartridge Facility (V2315CF) I discovered errors in operation when running the disk function diagnostic (309) while operating in real mode. In real mode, the 2310 (13SD) internal disk drive in the IBM 1130 spins a physical cartridge to provide some of the signals for operation while the V2315CF provides the read and write data streams from a virtual 2315 cartridge.

The issue I experienced was a mismatch between the cylinder that the software expected the disk arm to have reached and the data being read. Each sector on a 2315 disk cartridge holds up to 321 words, with the first word traditionally holding the relative sector number. Each cylinder has eight sectors, thus a cartridge with 203 cylinders holds relative sectors 0 to 1623. The software read the first word of a sector and found that it was NOT for the cylinder expected.
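The arithmetic that links a relative sector number to its cylinder can be sketched in a few lines of Python. This is only an illustration of the numbers given above; the function names are hypothetical and are not taken from the V2315CF code.

```python
SECTORS_PER_CYLINDER = 8   # from the text: eight sectors per cylinder
CYLINDERS = 203            # cylinders 0 through 202


def cylinder_of(relative_sector: int) -> int:
    """Return the cylinder that holds a given relative sector number."""
    if not 0 <= relative_sector < SECTORS_PER_CYLINDER * CYLINDERS:
        raise ValueError("relative sector out of range")
    return relative_sector // SECTORS_PER_CYLINDER


def first_sector_of(cylinder: int) -> int:
    """Return the lowest relative sector number stored on a cylinder."""
    if not 0 <= cylinder < CYLINDERS:
        raise ValueError("cylinder out of range")
    return cylinder * SECTORS_PER_CYLINDER


print(cylinder_of(1623))    # the last sector on the cartridge sits on cylinder 202
print(first_sector_of(1))   # cylinder 1 starts at relative sector 8
```

Dividing the first word of any sector by eight therefore tells you which cylinder the data came from, which is the check the diagnostic makes after each seek.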

I set up some direct instructions on the 1130 to seek specific amounts, as well as instructions to read one word of a sector and to check the status of the disk drive. I used that to try various movements of the disk arm. On the IBM 1130, a seek (arm movement) specifies a relative number of cylinders to move and a direction, either forward to higher cylinders or reverse back towards 0.

I found that the data returned by the read would sometimes match where I had asked to move the arm and sometimes not match. The disk drive provides a signal, Home, that indicates that the arm is at cylinder 0. When the arm is at 0, reverse movements are not attempted. However, if the V2315CF believes it is at some higher cylinder while the drive is at Home, there is no way to back it up because reverse seeks won't take place.

One example was a long seek, from the Home cylinder moving 202 forward, then moving 201 backwards. That should leave the arm at cylinder 1, with Home not turned on, and the relative sector number of the sector read would be 8. However, the Home switch was on indicating that the disk drive movement wasn't correct and the data read was off by multiple cylinders. 

I made changes to force the cylinder used by V2315CF to zero if the Home switch of the drive is on, but that doesn't keep the V2315CF and 2310 in sync for other movements. I also made changes to track the interlocked signals between the 2310 and the IBM 1130 disk controller logic, but I still experienced mismatches. 

REPORTING CURRENT CYLINDER ON DEBUG OUTPUT OF V2315CF

The V2315CF sends debug information out on a USB serial link, which I modified so that periodically it reports the current cylinder that it believes the arm has reached. That will help me compare what my instructions executed on the IBM 1130 produce in terms of V2315CF cylinder location. This required a new version of the PICO software. See my prior post - the toolchain is on strike so this is pending.

WATCHING THE 2310 ACTUATOR POSITION DIRECTLY

With the top cover removed from the 2310 disk drive, I can view the scale on the side of the disk actuator which points to the current cylinder location of the arm. That will help me compare what the instructions executed on the 1130 intended with the achieved location of the arm. I need to know what is misbehaving - 2310, 1130 disk controller logic, V2315CF or some combination - before I can formulate a correction plan. 

RESULTS OF THE OBSERVED MOVEMENTS VERSUS MY INTENTIONS

The actual movement of the disk arm exactly matched the positions I intended. Given the 2310 and the IBM 1130 were in sync, I did a few more tweaks to the FPGA logic for the V2315CF. I finally got code that seemed to always stay in sync and be correct. I tried many variations of seeks followed by reads which were all correct. 

I hand entered a program to repeatedly seek in and out by a configured number of cylinders, letting the disk move the arm as fast as it could back and forth. This seemed to also stay in sync with the V2315CF. This raised my hopes that the problems were solved.

I LOADED THE 309 DIAGNOSTIC AND RAN IT AGAIN - STILL ISSUES

In spite of the correct operation for all the seek patterns I threw at the system, the diagnostic still failed the seek testing. The diagnostic does a seek from what it records as its current position to a target cylinder, then reads a sector from that location to check the relative sector number in word 1. 

The first error reported that in an attempt to seek from cylinder 0 to cylinder 199, the sector read proved the arm was at cylinder 196. The code then kept trying to get to 199, with a succession of failures. It sought from 196, where it believed it was, to 199, but the sector showed it reached 200. It sought from 200 back one cylinder to 199, but the sector read showed it was at 198. It sought from 198 up one cylinder to get to 199, but the sector proved it was at 200.

I am particularly confused by the pattern that occurs after the first seek comes up short. Once it is within one cylinder and overshooting, the V2315CF would have to be doing something really bizarre to act that way. There is a signal line 10-20-mil that controls whether the drive does a single or a double step for each seek. The 1130 controller logic will attempt a 10 mil step if the count to move is odd, followed by 20 mil steps until the total movement count is achieved. 

With the one cylinder movements, the only seek will be a 10 mil step, yet the V2315CF seems to be seeing these as 20 mil steps. The initial long seek seems to be counted as an even number of steps, yet the program should have requested 199 steps thus generating an initial 10 mil step followed by 99 steps of 20 mil. For it to believe it got to 196 means it missed one 10 mil and one 20 mil step entirely but saw 98 correctly. 
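To check that reasoning, here is a small Python sketch of the step arithmetic. It is a model of the hypothesis only, not the FPGA logic; the function names and the `ten_mil_seen_as` parameter are made up for illustration.

```python
def step_plan(count: int) -> list[int]:
    """Split a seek of `count` cylinders into steps of 1 (10 mil) or 2 (20 mil).

    As described above: one 10 mil step first if the count is odd,
    then 20 mil steps for the remainder.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    plan = [1] if count % 2 else []
    plan += [2] * (count // 2)
    return plan


def shadow_position(start: int, count: int, forward: bool,
                    ten_mil_seen_as: int = 1) -> int:
    """Cylinder a shadowing device would compute if it counted every 10 mil
    step as `ten_mil_seen_as` cylinders (1 is correct, 2 is the hypothesis)."""
    moved = sum(ten_mil_seen_as if step == 1 else 2 for step in step_plan(count))
    position = start + moved if forward else start - moved
    return max(0, min(202, position))   # cylinders are limited to 0..202


plan = step_plan(199)
print(plan.count(1), plan.count(2))                        # 1 99
print(shadow_position(196, 3, True, ten_mil_seen_as=2))    # 200
print(shadow_position(200, 1, False, ten_mil_seen_as=2))   # 198
print(shadow_position(198, 1, True, ten_mil_seen_as=2))    # 200
print(shadow_position(0, 199, True, ten_mil_seen_as=2))    # 200
```

Counting each 10 mil step as 20 mil reproduces the three retries (200, 198, 200) but gives 200 rather than 196 for the initial long seek, so under this model the first seek must also have lost steps.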

The diagnostic 309 works correctly on an IBM 1130 simulator thus it is unlikely that the code is incorrect when running on the real machine. I am assuming that the 2310 disk is going to the intended cylinder properly, but I can't verify it while the diagnostic is trying to move the arm to 199 because the action is too quick for me to spot. 

New Power Distribution Board built; battles with toolchains to build Pico code

NEW PCBS ARRIVED AND PARTS INSTALLED

The latest design of the Virtual 2315 Cartridge Facility's Power Distribution Board is now ready to use. The PCBs arrived yesterday and I built two of them today. This board handles the routing of power from battery, IBM 1130 main power supply, ride-through time delay relay, battery charger/maintainer out to the power supply board for the V2315CF. 

It also provides for detection of loss of 1130 power, triggering the V2315CF to store the contents of a virtual cartridge back on the microSD card media. This preserves any data written to a virtual cartridge whenever an orderly unload is not completed before power is lost.

RASPBERRY PI PICO TOOLCHAIN CONTINUES TO CAUSE NEEDLESS PAIN AND DELAY

The V2315CF main unit has an FPGA and a Raspberry Pi Pico which cooperate to provide the functionality. I use a couple of toolchains for the FPGA, Vivado and IceCube2, but need a way to compile the C code for the Pico. Putatively the straightforward way to compile the code is to use Visual Studio Code with some extensions provided by the Raspberry Pi people. 

I have slammed my head, figuratively, into a wall trying to get this Visual Studio route to work for me. Since I was developing another project using Raspberry Pi 5 units, which run under Debian Linux, I was able to use an alternate process to compile the code under Raspbian (Debian for Pi). 

Once I completed the other project, I moved the Raspberry Pi 5 unit to my workshop. The design I created on the Pi 5 included a locked down networking environment to work only at the Museum where the units are installed. This means that the unit does not easily connect to the Internet, plus I don't have internet access at the workshop. I can use my iPhone to create a wifi network in the shop, but the Pi still would need to have all its networking hardcoded configurations removed to make use of the connection. 

All I wanted to do was to compile changes to the Pico code, using the Pi 5 which had done this successfully for weeks before. However, suddenly the toolchain decided it needed a new version of a bit of code (Picotool) and attempted in vain to pull it down from GitHub. I can't get it to compile any more because it is obsessed with this sudden need to update the tool.

Whether I fight my way to success with Windows and Visual Studio Code, or I fight my way to success unraveling all the networking configuration changes of the Pi 5, it will mean wasted time and annoyance. Toolchains - the bane of my existence. Imagine if a screwdriver or a hammer were to suddenly refuse to work. Documentation, guides and how-tos often fail to help because the toolchain code in question is frequently morphing, changing from the version used when writing the guides. 

Not a huge headache, but more inefficiency and waste of my time that could have been invested in developing new projects or refining existing ones. 

First of two new gears for 2501 card reader arrived

RESTORING 2501 CARD READER FOR VCF - INFOAGE MUSEUM

The Vintage Computer Federation sent a 2501 card reader for restoration. It attaches to the IBM 1130 system I recently restored for them, in addition to the 1132 printer that I am working on. The intent of the restoration is for them to have a fully working system they can demonstrate at the museum. 

The most significant issue I discovered on initial inspection was the disintegration of two gears inside the machine - a failure mode seen by other museums owning these readers. These two gears were constructed of plastic attached to a metal disk but the material degraded to become so brittle that it crumbles to the touch some sixty years after manufacture. 

I had replacement gears 3D printed by an outside service (CraftCloud), using automotive part grade materials and printing methods. The first of the two gears arrived and I inspected it. Oddly, the part did not match the STL file I sent, either in its height or in the recessed area that would fit around the aluminum disk to which it attaches.


The part I received is symmetric - no recess on either side - and not the full 12mm height of the STL file. 


CraftCloud is a broker that connects requests for printing with various providers who do the actual printing, so I don't believe this is representative of all CraftCloud work, just the particular printing service. The other gear, which I am still waiting to receive, was manufactured by a different provider; I expect it to fully match the design file I sent, unlike the first gear. 

I was able to press the gear on the aluminum disk as you can see above. After I clean up the disk and glue it on, it will turn on the shaft. It should overlap the lip of the disk rather than sitting fully on one side, but after a test fit I believe I can work with this gear as it is.

The test fit above shows how the gear plus disk would fit. I turned the mechanism and found that it worked properly. This gear turns an eccentric cam which moves a 'joggle' plate side to side inside the output card stacker. 

I will put a washer between the aluminum disk and the hub it attaches to, moving the gear outward just enough to fully match the metal gear driving the plastic one. I am cautiously optimistic that this will restore the output stacker operation properly. 

The other gear drives the picker knife that selects one card at the bottom of the input hopper and slides it into the machine towards the first card station. That gear will have more force applied to it and must fit perfectly on the hub that spins it. A version of the design was made by a museum and appears to work properly, although they are still in the midst of their 2501 restoration. 


Friday, April 4, 2025

Running IBM disk diagnostic 309 against the Virtual 2315 Cartridge Facility - part 4

SIMULATION USED TO CHECK OUT THE NEW LOGIC FOR SEEKS

I carefully simulated seek behavior with the significantly changed code for the seek function in the Virtual 2315 Cartridge Facility (V2315CF). This runs under the Vivado design suite, although the actual generation of the code for the FPGA is done with the IceCube2 suite. 

It seemed to work well, but then again the prior version of the code seemed to work properly. In the real world, the signals come from the 1130 disk controller to the Virtual 2315 Cartridge Facility (V2315CF) and are passed out to the 2310 disk drive inside the IBM 1130. The controller and the drive appear to be in sync at all times, but the V2315CF which is just trying to shadow or observe the seeks ends up with the wrong cylinder number quite often. 

FIXES APPLIED TO V2315CF

My new version of the FPGA logic was loaded into the V2315CF. Up came the 1130 system and the virtual cartridge in real mode. I used my manual instructions once again to see if it does better in maintaining solid synchronization with the 2310 disk drive. 

Alas, I saw similarly bad behavior this time, in spite of an entirely different approach to shadowing the disk arm movements. For example, I performed a forward seek from Home to cylinder 201, by setting the seek value to xC9 which is 201 decimal. When I did a read of the sector, I did see the proper relative sector number in the first word of the sector.

However, when I tried to back up by 200, which would put me at cylinder 1, the Home indicator came on. I was either not fully at 202 with the forward seek or moved further than 200 in reverse. I tried some small seek values and saw similar mistracking.

It has been my belief that the physical 2310 and the 1130 disk controller were in sync, but I need to test that directly. If I take the top cover off of the 2310 disk drive, I can see the marked cylinder locations on the side of the disk arm actuator. It will be better if I can be completely certain that the drive and the disk controller are correct before I keep mucking with the logic in the V2315CF. 

Next I need to make the PICO debugging output list the cylinder number, allowing me to quickly see where the V2315CF believes we are and compare that to the arm actuator scale and to the intent of the instructions I execute. 

Running IBM disk diagnostic 309 against the Virtual 2315 Cartridge Facility - part 3

DEBUGGING TO FIND THE ISSUE

I looked for asymmetry - places where the logic in the IBM 1130 disk controller or Virtual 2315 Cartridge Facility (V2315CF) does things differently for seeks in the reverse versus forward direction. The symptom appears to be that a reverse seek doesn't go as far as was requested, while the forward ones 'appear' to work correctly based on very limited data points.

The disk controller in the 1130 acts symmetrically. When a program issues an XIO (eXecute Input Output) instruction pointing at an Input Output Control Command (IOCC) of type control, it is requesting a seek. The first word of the IOCC has a count of the number of cylinders to move from the current point. The second word has the code for XIO Control, the device address of the internal disk, and bit 13 which is reverse direction if 1, forward direction if 0. 
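As a rough illustration, the two IOCC words for a seek could be assembled like this in Python. The count in word 1 and the direction in bit 13 of word 2 come from the description above. The base value 0x2400 for "XIO Control to the internal disk" is an assumption of this sketch and should be checked against the 1130 Functional Characteristics manual. The 1130 numbers bits from the left, so bit 0 is the most significant bit.

```python
# ASSUMPTION: device (area) code 00100 in bits 0-4 and function 100 (Control)
# in bits 5-7 of word 2. Verify against IBM documentation before relying on it.
CONTROL_BASE = 0x2400


def ibm_bit(n: int, width: int = 16) -> int:
    """Mask for bit n when bit 0 is the most significant bit (IBM numbering)."""
    if not 0 <= n < width:
        raise ValueError("bit number out of range")
    return 1 << (width - 1 - n)


def seek_iocc(count: int, reverse: bool) -> tuple[int, int]:
    """Return (word1, word2) of a seek IOCC: cylinder count and control word."""
    if not 0 <= count <= 0xFFFF:
        raise ValueError("count must fit in a 16-bit word")
    word1 = count
    word2 = CONTROL_BASE | (ibm_bit(13) if reverse else 0)
    return word1, word2


for reverse in (False, True):
    w1, w2 = seek_iocc(0x00C8, reverse)
    print(f"{w1:04X} {w2:04X}")
```

With IBM numbering, bit 13 of a 16-bit word is the mask 0x0004, so the forward and reverse control words differ only in that one bit.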

The disk controller logic picks up the count from the first word of the IOCC and stores its twos complement in a count register. It then sets the direction of arm movement based on word 2 bit 13 of the IOCC and begins repeatedly triggering the Access Go signal to the disk drive. Each time it issues Access Go, it increments the count register. When the register gets to zero, the seek function ends. 

The only fine point to modify the above is that the disk drive can move either 10 mils or 20 mils at a time, a step of 1 or 2 cylinders. The controller logic uses the 20 mil setting for every Access Go except the first one if the count of cylinders to move is an odd number. The increment of the count register is 1 or 2 depending on the step size being used. 
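The count register behavior described above can be modeled in a few lines of Python. This is a behavioral sketch, not a transcription of the controller's SLT logic. The 16-bit register width is an assumption, and `run_seek` is a hypothetical name.

```python
MASK16 = 0xFFFF   # ASSUMPTION: model the count register as 16 bits wide


def run_seek(count: int) -> list[int]:
    """Return the step size (1 or 2 cylinders) used for each Access Go pulse."""
    if not 0 <= count <= 0x7FFF:
        raise ValueError("count out of range for this model")
    register = (-count) & MASK16      # store the two's complement of the count
    steps = []
    first = True
    while register != 0:              # the seek ends when the register reaches zero
        # 10 mil (1 cylinder) only on the first pulse of an odd count, else 20 mil
        step = 1 if (first and count % 2 == 1) else 2
        first = False
        steps.append(step)
        register = (register + step) & MASK16   # count up by the step size
    return steps


print(len(run_seek(199)), sum(run_seek(199)))   # 100 pulses moving 199 cylinders
print(len(run_seek(200)), sum(run_seek(200)))   # 100 pulses moving 200 cylinders
```

Nothing in this model depends on direction, which matches the observation that the controller treats forward and reverse seeks identically.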

Nothing differs based on direction in the logic for selecting the step size, triggering each move with Access Go, incrementing the count register and ending the seek. The direction signal is asserted based on bit 13 of the second IOCC word and shouldn't change during the seek operation. Therefore, unless there is some weird flaky component in the disk controller, it shouldn't cause the symptoms seen.

I reproduced the anomalous reverse seek stopping point by issuing XIO Control directly to the drive and saw similar behavior. This seems to exclude the disk function diagnostic 309 code. 

My logic in the V2315CF is minimally different for reverse versus forward seeks - the code that stops the arm from moving past its cylinder extremes of 0 and 202 has to check different limits. I don't see how the code could malfunction but it is one of the open areas.

The V2315CF looks for the leading edge of the Access Go signal to perform a seek. I have a chain of four flipflops to deal with possible metastability but don't do any explicit debouncing. The same signal that comes to the V2315CF is passed along to the 2310 disk drive, which also looks for the rising edge to start a movement of the arm. 
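A software model shows why a synchronizer chain by itself does not protect against a bouncing input. The Python sketch below imitates a four-stage flip-flop chain followed by a rising-edge detector. It is not the actual FPGA code, and the class and function names are hypothetical.

```python
class EdgeDetector:
    """Four-stage synchronizer followed by a rising-edge detector."""

    def __init__(self, stages: int = 4):
        self.chain = [0] * stages   # chain[0] is the stage closest to the input pin

    def clock(self, raw: int) -> bool:
        """Advance one clock; return True on the tick a rising edge emerges."""
        previous_out = self.chain[-1]
        self.chain = [raw] + self.chain[:-1]   # shift the new sample down the chain
        return self.chain[-1] == 1 and previous_out == 0


def count_edges(samples: list[int]) -> int:
    """Count the rising edges seen at the output of the synchronizer chain."""
    detector = EdgeDetector()
    return sum(detector.clock(s) for s in samples)


clean = [0] * 5 + [1] * 10 + [0] * 10
bouncy = [0] * 5 + [1, 0, 1, 1, 1, 1, 1, 1] + [0] * 8
print(count_edges(clean))    # 1
print(count_edges(bouncy))   # 2
```

The chain only delays the signal by a few clocks. A one-clock dropout on the input passes straight through and is counted as a second seek request.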

It appears from my scrutiny of the drive controller logic that the Access Go signal is blocked when the microswitch is activated by the arm sitting at cylinder 0 (Home). Thus the number of actual movements in the reverse direction will be less than the count from word 1 of the IOCC, if the arm reaches Home before completion of the count. 

This shouldn't impact the V2315CF because we get the Access Go signals exactly the same as the 2310 drive does, from the disk controller logic. A possible source of error is if the disk drive reports Home to stop the movement but the V2315CF believes it is at a higher cylinder number. It will then use the residual higher cylinder number to access RAM when an XIO Read is done.

I did plan to make a change to the V2315CF to look at the Home signal from the disk drive. Any time it goes to Home, the cylinder in the V2315CF will be reset to 0. This will sync the V2315CF and the 2310 every time we go to the Home cylinder. However, if the two are not tracking perfectly, the emulation will still not work correctly since the data read and written from the virtual disk cartridge will not match the intended cylinder address except surrounding times it goes to Home. 
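The planned resynchronization can be sketched as a small Python model of the shadow cylinder counter. This is a minimal sketch that assumes the counter is clamped to cylinders 0 through 202 as described earlier; `ShadowArm` and its methods are hypothetical names, not the FPGA design.

```python
MAX_CYLINDER = 202


class ShadowArm:
    """Tracks the cylinder the V2315CF believes the arm has reached."""

    def __init__(self) -> None:
        self.cylinder = 0   # tracking starts at 0 when the machine starts up

    def step(self, size: int, reverse: bool) -> None:
        """Apply one 10 mil (size 1) or 20 mil (size 2) step, clamped to 0..202."""
        delta = -size if reverse else size
        self.cylinder = max(0, min(MAX_CYLINDER, self.cylinder + delta))

    def home_signal(self, home: bool) -> None:
        """Force the shadow cylinder to 0 whenever the drive reports Home."""
        if home:
            self.cylinder = 0


arm = ShadowArm()
for _ in range(8):
    arm.step(2, reverse=False)   # suppose mistracking left the shadow at 16
print(arm.cylinder)              # 16
arm.home_signal(True)            # the drive reports Home
print(arm.cylinder)              # 0
```

This repairs the shadow counter only at Home. Any drift that builds up between visits to cylinder 0 still sends reads and writes to the wrong virtual cylinder.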

To collect more data on the (mis)behavior, I set up the machine to manually issue some seeks and reads. I will move the arm around, in both directions, doing a read of head 0 sector 0 from the resulting location to observe the relative sector number which is written in the first word of the sector. I will do a mix of short, medium and long seeks, including some that attempt to move past Home or cylinder 202.

The most important thing I detected was that this was not an asymmetric failure, occurring only with reverse seeks. For example, starting with the arm at the Home position and the V2315CF in sync with the 2310 disk, I requested a seek of 32 cylinders forward, then did a read. The first word of the sector told me we had reached cylinder 37, four too far as we should have been at 3... Following this with a reverse seek of 32 cylinders, I found the arm back at the Home cylinder according to the disk drive as reflected in the device status word. A read of the sector confirmed this.

I started at Home, performed a forward seek of 201 cylinders so that we would reach the last cylinder on the disk. A read confirmed we were at 202. If the V2315CF saw excess seeks it would still have stopped at 202 so this didn't tell me we performed exactly the correct number. I then executed a reverse seek of 201 cylinders, resulting in the 2310 disk recording our arm back at Home. However, reading the sector showed me at cylinder 16 as far as the V2315CF was concerned. This matched the asymmetric examples I saw earlier.

Now that the disk drive reported the Home cylinder, it was not possible to get the V2315CF to back up any further than cylinder 16. That is because the 1130 disk controller logic, seeing the Home cylinder, does not perform the reverse seeks I requested. 

For some reason, the V2315CF is not tracking the seeks being performed correctly. We are either dropping or adding movements compared to what the disk drive sees and performs. Given the relatively slow logic in the IBM 1130 and the disk drive, what IBM terms the 30ns medium Solid Logic Technology (SLT) family, short glitches from the controller to the disk might not result in any disk movement but be picked up by my logic. Alternatively, I may be picking up noise from adjacent signal lines that triggers the V2315CF.

CORRECTION OF THE FLAW

I determined that I can make use of the interlocked behavior of the 2310 disk drive to more faithfully track disk seeks. Each time the disk drive is commanded to move one or two cylinders, it toggles the Access Ready signal. More specifically, the sequence of signal actions is:
  • Access Go is raised to request a disk seek
  • several milliseconds pass
  • Access Ready is dropped
  • Access Go is dropped by the disk controller
  • perhaps 10 milliseconds elapse
  • Access Ready is raised
Notice the interlocked behavior, with Access Ready dropping to indicate receipt of the seek, Access Go dropped once the drive confirmed receipt, then Access Ready raised to confirm completion of the movement. I can make use of this in the V2315CF logic so that bouncing signals won't trigger multiple seek simulations.
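The handshake above lends itself to a small state machine that counts a seek only when the drive acknowledges it. The Python below is a sketch of that idea under the signal ordering listed above. It is not the FPGA implementation, and the state and class names are hypothetical.

```python
class InterlockedSeekCounter:
    """Counts one seek per complete Access Go / Access Ready handshake."""

    IDLE, GO_SEEN, MOVING = range(3)

    def __init__(self) -> None:
        self.state = self.IDLE
        self.seeks = 0

    def sample(self, access_go: int, access_ready: int) -> None:
        """Feed one synchronized sample of the two signals."""
        if self.state == self.IDLE:
            if access_go and access_ready:
                self.state = self.GO_SEEN        # controller requests a step
        elif self.state == self.GO_SEEN:
            if not access_ready:
                self.seeks += 1                  # drive acknowledged: count it once
                self.state = self.MOVING
            elif not access_go:
                self.state = self.IDLE           # Go vanished unacknowledged: a glitch
        elif self.state == self.MOVING:
            if access_ready and not access_go:
                self.state = self.IDLE           # movement complete, ready for the next


def count_seeks(samples: list[tuple[int, int]]) -> int:
    counter = InterlockedSeekCounter()
    for go, ready in samples:
        counter.sample(go, ready)
    return counter.seeks


normal = [(0, 1), (1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1)]
glitch = [(0, 1), (1, 1), (0, 1), (1, 1), (0, 1)]
bounce = [(0, 1), (1, 1), (0, 1), (1, 1), (1, 0), (0, 0), (1, 0), (0, 0), (0, 1)]
print(count_seeks(normal), count_seeks(glitch), count_seeks(bounce))   # 1 0 1
```

Because the count is tied to Access Ready dropping rather than to the edge of Access Go, a glitch the drive ignores is not counted and a bouncing Access Go is counted only once.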

Tuesday, April 1, 2025

Running IBM disk diagnostic 309 against the Virtual 2315 Cartridge Facility - part 2

RESULT OF DEBUGGING SESSION ON SEEK ERRORS

My first test was performed by setting up manual XIO commands to seek forward, seek in reverse, read 1 word of a sector and to sense the device status word. I used them to manually command the Virtual 2315 Cartridge Facility (V2315CF) and the attached internal 13SD (2310) disk drive in the IBM 1130.

I had set up the two seeks to move x00C8 (200) cylinders, one forward and the other in reverse. Starting from Home, the forward seek should therefore stop on cylinder 200. The home bit (cylinder 0) was on in the device status before the seek and off afterwards.

I then did a seek in reverse of 200 cylinders which should have returned us to cylinder 0. The home bit should be on in the DSW and when I read a sector, the first word should be 0x0000 to 0x0007. 

When it finished, the home bit was NOT on. The read displayed a relative sector number of 0x0020 (32) which corresponds to cylinder 4, not cylinder 0. I did another reverse seek of x00C8 but the value read from the sector showed we had only moved to cylinder 1, not to zero. It took a third reverse seek to light up the home bit and to read 0000 as the relative sector. 

This proved that the diagnostic seek to 0 should have been successful, but somehow it read sectors 112 to 119 instead. Similarly to the experiment with the manual XIO commands I set up, the reverse movement is not moving far enough. 

This rules out the diagnostic as the source of the errors. It may be the IBM 1130 disk controller logic, the V2315CF, the 2310 disk drive or some kind of signal integrity problem on the cabling between 1130, V2315CF and 2310.

The home cylinder status is passed directly from the 2310 to the 1130, thus we are really not backing up as far as the command requests. The V2315CF decision on what cylinder we reached begins with 0 at startup of the machine and then tracks purely by the seek commands from the 1130 controller logic to the 2310 disk. 

I did update the logic so that when the home condition turns on from the 2310 disk drive, we immediately force the V2315CF to set cylinder to zero. This will keep them synchronized whenever the drive goes to cylinder 0. I don't believe this is the cause of the error, but it was something that made sense as I thought about it. 

The data returned by a Read command is based on the cylinder that the V2315CF believes we are at. Thus if the seek logic in V2315CF is not working correctly, it might get out of sync with the cylinder that the arm of the 2310 is flying over.

I am busy tomorrow but should be back at the workshop on Thursday where I can load the new bitstream to the FPGA and test again. 

Feeling a bit more confident in the IBM 1130 core memory

CE STORAGE DISPLAY FUNCTION OF THE MACHINE

The machine has a few switches intended to be used by the Customer Engineer who services the system. One of them is Storage Display. When turned on, pressing Prog Start has the machine loop continuously through memory reading each word. If there were a parity error it would stop the scan, otherwise it just runs forever until you press Imm Stop to turn off the run flag. 

SHOOK MACHINE AND THUMPED CORE MEMORY COMPARTMENT WHILE SCANNING

While running the Storage Display, I jerked the machine around on its casters and even thumped atop the gate and compartment holding the core memory. No parity errors were detected which comforts me. I had feared there were intermittent connections, similar to the few I have already resolved, that would show up as new parity issues.