Monday, May 17, 2021

Recovering "lost" treasure-filled floppy discs with an oscilloscope

There are many good, modern solutions for reading data off old floppy discs and drives. Perhaps the best is the Greaseweazle: it's capable, open source, open hardware, inexpensive and has a vibrant and friendly community behind it. It connects directly to a floppy drive, replacing the floppy disc controller, and reads the disc in great detail. It can handle regular discs or any known copy protection without really breaking a sweat.

But what happens when the Greaseweazle reports data that is heavily corrupted and unreadable? Are we out of luck? What if the unreadable disc contains some historic treasure, such as source code for an iconic game? Do we have to shed a tear and move on?

I recently found myself, along with Phil Pemberton, in this situation. Given the unique nature of source code discs, and the potential for historical interest, we were determined to succeed...

A Greaseweazle F7 Plus. A very capable disc reading device, but it is limited by the information the disc drive sends to it.


How do discs and drives work anyway?

Back in the 1980s, your electronic devices often had outstanding user manuals and maintenance manuals, and this definitely applied to floppy disc drives. Aside from enabling you to service the devices you owned, you often got great information on how the device worked. Here is an extremely informative page from the TEC FB-50x technical manual:

This one page concisely describes the chain of electronic transforms from the disc drive magnetic read head, through to the READ DATA output pin that the disc drive offers to the Greaseweazle. As can be seen, the READ DATA pulses (digital pulses, but with analog timing between them) are a best guess approximation at where the analog waveform peaks exist in the voltages coming off the read head.

So conceptually, a disc drive's read operation is fairly simple. Magnetic flux reversals on the disc surface generate peaks in an analog voltage waveform. The drive outputs pulses corresponding to these peaks. It's up to the floppy disc controller to make sense of the timings between the pulses and turn these into data bits. There are several different encoding schemes used for turning pulse timings into data (and visa versa), such as MFM (used in PCs, Amiga, BBC Micro ADFS) and GCR (used in the Apple II and Commodore 64). The BBC Micro DFS discs we're dealing with have the simplest encoding of them all, FM.

The drive doesn't care at all about the encoding, as long as the time between the pulses is calibrated such that the magnetic reversals can be stored reliably on the physical disc surface.


The Greaseweazle view of some "unreadable" discs

Phil and I had the honor of trying to recover the source code to the iconic BBC Micro game Repton 3. We also received the source code discs for other games by the same author (Matthew Atkinson), including Tempest, The Living Daylights and U.I.M.

The recovery was not entirely smooth. Some of the discs had patches of moderate damage and a few of the discs had patches of heavy damage. Let's have a look at two of those, as seen by the Greaseweazle and rendered by HxC Floppy Emulator Software:

Track 22 of one of the Repton 3 source code discs.

Looking at the first disc above, we've got problems in the middle of one of the sectors on track 22 of the disc. The vertical green bars are the sectors. Unsurprisingly, the red one has the problem. Red means that there's a CRC error: the on-disc CRC doesn't match the calculated expected CRC.

Have a look at the two horizontal blank bands. These are actually "dot clouds" representing timings between pulses coming back from the disc drive. The tighter these bands are, the clearer the signal and the better condition the disc surface is in. The black bands above are ok -- not too ragged -- until they degenerate to almost random noise at the point of the error. This is a serious read failure because the faulty timings are all over the place. We can re-read this disc all we want, but the Greaseweazle will only see noise. Any data in that faulty region is lost to the Greaseweazle.

Out of curiosity, what is wrong with this disc?? Well, a quick inspection of the disc itself reveals the culprit: a dent!

This dent affects around 10 tracks, some severely.

A disc's surface is made of pretty robust and pliable plastic, so we did carefully flatten the dent out without scraping off any disc surface. Unfortunately, this did not improve the signal quality. One theory is that the disc was written after the dent was formed.

Track 0 of one of The Living Daylights source code discs.

Looking at the second disc above, we see this poor disc has suffered a terrible corruption! The first two physical sectors on track 0 are comprised largely of noise. This is one of the worst sector read attempts you are likely to see. An unformatted track looks worse, but not by a whole lot.

What is particularly curious is that the sector headers leading up to the sector data are perfectly intact. Before the grey noise clouds on the white background, you'll see a thin vertical strip of green. That's the sector header and they're 100% fine. So, we're not looking at a contiguous patch of disc damage. Instead, the write of the sector bodies appears to have written noise.  The question of "what happened here?" is definitely interesting so we'll revisit it later.


The analog hookup

So, we're dealing with discs that are sufficiently faulty as to be unreadable to a Greaseweazle. Is the data lost forever? To find out, we need to look at what's actually on the disc surface (as opposed to what the drive sends to the Greaseweazle). This image shows the setup we used to investigate:

My TEC FB-502 drive connected to a Greaseweazle and also connected to my Siglent SDS 1104X-E scope.

In this setup, we're still using the Greaseweazle, but only for control and not data. We use it to seek and to spin up the drive motor. The red, yellow and blue wires are connected to three of the drive's test points. The blue wire is the index signal, which pulses once per revolution of the disc. We use this to trigger the oscilloscope capture. The red and yellow wires correspond to the TP3 and TP4 test points referenced above in the diagram from the TEC technical manual. These are the post-amplified voltages coming off the disc drive read head and are the inverse of one another. The trick is to subtract one from the other to eliminate noise common to both signals.

The oscilloscope is a Siglent SDS 1104X-E. It's considered entry level but fortunately, modern tech is powerful compared to 1980s tech. It can sample an entire disc track (200ms of time) to its internal memory at a sample rate of 25Msamples/s, whereas the on-disc bit rate is 250kbit/s. It has voltage sensitivity down to 500uV/div, which dwarfs the a typical signal strength of 400mV peak-to-peak.

Looking on the scope screen, the top half represents an analog capture of an entire track. There's not much to see other than a couple of cyan blips -- those are the index pulses denoting the start of the track revolution. The bottom half is a zoomed-in view and shows the analog view of actual stored bits on the disc surface. To the left, there is 4us between each peak, and on the right, 8us between each peak. We'll get into the encoding later, but a couple of 4us peaks typically represents a "1" data bit and an 8us peak a "0" data bit.

[For another journey on analog-level disc reading, we recommend this FloppyControlApp blog post]


Different drives, different results

One thing we found interesting is that different disc drives have different behaviors at the analog level. These differences won't affect discs in good condition, but definitely lead to different behavior when trying to read faulty discs.

I captured the analog signal from various drives' test points, for a couple of faulty discs. First up, here's another disc with a dent, my Animated Numbers educational title. The manifestation of a dent generally seems to be a significant loss of signal amplitude. Let's have a look at the analog signal captured around the region of the dent:

Mitsubishi MF503 (full res: click here)

At the low amplitudes in the middle of the dent, my MF503 drive provides a noisy, spiky signal. After filtering the noise out in modern software, the signal still isn't clean and differentiating the 0 bits from the 1 bits isn't trivial.

Mitsubishi MF504C (full res: click here)

This slightly less ancient Mitsubishi drive offers a much cleaner analog signal, although quite weak at the worst of the dent: about 5mV peak-to-peak in places. The signal cleans up further with filtering in modern software. The peaks are far too faint for the drive electronics to resolve anything, but as a human, you can discern 4us peaks from 8us peaks. From a quick eyeball, the data appears to be intact.

TEAC FD-55FV (full res: click here)

The TEAC drive signal appears "thick" in the above image. That's the presence of some high frequency noise. It is probably there because some TEAC technical reference manuals show the analog signal test points as being wired before the low pass filter. Applying a low pass filter in software reveals a reasonable looking signal that is slightly stronger than the Mitsubishi drives.

TEC FB-502 (full res: click here)

This TEC drive does very well! The "worst" peaks are clear, clean and reasonably strong at 50mV peak-to-peak. No further software filtering is needed to get clearly resolvable peaks. Also, this drive has a generally stronger signal than the other drives (at 700mV peak-to-peak) for the undamaged sections of the disc.

Let's move on to another dodgy disc to test our ability to recover data at the analog level. This time, we're looking at "Old Macdonald's Farm" by EDIT, an unarchived disc I'm borrowing that is damaged in an identical manner to the "The Living Daylights" source code disc noted above. Here's the analog view of track 0 of that disc.

Track 0 of Old McDonald's Farm.

On top we have a full-track view. It repeats about 75% of the screen width across. The first two sector bodies have a completely collapsed signal, denoted by a magenta flat-line. The strong blip in the middle of the two sector bodies is the sector header, which is fully intact. Only the bodies are toast. On the bottom we have a zoomed-in view to the point where the signal collapses. At first glace, it's a total signal collapse that looks like analog noise.

Let's see what the different drives make of the collapsed area, if we turn up the oscilloscope sensitivity to 10mV/div and zoom in. We'll specifically look at what the first byte of the sector data looks like on various different drives:

In each case, the oscilloscope is capturing at a resolution of 10mV/div, so these are weak signals. The TEC drive has the strongest signal at around 25mV peak-to-peak, with weaker peaks at 10mV peak-to-peak. The other drives are downhill from there. The TEC drive also produces the clearest, least noisy signal. It is the only drive with any reasonable hope of recovering bits from "Old Macdonald's Farm", because the signal gets more chaotic later into the ruined sectors.

For each signal sample, you'll see the bits of the first sector data byte annotated underneath in red (always the same, obviously, and it happens to be the ASCII character '1', the first character of the disc title which is "1187V1.0"). The BBC Micro discs we are dealing with here are single density discs, which use a very simple FM encoding. Inside a sector body, this encoding uses one "clock" bit to keep track of timing for every data bit. The clock bit is always 1, representing a flux transition and therefore a voltage peak every 8us. To store a "0" data bit, there will only be one peak in an 8us window; to store a "1" data bit there will be two peaks.

The signals above are not only faint but also messed up. If you're having trouble discerning why the "1"s and "0"s are labeled the way they are, that's not surprising. Here's what the analog signal looks like when it's normal:

This is "00101100".

The signal is very clear. You'll notice that the peaks all have the same amplitude and shape as well as regular timing. This is from a floppy disc that is well over 30 years old. As long as floppy discs have been stored well, it's staggering how good the data integrity still is!

In terms of disc drives, one of my drives performs head and shoulders above the others: my TEC FB-502. I don't know why, and it's also one of my older drives. It could be that the disc read head is more sensitive, or able to hover closer to the disc surface. Or it could be the first stage amplifier has a far superior gain and signal-to-noise ratio. Or perhaps a mix of the two. The amplifier is a Hitachi HA16331P, which none of my other drives have. I look forward to finding another drive with this chip and comparing. It's also possible that the components (capacitors, perhaps?) have degraded on some of my drives and not the others. There are a lot of possible variables at play here.

The advice when recovering a signal that barely exists, is to try a few drives and see which is best. My TEC FB-502 is clearly a bit of a beast! The "Repton 3" and "The Living Daylights" discs were captured by Phil on his Mitsubishi drive, which is different still to my pair -- with different read heads and electronics -- which was giving good results.

As a fun bonus, Phil's drive also happened to have a variable resistor to control the rotation speed. This enabled us to crank the speed from the standard 300rpm to 400rpm. Why? Because of physics! The voltages induced in the drive read head are supposed to be proportional to the speed of magnetic flux change, and it did help a little with the signal strength.


A quick audio diversion

The astute observer will notice that we're using "Audacity" for viewing the analog disc waveforms -- which is a well-known audio analysis and editing tool. This may seem strange at first, but it's a perfect fit. It enables rapid zooming and exploration of the waveforms, as well as having versatile low-pass filters and the ability to draw directly onto the canvas to fix iffy peaks. It also has a facility to import CSV files, which is one of the formats you might get from an oscilloscope.

And since we're in an audio tool, we can have some fun!

Have you ever heard what a floppy disc recording sounds like? If not, check out this WAV: click me. It's slowed down so that it's in the range of human hearing. A disc revolution is normally 0.2s, as opposed to the 20s or so sample here. Some people say it evokes memories of their old modems connecting. This sample is actually the dented disc and perhaps my favorite part is the dent, where it sounds like a bit of a wub! wub! Could be a good bass sample for a tune :)


Peak recovery

Having wired up our oscilloscopes to our best drives for capturing analog data, it's time to interpret the resulting waveforms. Of course, we ideally want a program to turn the analog waveform into a series of sector data bytes.

One of my attempts simply filtered the data aggressively with a low-pass filter and then looked at the timings between the peaks. Surprisingly (at least to me), this worked terribly. Upon investigation, there were various contributing factors:

  • The signal is very degraded, so there's a baseline of jitter in where the peaks appear vs. where they should be.
  • Applying aggressive filtering that is sufficient to eliminate "false" peaks also shifts the peaks' positions, exaggerating the jitter to the point that some peak timing deltas become indeterminate between 8us and 4us.
  • In the weak signals, the amplitude of the 4us peaks has degraded more than the 8us peaks. Sometimes, the 4us peaks are almost completely gone and filtering doesn't help with the clarity.
The attempt that bore some fruit was to instead locate the start of the sector data, and continually look at the next 8us of the analog stream. If the voltage appears to generally drift in one direction, then it's a "0" bit, or if the voltage peaks one way and then the other, it's a "1" bit. After the decision, a re-sync is performed to the nearest peak. Visually, it looks like this:

A slide from a recent presentation.

As can been seen, the algorithm isn't fazed by false peaks. The 9th bit, a "0" bit, has a very prominent false peak. But the algorithm summarizes that the voltage trend in the 8us chunk is a big downward drift, which has to be a "0" bit.

Despite the chaotic nature of this waveform, the algorithm recovered all of the bytes in these "lost" sectors. To crib another slide from the recent presentation:

The recovered sector data from "The Living Daylights".

As noted in the slide, we compare the recovered sector data against the recovered on-disc CRC16. The sector data is 256 bytes followed by a 2 byte CRC16. We got a match.

In case you were wondering, one thing we glossed over is "locate the start of the sector data". The start of a sector body (or a sector header) contains a pulse sequence that cannot occur in a well formed sector body or header. Essentially, it contains missing clock bits.

As a recap, recovering this data was looking hopeless not that long ago. The Greaseweazle returned mostly noise, but now we've got all the bytes off successfully! This is incredibly encouraging: it suggests that many discs that appear "lost" at first glance are in fact recoverable. Faint traces of data often remain and these can be scooped up with an analog read.

The dented "Repton 3" source code disc was a bit more of a headache because of how incredibly weak the signal becomes in the region of the dent, as seen by Phil's drive. The peak detection algorithm wasn't able to resolve it, although in retrospect, it may have done if we had further filtered the signal. In the end, we resorted to what seems to work best for the worst signals: human intervention! By drawing in repaired peaks on the most degraded parts, the peak detection algorithm was able to resolve the dented region:

Original signal above, hand-drawn fixes applied below.

In the end, with a combination of the above techniques and tools, we were able to recover 100% of the file data from all of the source code discs.

One final note on peak recovery: I know very little about the field of signal processing, so I can't help but suspect there's a "proper" way to recover faint digital signals from analog captures. Advice most welcome.

[This section wouldn't be complete without a link to this very interesting FloppyControlApp Waveform editor. I haven't tried it as we only just found it as we've finished this write up. I wish I had found it sooner! That said, the pulses I've been fixing have often degraded in different ways to the notes and pictures in the video. Perhaps it's a difference due to the difference physical spacings or materials on 5.25" vs. 3.5".]


Caring for discs

No discussion of recovering data from old floppy discs would be complete without some notes on how to best handle the discs. So far, we've looked at hardware, tools and techniques to read data from discs but haven't considered the condition of the discs themselves, or risks to the discs.

To be clear why it matters: spinning some random disc in some random drive can damage the disc, perhaps even irreparably! Possible problems with the discs include:

  • The physical disc surface may have degraded to the point where it is either brittle, or goopy.
  • The disc sleeve lining may have become brittle and scratchy.
  • The disc surface may have become covered in mold.
  • The disc sleeve lining and/or disc may be dented.
And possible problems with drives include:
  • The drive read heads are dirty or scratched.
  • Internal mechanical components bent or out of place.
  • Excessive friction placed on disc surface.

If dealing with rare or unique discs, a minimum level of care is needed:

  • Use a well-tested drive that doesn't mark discs.
  • Stop and ask for advice if the disc or disc surface is visible damaged, moldy or dirty. There's a good discussion on cleaning discs (albeit 3.5") in this video.
  • Inspect and clean the drive heads before inserting each different disc.
  • (Strongly recommended) - use a "sealed / bubble head" drive, such as the FB-502, which places less friction on the disc.

A more thorough document on the preservation of BBC Micro discs may be found here: click me.

And a cautionary note on what can go wrong when trying to read old discs:

Somewhat horrifyingly, part of the disc surface is now transparent. At some point in the past, the information carrying oxide particles were stripped off the plastic disc. Perhaps this could have been avoided by using a cleaner or more gentle disc drive.


What on earth happened to that The Living Daylights disc?

We've now encountered two discs that are corrupted in exactly the same and very curious way: one of "The Living Daylights" source code discs, and the "Old Macdonald's Farm" educational title. In both cases, the re-write of sector bodies has left them totally trashed.

Our initial theory is that the writing drive could have become misaligned. This is a common ailment and it's where the drive read/write head get knocked slightly off where it should be. One effect would be that every write would be off-center of the track, possibly leading to read difficulties in a correctly aligned drive. This theory was eliminated by taking the "Old Macdonald's Farm" disc and manually misaligning a drive in both directions to look for a better data signal; none was obtained. Another flaw with the alignment theory is that a misaligned drive might have trouble reading the sector headers, which would prevent the write from occurring at all.

Our current theory, and one that is a reasonable match for the evidence, is that the drive, cabling or floppy disc controller chip failed in way that prevented write pulses taking effect. This means that any write would energize the drive head but never reverse magnetic polarity. In effect, this runs a constant current through the write head and is a type of erase across the disc sector. So, how thoroughly does this erase remove traces of magnetic flux reversals? I tried this by doing this type of erase on top of a FM bit pattern of "1111000011110000....":

The erased bits have left a trace behind.

Normal signal strength for this disc and drive is 650mV peak-to-peak. The signal strength of the erased bits is about 50mV peak-to-peak for "0" bits, but the higher frequency "1" bits are down to about 20mV peak-to-peak, and are hard to discern. Also, a much higher frequency oscillation has crept in, possible around the frequency of the low-pass filter's cutoff?

Obviously, the result here is going to vary wildly depending upon drive electronics, drive head, and disc material. But the point is made: erasing over floppy disc data leaves a signal behind on some drives.

We think that the sector bodies were erased by a write fault 30+ years ago, leaving a faint signal that then further decayed and became chaotic over the decades. But the data is still there.


Closing remarks

We're very pleased to have recovered some old BBC Micro source code and games discs that were not recoverable by standard means. In the case of the recovered source code, it is now with the original author of the games, who is checking over the materials and deciding whether to publicly release them.

In the case of "Old Macdonald's Farm", we pieced a working disc back together from a combination of bit recovery plus some cross-referencing with a related title that uses the same game engine and a similar disc layout. The results were posted here: click me. It's hard to exaggerate how really faint and corrupted the "Old Macdonald's Farm" disc signal is. It's much worse than the "The Living Daylights" source code disc. We didn't previously look at what Greaseweazle thinks of it, so here it is:

The signal on the trashed sectors is just desolation as far as the Greaseweazle sees. It's a miracle that the bits are still there and recoverable.


We wish to warmly extend the general offer: for unique discs, rare discs, or discs of historical interest, we're happy to receive them and do our best to recover them using the care and procedures outlined above. The offer extends beyond BBC discs to any 5.25" discs: Apple, Commodore, PC, etc.

We don't have any tooling to release at this time, because none of it is close to production quality. This is an area of ongoing effort as and when we encounter new discs to recover. It's also non-trivial to put together an overall packaging of this work, because disc drives and oscilloscopes are all different. Reach out if you'd like to wade in to this area and don't mind dealing with messiness.

There's plenty of opportunity to further research and improve the hardware and software story here:

  • On the hardware side, the general theme is to replace old 1980s tech with modern tech, i.e. the filtering and peak detection has been moved to modern software. Continuing along this theme, it'd be interesting to replace the 1980s amplifier with a high quality modern amplifier with excellent gain and signal to noise ratios. At that point, we could directly compare the quality of different drive read heads.
  • On the software side, as previously covered, I'm pretty convinced my peak recovery efforts could be a lot better!

Thanks for reading!

Chris Evans and Phil Pemberton...

... and Mr. Macdonald



Monday, December 14, 2020

The cleverest floppy disc protection ever? Western Security Ltd.

Introduction

I've been on a bit of a floppy disc protection odyssey recently. This will probably be the last floppy disc related post for some time, so how better to end things than describe a stroke of genius I came across during my research of BBC Micro disc protection?

In my previous posts, we've already covered (directly or tangentially) some interesting floppy disc protection schemes:

  • Weak bits. [link: Weak bits floppy disc protection: an alternate origins story on 8-bit]. "Weak bits" protection is where a patch of disc surface is left empty of magnetic flux transitions. Absent a signal, the floppy disc drive itself will turn up the gain on its amplifier and see noise, which manifests as non-deterministic digital reads from the disc.
  • Fuzzy bits. [link: Technical Documentation - Dungeon Master and Chaos Strikes Back - Detailed analysis of Atari ST Floppy Disks]. "Fuzzy bits" protection is where magnetic flux transitions are recorded at intervals that are out-of-spec with regards to timing expectations (e.g. MFM). By using intervals right in the middle of a pair of expected values (e.g. 5us instead of the expected 4us or 6us), it's possible to cause the disc controller to return non-deterministic digital reads.
  • Long / short tracks. [link: Turning a £400 BBC Micro (1981) into a $40,000 disc writer (1987)] [see the "Capabilities Unlocked" section]. Long track protection is where flux transitions are written a little bit faster than they should be. This could be the whole track or just part of a track. In either case, the track will appear to contain more bytes than a track normally should. This works because the floppy disc controller usually has a broad tolerance of bitrate, to cater for disc drives naturally spinning at different speeds.
Now, we're going to cover something else entirely, with some unique properties that I think are clever.

Western who, now?


It's not readily possible to find any "Western Security, Ltd." company associated with the BBC Micro. So how do we know such a thing exists? Two reasons. Firstly, when developing my beebjit [link] emulator, I made numerous mistakes and had tons of bugs. While grappling with accurate floppy disc emulation -- in order to load original protected disc images -- I hit this:

Jolly Jack Tar by Sherston Software

Of course, the original author here missed out an amusing case. Either the disc is faulty (no), it's an illegal copy (not really?).... or a buggy emulator. Here's another error screen. Perhaps an earlier version of the protection scheme? The message uses "disk" instead of "disc" and does not entertain the fact the disc may be at fault:

Phantom Combat by Doctor Soft

(At the risk of digressing heavily, here's another screenshot. This is unrelated to Western Security Ltd., but follows the pattern of a dedicated screen for when the disc protected loader fails. This case is interesting because there's an entire decent image hidden from normal view!)

Disc Duplicator 3, if it thinks it's a copy

The second reason we know the name Western is from disc duplicator markers. A disc duplicator marker is a special track written by some commercial disc duplication solutions. It is typically one track after the end of the normal disc, i.e. the 41st track or 81st track of the disc. Despite being "out of bounds", most disc drives will seek at least one track out of bounds. On the track, you'll typically find a single sector header, and a sector body with a CRC error. This is an unusual construction and it sticks out, for example when my discbeast program [link] looks at the Jolly Jack Tar disc:

Discbeast's view of Jolly Jack Tar

There's a date and a text string or two embedded in the duplicator marker. The Jolly Jack Tar instance is:

86 07 01 (1986, July 1st)
523-037E        WESTERN NM,10/256 PROT DUP 5"-48/40 1S SD SS
32173-2Aw

There's one other less common duplicator marker string we see associated with Western security, for example Tens and Units by Sherston Software. It appears to be an earlier prototype version:

85 10 31 (1985, October 31st)
523-037C        BBC NM,10/256 WESTERN SEC. PROTO1 DUP 5"-48/40 1
70422-00w

A recap of disc protection fundamentals

Before diving in to how the Western Security protection works, let's recap the fundamentals of disc protection. The specifics are nuanced but the general principle is pleasantly simple:

  • The disc is easy to read but hard to write.
The disc has to be relatively easy to read. If the disc couldn't be read by standard disc controllers, you couldn't sell the disc because customers would not be able to load it!

And the correct way to make a disc hard to write is to carefully understand consumer disc controller chips and disc drives and make the disc impossible to write with standard kit.

The BBC Micro disc protection journey did not follow this route -- the earlier discs, and many of the later discs, were exotic but not particularly hard to create with the standard disc controller. But you did need a special program. The software companies had their special mastering program and the customers did not, so in a sense the disc was "hard" to write. But then, general copying programs such as Disc Duplicator 3 appeared and suddenly the discs were "easy" to write, if you or a friend had the advanced disc copying program.

However, a few BBC Micro pioneers did write discs that "can't" be written with the standard disc controller, most notably:
  • Simon Hosler's weak bits scheme, as linked above. (And here: [link].) [1]
  • Many discs, such as The Sentinel by Firebird, hid data between sectors. [2]
[1] One of a disc controller's main jobs is to write precise disc surfaces, i.e. to never write weak bits. However in the modern era I did find a way to trick the Intel 8271 controller into writing weak bits. Best I know, this technique was not discovered or used back in the day.
[2] With a bit of clever programming, I believe data between sectors could be written with standard disc controllers. It would involve writing to a track multiple times and resetting the disc controller at critical timing junctures. I've not seen any historic disc copying program that attempts to copy such discs, though.

Analyzing the Western Security Ltd. protected loader

This version of The Wizard's Revenge by Sherston Software features Western Security Ltd. disc protection.

Here's a gratuitous screenshot of the game.

And here's what discbeast makes of the disc:

There's nothing too unusual here. The red block at the end denotes a track with 1 sector that has a CRC error. It's not part of the disc protection; it's the duplicator marker. The green color represents a standard sector layout with nothing particularly exotic present. The "D" indicates the presence of deleted sectors. This is disc protection, but not exotic or hard to copy. However, loading the disc in my beebjit emulator, using the Western Digital WD1770 floppy disc controller and -log disc:commands, yields something highly unusual:

info:disc:1770: command $E4 tr 1 sr 3 dr 1 cr $29 ptrk 1 hpos 1884

Command $E4 is "read track". It is not used in the normal operation of disc loading, or indeed in any other known protected loaders.
It's time to walk through the loader a little to see what is going on. The goal of this post is not to give a detailed walk through of all the gnarly corners of the loader. Like many protected loaders, it is "encrypted" or more accurately, obfuscated. It de-obfuscates itself using many layers of self-modification. At PC $0657, a disc call is set up to load a bit more of the loader from deleted sectors in track 9. This could easily be confused as being the extent of the protection, but it is not. Later, at PC $3CF7, there is a loop:

[ITRP] 3CF7: JSR $3D16
[ITRP] 3CFA: LDA $75
[ITRP] 3CFC: SEC
[ITRP] 3CFD: SBC $3F4E
[ITRP] 3D00: TAX
[ITRP] 3D01: LDA $86
[ITRP] 3D03: STA $3F71,X
[ITRP] 3D06: INC $75
[ITRP] 3D08: LDA $3F4F
[ITRP] 3D0B: CMP $75
[ITRP] 3D0D: BCS $3CF7

It's not expected to be clear from this fragment alone (without all the subroutines), but what is going on here is a loop across tracks 1-8 inclusive. For each track, it is read fully and the length of the track is stored in a table at $3F71. Later, at PC $3CB6, these lengths are checked against a table located at $3D9B. The check is particularly interesting. For the copy protection to pass, a count is taken of each track length that falls within +-1 of the expectation, and this count must be 7 or 8. Immediately we see that this protection is not a precise technology. This is to be expected: discs are analog storage mediums underneath it all and there will be a bit of noise at the beginning or end of the track. From read-to-read, the same disc, drive and controller can and does see slight variations of track length. Looking at the load in beebjit, here are the encountered vs. expected lengths. (0x4A == 3122 bytes.)

My captured disc
3F71: 4A 4D 4B 4A 4B 4D 4D 4A
Expectations
3D9B: 4A 4C 4B 4A 4B 4D 4C 4A

As can be seen, my disc reads almost exactly to expectations except track 2, which was 3125 bytes instead of the expected 3124. That's within the +-1 range so all 8 tracks pass the check (7 or more are required).

To spell it out, the protected loader expects the track length (in FM encoded bytes) for tracks 1-8 inclusive to be 3122, 3124, 3123, 3122, 3123, 3124, 3123, 3122. That's an incredible level of precision and accuracy of writing for 1985 technology. Each byte covers about 64us on a spinning disc, but disc drives of the era would often specify running RPM variation of "less than +-1.5%". At 300rpm, +-1.5% is +-3ms per rotation! In my experience, drives are much much better than that but still! How did 1985 technology cope??

One further note, on the WD1770 vs. Intel 8271 disc controllers. We've focused our analysis on the protection as seen by the WD1770. This disc also loads fine with an Intel 8271 controller. It uses a completely different code path to indirectly calculate track length in this case. It works because the padding bytes up to the end of the track are all 0x00 instead of the usual 0xFF. An "over-read" past the final sector on the track is used, and track length is detected by the change from reading 0x00 to 0xFF.

The genius twist

And then we realize the genius of this scheme. What if they didn't write the tracks with extreme precision, because the technology wasn't there? What if they just wrote the track any-old-how and then examined the result? I think this what they did:

  • Format the disc. (That's it. Just format the disc normally and you've created a fiendishly protected disc.)
  • Read the lengths of tracks 1-8 of the disc you just wrote.
  • On a per-disc basis, generate, obfuscate and write the required table of expected track lengths into the loader on track 9.
What a trick! These discs are savagely hard to copy, even with dedicated hardware of the era. In fact, this trick is so clever that it requires us to revisit the fundamentals of disc protection. Here's our earlier statement:

  • The disc is easy to read but hard to write.
But the Western Security Ltd. protection is quite easy to write -- you just format a disc. So perhaps we should say:
  • The disc is easy to read but hard to recreate.
The Western Security protection is easy to write, easy to read but very hard to copy. One of the advantages of such a protection is that you could use a home computer to write it cheaply if the economics worked for you. That said, many discs I've seen appear to have duplicator markers typically found with expensive disc duplicating machines.

How can we support our theory? One way is to find a second disc of the same title, The Wizard's Revenge, and see if it differs at all. I did find one, and it does. My disc as per above:

Track 0 sectors 10 length 3124 fixups 1 CRC32 2E6B86E9
Track 1 sectors 10 length 3122 fixups 0 CRC32 37E30EC8
Track 2 sectors 10 length 3125 fixups 1 CRC32 F7BDE89B
Track 3 sectors 10 length 3123 fixups 0 CRC32 BB1E32C3
Track 4 sectors 10 length 3122 fixups 1 CRC32 2EC84AF1
Track 5 sectors 10 length 3123 fixups 0 CRC32 58FD732B
Track 6 sectors 10 length 3125 fixups 0 CRC32 43416F5A
Track 7 sectors 10 length 3125 fixups 1 CRC32 B3D8AB10
Track 8 sectors 10 length 3122 fixups 1 CRC32 8D02AC32
Track 9 sectors 10 length 3125 fixups 1 CRC32 75B1F57B
Track 10 sectors 10 length 3123 fixups 1 CRC32 D2B0A1EF
[...]

Additional captured disc:

Track 0 sectors 10 length 3126 fixups 1 CRC32 2E6B86E9
Track 1 sectors 10 length 3129 fixups 1 CRC32 37E30EC8
Track 2 sectors 10 length 3127 fixups 1 CRC32 F7BDE89B
Track 3 sectors 10 length 3127 fixups 1 CRC32 BB1E32C3
Track 4 sectors 10 length 3127 fixups 1 CRC32 2EC84AF1
Track 5 sectors 10 length 3129 fixups 1 CRC32 58FD732B
Track 6 sectors 10 length 3129 fixups 1 CRC32 43416F5A
Track 7 sectors 10 length 3128 fixups 1 CRC32 B3D8AB10
Track 8 sectors 10 length 3128 fixups 1 CRC32 8D02AC32
Track 9 sectors 10 length 3127 fixups 1 CRC32 1F5ABD44
Track 10 sectors 10 length 3128 fixups 0 CRC32 D2B0A1EF
[...]

As can be seen, the sector data (covered by the CRC32s) of the discs is the same except for track 9, which is where the table of expected track lengths resides. And the track lengths themselves are different. In both cases there's track length wobble and in the case of the latter discs, the tracks are all a little longer. This means that the drive that formatted the disc was running at a slightly slower RPM, enabling more bytes to be written in a single revolution.

And there's a second way to support the theory. I just asked Simon Hosler, who wrote a lot of Sherston Software games back in the 1980s, if he remembered anything. He recalls,

"Western Security –  this wasn’t my system but I think it worked like this... It works because each disk drive runs at slightly different speeds. So when a drive creates a track on the disk – after it has added all of the headers and sectors etc, it has a bit of space to fill up, so adds a few extra ‘fill in’ bits that do nothing. The number of fill in bits will depend on the exact speed of the original formatting drive. So will be different if it was copied. I remember this system became a pain because disk drives were changing (?) all the time. [...] the system was called 'Fingerprinting'"

There remains the mystery of how a commercial duplicator could handle making these discs. One thing I've come across is references to a "Freeform" script [link], used in conjunction with Trace duplicators. I haven't been able to find a manual for this scripting language. There are some strange fragments in the memory of the loader after de-obfuscation; could it be related?

3F80: 54 70 00 41 44 44 20 20 20 20 20 72 00 4D 4F 56  Tp.ADD     r.MOV
3F90: 45 54 4F 20 20 74 00 55 4E 49 54 20 20 20 20 75  ETO  t.UNIT    u
3FA0: 00 54 52 41 43 4B 20 20 20 76 00 53 45 43 54 4F  .TRACK   v.SECTO
3FB0: 52 20 20 77 00 54 4F 50 54 52 41 43 4B 78 00 55  R  w.TOPTRACKx.U

Recreating Western Security style protected discs

I happen to have a couple of very different disc drives, so I thought I'd try them out. I formatted a disc in each drive, then had a look at the track lengths (in bytes) read back by a WD1772 disc controller in a real BBC Micro model B.

Contender #1: my Chinon F-051MD drive. It is older, 40 track and single-sided only. The pinch-to-close style door mechanism later fell out of fashion (was it less reliable?). Another way to date this is the large DIP-style main chip, a NEC D8048 which is an Intel 8048 clone! (There's a ROM here to extract one day!)

Contender #2: not my Mitsuibish MF504C, but pretty similar. A much newer drive, 80 track and double sided. Notice the newer chips, cleaner wiring and simpler clamp mechanism. Not visible, but it also has a smaller (next gen?) stepper motor.

Track lengths from Chinon F-051MD:

[...]
Track 1 sectors 10 length 3137 fixups 1 CRC32 67F0950E
Track 2 sectors 10 length 3138 fixups 1 CRC32 67F0950E
Track 3 sectors 10 length 3138 fixups 1 CRC32 67F0950E
Track 4 sectors 10 length 3139 fixups 1 CRC32 67F0950E
Track 5 sectors 10 length 3138 fixups 1 CRC32 67F0950E
Track 6 sectors 10 length 3140 fixups 1 CRC32 67F0950E
Track 7 sectors 10 length 3140 fixups 1 CRC32 67F0950E
Track 8 sectors 10 length 3139 fixups 1 CRC32 67F0950E
Track 9 sectors 10 length 3140 fixups 1 CRC32 67F0950E
Track 10 sectors 10 length 3140 fixups 1 CRC32 67F0950E
[...]

Track lengths from Mitsubishi MF504C:

[...]
Track 1 sectors 10 length 3119 fixups 1 CRC32 67F0950E
Track 2 sectors 10 length 3119 fixups 1 CRC32 67F0950E
Track 3 sectors 10 length 3118 fixups 0 CRC32 67F0950E
Track 4 sectors 10 length 3119 fixups 1 CRC32 67F0950E
Track 5 sectors 10 length 3119 fixups 1 CRC32 67F0950E
Track 6 sectors 10 length 3119 fixups 1 CRC32 67F0950E
Track 7 sectors 10 length 3119 fixups 1 CRC32 67F0950E
Track 8 sectors 10 length 3119 fixups 1 CRC32 67F0950E
Track 9 sectors 10 length 3118 fixups 1 CRC32 67F0950E
Track 10 sectors 10 length 3119 fixups 1 CRC32 67F0950E
[...]

These two drives definitely have unique fingerprints! The older drive has a bit more variance / wobble in individual track lengths. In fact, it seems to fit more bytes per track on the later sectors -- most of the latter tracks are length 3142. The older drive also generally fits 20 or so more bytes per track. This means it is spinning a little slower. By contrast, the newer drive runs like clockwork with minimal track-to-track variance. This might make it a little less suitable for generating varied per-disc fingerprints, but copying the disc on home computing hardware would still be tricky -- the machine used would need to have a similarly precise drive, and the drive would also need to be running at the exact same speed as the drive that made the original disc.

One last look at track lengths, this time from a Phantom Combat game disc, by Doctor Soft. This disc uses Western Security Ltd. protection:

Nothing quite says "we overcooked the disc protection" like a warning sticker about compatibility (left, middle). The sticker states that only Acorn disc ROMs are supported. Unfortunately, non-Acorn disc ROMs were extremely common.

[...]
Track 1 sectors 10 length 3152 fixups 0 CRC32 67F0950E
Track 2 sectors 10 length 3152 fixups 1 CRC32 67F0950E
Track 3 sectors 10 length 3152 fixups 0 CRC32 67F0950E
Track 4 sectors 10 length 3152 fixups 1 CRC32 67F0950E
Track 5 sectors 10 length 3151 fixups 0 CRC32 67F0950E
Track 6 sectors 10 length 3152 fixups 0 CRC32 67F0950E
Track 7 sectors 10 length 3150 fixups 0 CRC32 67F0950E
Track 8 sectors 10 length 3151 fixups 0 CRC32 67F0950E
[...]

This is disc protection hidden in plain sight. These 8 protected tracks are in fact 100% default format tracks (CRC32 67F0950E). There's the correct number of sectors and there are no deleted sectors. discbeast would show these tracks as plain green, aka. move on, nothing to see here!
There's a bit of track length wobble from track to track, but the tracks are also unusually long (the ideal is 3125 bytes). It is not clear whether the tracks were deliberately laid down a little long, but it definitely helps the disc protection. Any disc drive used to attempt to re-create these tracks is unlikely to be spinning at that speed because it is 0.8% slow and the disc drive motor technology is generally better than that. It's interesting to note that simple long track protection is just another variant in the Western Security scheme.
This disc was probably written from a BBC Micro itself, rather than a fancy duplication machine, as evidenced by the lack of a duplicator marker and by the fact the game data sectors show evidence of being written sector-at-a-time, not track-at-a-time. Who knows, maybe there was a deliberate attempt to find a slow drive; some older drives even have a twiddleable variable resistor to vary motor speed.

Conclusion


We've reviewed the most powerful BBC Micro model B disc protection scheme I found, across an audit of most of the copy protected discs released for the machine. It's clever in that you don't need specialized hardware to create the disc, or read the disc. But you're going to struggle to duplicate the disc.
My suspicion is that even modern hardware -- a Greaseweazle perhaps -- might struggle to hit byte-perfect precision, unless it is paired with a high quality disc drive that rotates very consistently (such as my Mitsubishi MF504C).
Talking again to Simon, as far as he's aware, this was the work of George Keeling, who I'd love to contact if only to tell him I think this work was awesome.

Monday, November 16, 2020

Reverse engineering a forgotten 1970s Intel dual core beast: 8271, a new ISA

"As I recall, those two chips were fairly large. And fairly late -- to the marketplace. We had lots of issues with them. [...] Sometimes the elegant solution isn't the best solution." -- Dave House, digressing to the 8271 during "Oral History Panel on the Development and Promotion of the Intel 8080 Microprocessor" [link], April 26th 2007, Computer History Museum, Mountain View, California.

Introduction

Around 1977, Intel released a floppy disc controller (FDC) chip called the 8271. This controller isn't particularly well known. It was mainly used in business computers and storage solutions, but its one breakthrough into the consumer space was with the BBC Micro, a UK-centric computer released in 1981.

Intel 8271 (left) in an issue 3 BBC Micro model B

There are very few easily discovered details about this chip online, aside from the useful datasheet. This, combined with increasing observations of strange behavior, make the chip a bit of an enigma. My interest in the chip was piqued when I accidentally triggered a wild test mode that managed to corrupt one of my floppy discs even though the write protect tab was present! You can read about that here:

A wild bug: 1970s Intel 8271 disc chip ate my data!

Can we reverse engineer a detailed understanding of how it works? What wonders will we find?

Credits

The work described here represents the efforts of a virtual team that came together in an impromptu way to investigate the chip. There are many critical players, and special thanks go to:

  • Nigel Barnes. Bridged the MAME community and BBC Micro community, locating someone with chip decapping skills.
  • BeebMaster. Provided a couple of sacrificial 8271 chips for decapping.
  • Sean Riddle. Decapped the chips and provided beautiful hi-resolution images.
  • ZXGuesser, Diminished. Hardware level reverse engineering, including accurate extraction of ROM bits.
  • Ken Shirriff. Provided notes on silicon macro structures, and located historic documents of great utility.
  • Rich Talbot-Watkins, Chris Evans (me). Calculating the ISA, disassembling the ROM.

The beauty of the beast: recap and decap

So to recap: we had a few indicators that this could be a very interesting chip. These range from the crazy test mode we found above, to the fact the data sheet hints at a large number of internal registers, with only a few documented.

And, to decap:

(See references at the end for very high resolution shots)

This a complicated chip for an FDC! There are a large number of large structures present, densely packed. To illustrate the point, we can compare this chip with the heart of the BBC Micro -- a venerable, legendary 6502, as used in iconic 80s machines and consoles such as the Apple ][, Commodore 64 and NES:

8271 left, 6502 right

Not only is the 8271 larger than the main CPU by a significant amount, it also cost more by all accounts:

"In Acorn's wisdom, they had chosen the Intel 8271 disk controller for the BBC Micro - this controller was probably obsolete even before the BBC Micro was launched. The Acorn disk upgrade comprised the 8271, a handful of standard TTL ICs, an Acorn DFS ROM, and the disk manual. The ICs plugged into unpopulated sockets on the motherboard.

A few hardy folks tried sourcing the parts separately - which was fine, except that an 8271 tended to cost about £109 on its own, if you could find one..." -- [source link]

By contrast, the 6502 allegedly cost around $7.45 in 1981, which is a huge difference factor.

Getting the ROM bits

From a quick eyeball of the 8271 die, there are plenty of PLA-like structures, but perhaps most promisingly, a large rectangle of ROM in the lower left. Right away, we're thinking that this could indeed be a general purpose microcontroller CPU if there's a ROM program. Initial focus fell immediately to extracting the ROM.

ZXGuesser and Diminished put in a Herculean effort, transcribing the ROM bits first by hand and later with some tooling assist and cross-checking. In case you're wondering what transcribing ROM bits by hand looks like, it looks a little like this:

Look closely; those cyan annotations are 0s and 1s!

The ROM matrix is 64x108 bit cells, for a ROM size of 864 bytes.

Once you've gotten the ROM bits, you still have the challenge of assembling them together into a correctly sequenced byte stream. This isn't always as easy as you might think, and is particularly rough when you don't have a reliable way of telling if the extracted bytes are ok, as is the case here. During my research, I found:

Fortunately, the ROM reversers also looked at the row and column circuitry connecting the ROM, and gave a fairly robust opinion that the ROM bits / bytes are:

  • Left-to-right, top-to-bottom.
  • Bits inverted relative to the initial decode in the image above.
  • Bits MSB first.
  • Bytes build from 1 bit per linear 8 bit group.

This gives the first bytes of the ROM as the following:


This happens to be a correct decode, although we couldn't be sure for some time.

Architecture hints

As Rich and myself engaged in efforts to try and disassemble the ROM, without any prior knowledge of the Instruction Set Architecture (ISA), I was fortunate enough to have a conversation with Ken Shirriff (righto.com) about this interesting chip. Against all odds, he found a detailed conference presentation abstract from 1977 [link to copy of full abstract].

This image from the abstract, will look familiar (imagine +90 degrees rotated).


The rest of the abstract is a gold mine. The presentation was titled "A Dual Processor Serial Data Controller Chip" and begins:

"A DUAL PROCESSOR microprogrammable chip that implements a specialized architecture for high-speed serial data controllers will be described. The chip measures 218 mils by 244 mils and contains 22,000 transistors..."

22,000 transistors!! We knew this was a bit of a chonker, but for reference, the 6502 has 3,218 transistors; the 8080 6,000 transistors and the 8086 29,000 transistors. Yes, the 8271 is not that far off an 8086 in terms of transistor count.

This instructive table and diagram are also from the abstract:

This confirms our suspicions that we are dealing with a general purpose CPU, coupled with some acceleration for I/O. The general purpose CPU has the features you'd expect, including a PC, stack, ALU, accumulator, registers, and access to a bus. Further detail in the abstract includes "32 eight-bit registers" and "four-level stack".

The presence of a a bit processor (a co-processor if you like) with 250ns cycle time resolves the elephant in the room with the "general CPU" theory. A general CPU of the era probably wouldn't be fast enough to calculate CRC16 at the disc data rate. However, a specialized PLA-driven bit engine bolted on to the side will do just fine.

Finally on this abstract, we see that it states "To date, two distinct controllers have been microprogrammed: a floppy disk controller and a synchronous data link controller (SDLC)". Well, we've found the FDC, the 8271. Intel's SDLC from the era was the 8273. Enter Sean Riddle once more -- what a star -- and another sacrificed chip or two later, we have our result:

8271 left, 8273 right

Ken also found a patent relating to this dual core design! It's US4152761. [link to a more complete version]. There's a lot of useful architectural detail in the patent, including this interesting summary of key components.

From US4,152,761

As can be seen, there's additional complexity described here that falls outside what we'd expect from a traditional CPU. There's dispatcher requests, priority resolution, a "case" and an "address" register in addition to the program counter and instruction register. This relates to some form of scheduling, which we'll encounter later.

Finally, here's a diagram (courtesy of Ken). It's an early estimation of how the architecture we've seen in the the documents so far might map to the silicon. As a dual processor behemoth, note how there's two ALUs, two sets of registers, lots of PLAs (including a couple to the left and right of the "control" label), and plenty of patches of silicon with unknown function.


From bits to bytes to an Instruction Set Architecture (ISA)

With an overview of the architecture, we have a better idea of what sort of opcodes we might find in the instruction set. The more pieces of the puzzle we have in our heads, the better chance we have of making abstract connections in our attempt to solve a problem with a lot of moving parts.

Initial attempts to disassemble the ROM centered around the theory that the CPU core might be based on the Intel MCS-48 microcontroller series. This series includes the well-known 8048 and has several variants such as the slightly cut-down 8020 (less I/O lines). And why wouldn't this core be based on a further cut-down 8020? It just fits too well: 1K ROM, 64 registers, 13 I/O lines. The timing works well too: the MCS-48 series first launched in 1976, making the core available before the release of the 8271. So the theory went, you'd be crazy as an 1970s Intel employee to not just walk down the corridor and raid the parts bin of your colleagues in order to get a headstart.

The MCS-48 theory turned out to fit extremely well.... but be false. No amount of ROM bit / byte wrangling led to sensible MCS-48 disassemblies.

Back to the drawing board, we looked again at our "probably correct" ROM decode and analyzed it for patterns we'd expect to find in an 8271 ROM, based on our understanding of how the controller works, and how disc recording works in general. Here's a section we found enlightening:

Most significant is the appearance of the constants $FE + $C7 (middle box), then $E5 + $FF (lower box) in a the same context. These constants are the data byte + clocks byte we'd expect to see in the format routine for FM (single density) formatted discs. $FE + $C7 is the sector header marker, and $E5 + $FF is the default fill byte for freshly formatted sectors. As a bonus, we speculated that $E9 (appearing four times in a row) could be a shift left or right opcode, with $FD perhaps being RET. All in all, some strong circumstantial evidence for a correct decode.

After the MCS-48 opcode list failed to match our ROM, we spent a long time trying to derive an alternate instruction set.

But it's hard; it's a bit like one of those jigsaw puzzles where every piece is the exact same color. It's very hard to find a start as there are so many different possibilities to try. Research is too often presented as a neatly packaged result, with no mention of the struggle. I think this contributes to discouraging newcomers. So make no mistake: this was a struggle; it went on for some time; there was swearing; and the characteristics leading to success were perseverance and grit as opposed to any particular technical ability.

The breakthrough was catalyzed when the hardware investigations got a good read on the wiring and content of one of the instruction PLAs associated with the byte processor. The PLA in question is actually the darker rectangular block to the left of the "control" label in the image above. It is populated like this:

Thanks to Diminished!

With the instruction PLAs decoded, it would be possible to trace the effects of each 8-bit opcode by following activation lines to the registers, stack, ALU, etc. However, we were able to hit our breakthrough with only opcode ranges, provided by this PLA. Of specific interest is these ranges from Rich:

1111 1100   FC
1111 1101   FD
[...]
0010 xxxx   20-2F
[...]
0011 xxxx   30-3F
[...]
001x xxxx   20-3F

The specialized $FC and $FD match our theory of CALL and RET, but more pivotal is the range of $20-$3F, split into 2x 16 wide blocks. Immediately, I speculated this could be "move register to accumulator" and "move accumulator to register". This would turn out to be correct. Furthermore, I noted that opcode $0? was common before $2? or $3?, e.g.

05D:   04 22 03 35

This led to the theory that these are register bank selection opcodes. Similar to the MCS-48, the theory is that there aren't enough opcodes for all operations to be able to reference all 32 registers, so there must be a bank select. This also would turn out to be correct. In fact, with these pieces, the jigsaw started to fall in place, and fall into place faster and faster.

Some ROM code examples

For these code examples, bear in mind that the Instruction Set Architecture (ISA) presented here hasn't seen public light of day as far as we know. We've had to invent our own assembly mnemonics, although they're designed to be familiar to anyone familiar with assembly languages in general. Some specific notes:

  • SEL RB is SELect Register Bank, and provides the base index for register access (multiply by 8 to get actual index).
  • Note that all register references are by index, not register number. A register number can be calculated with the index and register bank.

1) READ SPECIAL REGISTER

.command_READ_SPECIAL_REGISTER
32E   00 SEL RB 0
32F   27 MOV A, I7 ; R7 ($07) (param 1)
330   30 MOV I0, A ; R0 ($00) = register index
331   F8 MOV A, [I0] ; read special register value to A
332   02 SEL RB 2
333   36 MOV I6, A ; R22 ($16) (ext RESULT) = A
334   0E SEL RB 14
335   EE SYS 2, RB ; JMP (2,E,0) => $288, .post_command_tidy_up

The READ SPECIAL REGISTER command is one of the simplest. As suspected, "special register" externally is just "index into the 32 registers" internally. We can also note plenty of interesting things:

  • The command setup code, which we'll see in a bit, writes parameters to R7 downwards.
  • Indirect reads and writes are done via special opcodes that indirect through I0 (typically but not always R0), as per the MCS-48 architecture.
  • Some of the internal registers interact with the external bus registers used to interact with the host CPU. R22 is where you write values for them to appear in "ext RESULT", which is read with BBC Micro's 6502 at memory location $FE81.
  • The "register bank" concept appears to be re-used to provide a lookup index to the SYS 2, RB opcode. You might consider this a minor kludge.
  • In order to jump from page 3 ($3??) to page 2 ($2??) at the end, a special SYS opcode is required. Normal JMP / CALL instructions only have an 8-bit operand and can only jump to the current page. SYS opcodes look up a 10-bit PC from a hard coded table.
  • The command exits by jumping to a common exit routine, .post_command_tidy_up. This routine disables most of the chip, including the bit processor, and events. This will be important later.

2) SPECIFY

.command_SPECIFY
066   00 SEL RB 0
067   27 MOV A, I7 ; R7 ($07) (param 1)
068   30 MOV I0, A ; R0 ($00) = destination index
069   F1 03 MOV I1, #$03 ; R1 ($01) = count of 3 extra parameters
06B   14 YIELDTO 4 ; seems to set PARAM callback to
; .wakeup_PARAM_4_SPECIFY, then YIELD
; ** entry point (3, segment 9, routine 4 5 6 7 C D E F)
.wakeup_PARAM_4_SPECIFY
06C   03 SEL RB 3
06D   2E MOV A, IE ; R30 ($1E) (ext PARAM)
06E   00 SEL RB 0
06F   E8 MOV [I0], A
070   00 SEL RB 0
071   80 INC I0
072   A1 DEC I1 ; R1 ($01), decrement count
073   8A 8C BZ $08C                 ; branch if done
; lands at SEL RB 14, JMP RB
; which jumps to .post_command_tidy_up
075   FF YIELD

SPECIFY is also a simple command. It's not strictly a necessary command because it has the same effect as 3 WRITE SPECIAL REGISTER commands to sequential register numbers. It's unclear why it exists, given the space crunch in the ROM and on the silicon, but the datasheet does describe initializing the 8271 using a few of calls to this. Other items of note:
  • SPECIFY accepts and applies the register writes immediately, YIELDing (sleeping / idling) between writes and waiting for the host CPU to supply the next register. This means the SPECIFY command might be partially completed for some time, and never fully completed, depending on how we program the chip. This will be significant later.
  • SPECIFY internally uses R0 as a destination index to write register values, and R1 as a count to complete. This makes it an ideal candidate to do some black box testing to confirm the chip behaves the same as our source code disassembly. Specifically, I tried and confirmed:
    • Pass the first parameter to SPECIFY as 0, indicating we are writing register values starting at R0. Note that as per the code, R0 is used internally by SPECIFY.
    • Pass the second parameter as $22. This will get written to R0 (because R0 currently points to R0) and incremented, leaving $23 in R0. This has "corrupted" R0.
    • Pass the third parameter as $48. This will get written to MMIO register R35 ($23), and should turn on the drive motor.
    • Do not pass a fourth parameter. Despite not “completing” the full four parameters of the command, we expect the writing of the third parameter to have the effect as just described. It does on real hardware.

3) External command register handling

.wakeup_COMMAND
014   02 SEL RB 2
015   F6 00 MOV I6, #$00 ; R22 ($16) = $00 (ext RESULT)
017   CF BF AND I7, #$BF ; R23 ($17) (ext STATUS) !CMD_FULL
019   00 SEL RB 0
01A   F5 01 MOV I5, #$01 ; R5 ($05) = $01 (param 3 default $01)
; That’s 1x 128 byte sector in many cases.
01C   F4 01 MOV I4, #$01 ; R4 ($04) = $01 (param 4 default $01)
01E   03 SEL RB 3
01F   98 05 MOV A, #$05 ; 5 parameters expected
021   6F 18 27 TBZ IF, #$18, $027 ; R31 ($1F) (ext CMD), jump if 5 param                                                command
; matches SCAN, FORMAT
024   2F MOV A, IF ; R31 ($1F) (ext CMD)
025   9C 03 AND A, #$03 ; (CMD & 3) parameters expected
027   00 SEL RB 0
028   31 MOV I1, A ; R1 ($01) = A = parameters expected
029   8A 3D BZ $03D     ; if no parameters, start command
02B   F0 07 MOV I0, #$07 ; R0 ($00) = $07 (put parameters at $07 down)
02D   BC 01 TASK 4, 1 ; select .wakeup_PARAM_1_accept
; 3 9 (1, 3, 9, B) = $035
02F   FF YIELD

This is where to start reading if you want to trace the main entry point into the ROM. The 8271 byte processor wakes up here where the external host CPU writes to the external command register ($FE80 on the BBC Micro). And again some interesting things to note:
  • For commands other than SCAN and FORMAT, the number of parameters expected is actually encoding in to the low order bits of the command byte. Undoubtedly, this saves space in the ROM. This is another case where a simple test can confirm the behavior claimed by the ROM code:
    • Take a command that takes 1 parameter, e.g. READ SPECIAL REGISTER.
    • Increment the command byte and supply that to the 8271 instead.
    • We'd expect the 8271 to require 2 parameters to start the command, but then ignore the second parameter and behave as if the 1 parameter version was used. This is indeed observed on real hardware.
  • Some default parameter values are deployed. These are used in the commands the datasheet calls "128 Byte Single Record Format". Presumably it is again a win for ROM space savings.

4) Full ROM disassembly

The full ROM disassembly, as of time of writing, may be found here: [link]. This copy will remain static. It's mostly complete, and all of the "main" paths are fully traced, including SEEK, READ DATA, WRITE DATA and FORMAT TRACK.

Javascript on a (1970s!) chip


It is time to look at some of the unusual sounding instructions in the ISA:
  • YIELD. $FF. This instruction tells the processor to switch to running the highest priority dispatcher request, or (more likely) go idle if there's no dispatcher request active.
  • TASK t, r. $B8-$BF, 2 bytes. This instruction changes the callback routine for the specified task. Callback routines are specified as an integer that form part of a lookup key into a table of ROM addresses.
  • YIELDTO r. $10-$1F. Change the callback routine for the currently executing task and yield.
  • SYS 0, RB, A. $EC. Jump to the ROM address at key (0, RB, A) in the address PLA. RB is the current register bank from the SEL RB instruction, and A is the accumulator value.
  • SYS 1, RB, R. $ED. Jump to the ROM address at key (1, RB, R) in the address PLA. RB is the current register bank from the SEL RB instruction, and R is current routine value.
  • SYS 2, RB. $EE. Jump to the ROM address at key (2, RB, 0) in the address PLA. RB is the current register bank.
There are quite a few concepts introduced here. If you find them somewhat jumbled, you are not alone. A lot of things about the non-traditional parts of the byte processor CPU appear very ad-hoc to me. Let's look at a concrete example: the handler for the SPECIFY command. This is the .command_SPECIFY code in example chunk 2) above.

To get .command_SPECIFY to run, the host CPU provides the command $35 to the external command register, then 1 initial expected parameter to the external parameter register. The byte processor CPU wakes up at .wakeup_PARAM_1_accept. The context here is:
  • PC == $035
  • TASK == 4
  • SEGMENT == 9, ROUTINE == 1

The fact that writes to the external parameter register wake up task 4 is hardcoded. The PC executed is determined by the routine selected for this task, which was 1 at the time. This is used to create the key (3,9,1), which looks up the address $035 in the address PLA. The address PLA is hardcoded. The fact that task 4 keys lookups as (3,9,x) is hardcoded.

.command_SPECIFY ends with this, because it is the one command that gets executed and then decides it wants to yield to wait for 3 more parameters:

06B   14 YIELDTO 4 ; seems to set PARAM callback to
; .wakeup_PARAM_4_SPECIFY, then YIELD

This unusual instruction sets the callback routine for the current task to be 4. This means that the callback code for the next external parameter register write will be keyed (3,9,4) in the address PLA. That's PC $06C, aka. .wakeup_PARAM_4_SPECIFY.

A good mental model for this is Javascript. All execution is event based; the handler for a given event can be changed (by the handler itself, or an unrelated handler); there is no pre-emption until an explicit yield.

The known events wired in to the byte processor are:
  • 0: Not reversed at all. Appears to be related to the SCAN command, which isn't enabled on the BBC Micro due to lack of DMA.
  • 1: Bit processor event, e.g. found sync, lost sync, CRC error.
  • 2: Bit processor, read byte ready.
  • 3: Bit processor, write byte needed.
  • 4: External parmeter register written.
  • 5: appears unused. (Could be permanently connected to external command register?)
  • 6: Disc drive index pulse.
  • 7: appears unused.

On top of the "normal" Javascript model, there's also the concept of task priorities. These are not visible in the ISA and are presumably hardcoded. One instance this might come in handy in a floppy disc controller is when a disc drive index pulse (once per disc revolution) fires at the same time the bit processor needs a write byte. (It's not common to write across the index, but it could happen.) In this instance, providing the bit processor a data byte is a much more real-time task than handling an index pulse, so it should be handled first.

Yes, this is quite some complexity in addition to the complexity typical in a general purpose CPU. In particular, it provides many additional ways to slice and dice control flow handling beyond the standard JMP / CALL / RET. In fact -- and somewhat painfully -- the different control flow possibilities are mixed together! To briefly get a taste of the horrors, here's the command handler for many of the sector read operations:

.command_READ_DATA
.command_READ_DATA_AND_DEL_DATA
.command_VERIFY_DATA_AND_DEL_DATA
0A7   FC C9 CALL $0C9       ; .do_common_path_from_seek
        ; very gnarly because this CALL does a YIELD
        ; context on RET is from
                                ; .wakeup_BITPROC_EVENT_1_check_header_crc
                ; if we get a matching sector header,
                                ; the RET at $255 fires
0A9   BA 0B TASK 2, 11 ; select .wakeup_BITPROC_READ_10_11_count_GAP2
        ; (3,5,(10,11)) = $1B8
0AB   12 YIELDTO 2 ; select
                                ; .wakeup_BITPROC_EVENT_2_check_for_data_marker
; (3,4,2) = $2D0
; and YIELD

As can be seen, a CALL subroutine is doing a YIELD without unwinding the stack. This should probably be considered an anti-pattern? The stack is a shared resource across all tasks, so you'd better hope that two different tasks can't trip over each by doing this at the same time. There's also the complexity that when a CALL returns, various aspects of the execution environment (task, routine, RB, etc.) may well have changed.

All this makes the ROM very hard to follow for the read and write paths, even with a fully disassembled and commented ROM!

Unusual features of the ISA


1) Lack of symmetry

These are the opcodes we found relating to incrementing, decrementing, adding and subtracting:

$80-$83: INC Ix
$84-$87: ADC A, Ix
[...]
$A0-$A7: DEC Ix
[...]
$90-$93: ADC Ix does INC Ix if carry
$94-$97: SBB A, Ix

This is interesting because there is an asymmetry between which registers can be incremented vs. decremented. Some registers cannot be incremented at all, even with register banking, because banking operates in multiples of 8 and there are only 4 increment opcodes.

One theory is that the chip designers were short on silicon space, and trying to get away without add and subtract support in the ALU. There's only circumstantial evidence to support this, such as the hacked-out ranges in the opcode space; the code generally being add / subtract free except for the code implementing disc drive seek; no compare instruction found (yet -- there are unknowns); the use of XOR to do equality checks at $240 and $24F; and the presence of only direct equality checking or mask based checking for the most common branch opcodes (see item 2) directly below). But it's fun to speculate wildly isn't it?

2) 3-byte opcodes

An entire quarter(!) of the opcode space is devoted to four conditional branch instructions:

$40-$4F:        BEQ Ix, #imm, abs branch if Ix == imm
$50-$5F:        BNE Ix, #imm, abs branch if Ix != imm
$60-$6F:        TBZ Ix, #imm, abs test and branch if zero
$70-$7F:        TBNZ Ix, #imm, abs test and branch if not zero

These opcodes are all three bytes, which is a departure from Intel's other microcontrollers of the era. It's a lot of opcode space, but they can do a lot with a little, for example this piece of code from .do_seek.

13E   62 01 3E TBZ I2, #$01, $13E   ; MMIO R34 ($22) (drive in), wait until CNT/OPI
141   72 01 41 TBNZ I2, #$01, $141  ; MMIO R34 ($22) (drive in), wait until !CNT/OPI

For a certain (less common) seek mode, these 6 bytes are sufficient to busy loop while waiting for the drive to start the seek, then acknowledge finishing it.

These three byte opcodes were found by the previously mentioned "jigsaw" analogy. Once the command entry function was fully disassembled apart from just three bytes at $021, there was only one clean possibility that fitted the required behavior.

3) No timers, port I/O or IRQs

Presumably to save silicon, the byte processor CPU has done away with timers, dedicated port I/O instructions and IRQs. The MCS-48 has all of those.

They're not needed, though. Delays, such as the millisecond range delays for seek steps, are simply timed via busy loops.

4) Decide your own adventure

At time of writing, our opcode list is here: [link]. It is not complete. The one part of the 8271 ROM we have not disassembled, SCAN handling, uses at least opcodes $9A, $9B and $8C. (We've ignored SCAN because it needs DMA wired up, which is not the case in the BBC Micro application.) Furthermore, the PLAs suggest that other opcode ranges not seen in the 8271 ROM might do something. This includes $A8-$AF and $B0-$B7. Feel free to go and have a look!

Interface to the bit processor


Now that we've got the byte processor understood and disassembled, it's time to turn our attention to interface to the bit processor and its behavior. After all, it's the bit processor that is wired to the disc drive control and data lines!

Our initial assumption was that the byte processor CPU would, like the MCS-48, have some form of port I/O instructions. This turned out to be false. The reality is simpler: it uses MMIO (Memory Mapped Input/Output). This means that access to certain register indexes change registers in the bit processor instead of the byte processor. It's quite simple: 0-31 references the byte processor registers. And the range 32-39 references bit processor registers. For simplicity of decoding, the bit processor's 8 register references are mirrored 4 times across the range 32-63. The entire 0-63 range is then mirrored 4 times in the entire addressable range of 0-255.

Registers 0-63, read on a real BBC Micro, illustrating mirroring of the bit processor registers

Furthermore, the context of bit processor references in the byte processor code makes it clear that the bit processor interface is very simple. It is so simple we didn't feel the need to reverse engineer the bit processor further. The bit processor register assignments are as follows:
  • 0: control register, 4 bits
    • Bit 0 (0x01) => gather CRC (?)
    • Bit 1 (0x02) => finish CRC (?)
    • Bit 2 (0x04) => 1 for read, 0 for write
    • Bit 3 (0x08) => idle state
  • 1: status register, indicates sync data byte type, CRC error, etc.
  • 2: drive input, read for drive status
  • 3: drive output, controls step / write / etc. lines
  • 4: clocks output byte
  • 5: data output byte
  • 6: data input byte
  • 7: unused? (returns 0xFF, and byte processor ROM relies on this!)
The astute reader might ask: does this mean the bit processor can be programmed directly with the WRITE SPECIAL REGISTER command, since the bit processor registers are MMIO? And the answer is yes! There are severe caveats however:
  • The generic command entry code corrupts bit processor state on entry.
  • The generic command exit code resets the bit processor and associated callbacks on exit (but strangely and usefully not for WRITE SPECIAL REGISTER).
  • The latency of WRITE SPECIAL REGISTER is terrible.

That last bullet, the poor latency, is unfortunate. It's about 211us, which means there's zero chance for tricks like writing data to the disc a clocks + data byte at a time. The latency is large because the byte processor executes a large number of instructions on the command entry path, doing things like checking and caching drive status, as well as reading parameters one at a time, checking if the selected drive changed, etc.

Writing the drive output register directly is useful, though. I did this for my trick to write "weak bits" directly using the 8271! See my blog post about weak bits for more details. [link]

Writing the unwritable


Now that we know how this monster works, it is of course time to turn our attention to mischief. Can we make the chip do things it is "not supposed" to be able to do? Of course we can, and as usual, it involves disobeying the datasheet:


We are specifically going to disobey the sentence that states "Issuing a command while another command is in progress is illegal." We are now equipped to see exactly what happens if we do this, by reading and reasoning about the code. The callback called when the external command register is written gets on with its job without regard for whether a command is in progress, so side effects will be:
  • The internal command register itself is corrupted, i.e. mismatched with the currently executing command. It is referenced from time to time so this may be useful to us.
  • The illegal command will change internal register values, which may impact the execution of the current command.
  • The illegal command may change or disable callbacks, or reset or reconfigure the bit processor.
Taking these things into consideration, we are going to try and write an arbitrary FM bit stream. Achieving this will enable us to recreate copy protected disc surfaces that are not supposed to be writeable with the 8271. Remember, kids:


We're going to attempt this by separating out how we write the data bytes and how we write the special sector mark clock bytes. Writing a full track of data bytes is easy, but useless on its own. We can do this by:
  • Formatting a track with a single sector header.
  • Issue a WRITE DATA command for that sector, of size 8192 bytes. A track is only 3125 bytes (with a perfectly calibrated drive), so:
    • At the first wrap around, start writing the track of bytes we want.
    • 3125 or so bytes later, at the second wrap around, abort the command by resetting the controller.
This track of data bytes alone will be useless. If you try to read a sector from it, you will get error $18, aka. "sector not found". The special clock byte markers required to identify sector headers and sector data will be missing.

The way FM encoding works is very simple: it alternates clock bit, data bit, clock bit, data bit, ... every 4us. Normally all the clock bits are 1, to maintain timing and keep the drive electronics happy. But for a sector header or data marker, a few clock bits are left out so that the floppy disc controller can locate things in a bitstream where it is not sure where it is.

So, while the WRITE DATA command is running, we're going to sneak in some parallel SPECIFY commands to "corrupt" registers for the running WRITE DATA command, without disturbing it. Specifically, we're only going to corrupt the clocks byte register (MMIO R36) at the precise times necessary to write clocks byte values other than 0xFF. The overall operation looks like this:


By some miracle, this scheme does work, and it easily replicates discs that no BBC Micro copier back in the day could come close to. One example of a tough disc, for some reason, is The Sentinel [link]. The disc has a pretty label and packaging so of course, time for a gratuitous image:


There are plenty of quirks and caveats to get this working. To briefly note them for completeness:
  • The SPECIFY command was used instead of WRITE SPECIAL REGISTER to help with latency concerns. The SPECIFY command can be "primed" by passing the command and the first parameter, such that the second parameter (first register value to write) executes the write and executes with low latency at just the right moment.
  • The WRITE SPECIAL REGISTER command exits without resetting the bit processor or clearing all the I/O callbacks. Unfortunately, the same cannot be said for SPECIFY. Fortunately, we don't have to exit the SPECIFY command! It starts having the useful side effect of writing internal registers before it is complete. And then you can later restart it again and again, never completing it.
  • The act of command dispatch corrupts the MMIO clocks byte register! This is very unexpected but it uses MMIO R36 as a temporary storage location while it is calculating which disc drive (0 or 1) is selected, and whether it changed. It is worth looking at:
.parameters_complete_launch_command
03D   FC 5D CALL $05D ; .read_drive_status
03F   BC 00 TASK 4, 0 ; select .wakeup_PARAM_0_no_action
; (3,9,(0,2,8,A)) => $030
041   03 SEL RB 3
042   2F MOV A, IF ; R31 ($1F) (ext CMD)
043   CF 3C AND I7, #$3C ; R31 ($1F) (select bits + param count masked out)
045   9C C0 AND A, #$C0 ; A now contains drive select bits
047   04 SEL RB 4
048   34 MOV I4, A ; MMIO R36 ($24) (???) (temp storage?)
049   E3 XOR A, I3 ; MMIO R35 ($23) (drive out)
04A   9C C0 AND A, #$C0
04C   8A 53 BZ $053     ; only update drive out if select bits changed
; .command_dispatch
04E   98 20 MOV A, #$20 ; bit for side select (drive 0 vs. 2)
050   C3 AND A, I3 ; MMIO R35 ($23) (drive out)
051   D4 OR A, I4 ; MMIO R36 ($24) merge back select bits?
052   33 MOV I3, A ; MMIO R35 ($23) (drive out)
; only drive select bits and side select kept
; clears write enable, head load, and others
; matches data sheet

The line in orange is the one in question. I annotated it as iffy with ??? when I first disassembled it as it looked wacky. But, it's correct and a real machine exhibits the corrupted clocks precisely as described by the code. The corrupt clocks value, $40 if writing to drive 0, is shown in orange in the above diagram. Once you know it's there, and why, the easiest way to navigate around it is to arrange for the clocks corruption to land where it doesn't case problems. When paired with a $FF data byte, it doesn't create weak bits on the disc surface, and the controller actually seems to skip over it on read.
  • The write I/O path, by some stroke of good fortune, resets the clocks byte to $FF on every write I/O callback. This saves us a lot of trouble:
.wakeup_BITPROC_WRITE_3_set_clocks_and_count_host_bytes
318   F4 FF MOV I4, #$FF ; MMIO R36 ($24), standard clocks
[...]
  • Careful timing is needed. There's a pipeline of bytes from the external data register to the internal bit processor data byte register to the actual output pulse machinery. This needs to be accounted for.
Combined, these quirks convince me that I'd have never gotten this going, or gotten close, without a thoroughly reverse engineered and disassembled ROM.

Other successes and failures


Other things enabled or demonstrated based on careful reading of the ROM code:
  • beebjit now has a much more accurate 8271 driver.
  • Unexplained weirdness trying to read a sector with logical track id $FF has been explained as an integer overflow in the bad tracks handling.
  • Sectors on a non-zero physical track, but with a zero logical track id, were believed "impossible" to read. I'm able to read them by using the "command within command" trick. Once a READ DATA command is safely underway, including having processed the seek request and gone idle, it's possible to use WRITE SPECIAL REGISTER to change R7 (command param 1, which is the requested track) to zero, and have the sector read fine. The issue is that references to logical track 0 in the seek code are always treated as a mandate to find physical track 0. You need to bypass that.
Things not achieved:
  • Reading or writing MFM (double density) -- looks fundamentally impossible. Does not appear to be a capability in the hard-wired bit processor.
  • Executing arbitrary code on the chip. This is a shame, as the byte processor is a capable CPU! Who knows what we could use a little co-processor for? Things making this hard include:
    • Separation of code and data in separate address spaces.
    • No references to the stack possible outside of CALL and RET.
    • Indirect jump targets stored in a read-only PLA. (Microsoft CFG? :)
    • ... and yes, curiously, these accidental defenses all sound similar to defensive technologies investigated or deployed since year 2000. So, the 1970s called and...

Summary

The 8271 has exceeded our expectations! Where to start? It's a massive chip, encompassing dual cores and a Javascript like execution model. Remember, this was the mid-1970s. Its general purpose CPU runs an Intel instruction set architecture that I don't believe has been publicly documented until now. It's not every day we get the treat of a new Intel ISA.

We never got to the bottom of the crazy test mode that started this whole investigation. There's no trace of it in the byte processor ROM, so it must be handled by some other component on the silicon. Something to investigate for another day perhaps.

Having seen the complexity of the chip, I must confess to a feeling of surprise every time my BBC Micro successfully loads a disc.

Epilogue

Given the 8271 issues with cost, heat, supply chain, complexity, and lack of MFM, it wouldn't be surprising if Intel had had enough with the architecture behind the 8271 and 8273. Intel staggered bravely forward with the 8272, which introduced MFM support and... hang on, let's have a look at a decapitated one of those...

Left to right: Intel 8271, NEC D765, Intel 8272

This is very cheeky! The 8272 die may say "8272 (c) Intel 1979" but it is the same die as a NEC D765, stamped "NEC D765B". It looks like Intel may have licensed the NEC design. The NEC doesn't appear much smaller in terms of die size, but the layout looks much simpler and less busy. Bizarrely, Intel appears to have fabbed the 8272 much larger than the NEC. 

Extra references

  • StarDot forum thread where the investigation unfolded: [link]
  • Sean Riddle's decap page: [link]
  • beebjit's 8271 driver: [link]
  • 8271 tests and tools (warning: rough) for the BBC Micro: [link]
  • Live document for 8271 disassembly: [link]
  • Very high resolution die shots of the 8271 and 8273 (beware, will hang browsers!): [link]