Recently I watched Trixter’s latest Q&A video on YouTube, and at 26:15 there was a question regarding PC emulators:
That got me thinking: I have some things I’d like to share on that subject as well.
First of all, I share Trixter’s views in general. Although I am a perfectionist, I am not sure that perfectionism is my underlying reason in this case. I think emulators should strive for accuracy, which is not necessarily “perfection”. It is more of a pragmatic thing: you want the emulator to be able to run all software correctly.
However, that leads us into a sort of chicken-and-egg problem, which revolves around definitions as well. What is “all software”? What does “correctly” mean? And in the case of a PC emulator, there’s even the question of what a “PC” is exactly. There are tons of different hardware specs for PCs, even if you only look at the ones created by IBM themselves, let alone if you factor in all the clones and third-party add-ons. I will just give some thoughts on three subjects here: What hardware? What software? What accuracy/correctness?
While the PC is arguably the most chaotic platform out there, in terms of different specs, we do see that emulators for other platforms also factor in hardware differences.
For example, if you look at the C64, at face value it’s just a single machine. However, if you look closer, then Commodore has always had both an NTSC and a PAL version of the machine. Their hardware was similar, but due to the different requirements of the TV signals, the NTSC and PAL machines were timed differently. This also led to software having to be developed specifically for an NTSC or PAL machine.
As a result, most emulators offer both configurations, so that you can run software targeted at either machine. Likewise, there are some differences between revisions of the chips, most notably the SID sound chip. While the revisions are compatible at the software level, the 6581 version sounds quite different from the later 8580 version. Most emulators therefore let you select from various chips, so that the sound most closely matches that specific revision of the machine.
The PC world is not like that, however. There were so many different clone makers around, and most of their clones were so far from perfect, that it would be impossible to configure and emulate every possible configuration. At the same time, the fact that basically no two machines were exactly alike also makes it less relevant to emulate every single variation. As long as you can emulate one machine ‘in the ballpark’, it gives you exactly the same experience as real hardware did back in the day.
So the question is more about defining which ‘ballparks’ you have. I would say that the original IBM PC 5150 would make a lot of sense to emulate correctly, as a starting point. This is the machine that the earliest software was targeted at, and also the machine that early clones were cloning.
The PC/XT 5160 and 5155 are just slight variations of the 5150, and the differences generally do not affect software; they only matter for physical machines. For example, they no longer have the cassette port, and they have 8 expansion slots with a slightly narrower form factor than the 5150 did.
Likewise, because most clones of that era are generally imperfect, and could not run all software correctly, they are less interesting as an emulator target.
Another two machines that make an interesting ballpark are the IBM PCjr and the Tandy 1000. They are related to the original PC, but offer extended audio and video capabilities. The Tandy 1000 was more or less a clone of the PCjr, but the PCjr was a commercial flop, while the Tandy 1000 was a huge success. In practice, this means a lot more software targets the Tandy 1000 specifically, rather than the PCjr original.
From then on, the PC standard became more ‘blurred’. Clones took over from IBM, and software adapted to this situation, by being more forgiving about different speeds, or slight incompatibilities between chipsets and such. So perhaps a last ‘exact’ target could be the IBM AT 5170, but after that, just “generic” configurations for the different CPU types (386, 486, Pentium etc) would be good enough, because that’s basically what the machines were at that point.
For me the answer to this one is simple: One should strive to be able to run all software. I have seen various emulator devs dismiss 8088 MPH, because it is the only software of its kind, in how it uses the CGA hardware to generate extra colours and special video modes. I don’t agree with that argument.
The argument also seems to be somewhat unique to the PC emulator scene. If you look at C64 or Amiga emulators, they do try to run all software correctly. Even when demos or games find new hardware tricks, emulators are quickly modified to support this.
I think this is especially relevant for people who want to use the emulator as a development tool. In the PC scene, it is quite common that demos are developed exclusively on DOSBox, and they turn out not to run on real hardware at all. Being able to run as much software as possible is one thing. But emulators should not be more forgiving than real hardware. Code that fails on real hardware, should also fail on an emulator.
An interesting guideline for accuracy/correctness is to emulate “any externally observable effects”. In other words: you can emulate things as a black box, as long as there is no way that you can tell the difference from the outside. At the extreme, it means you won’t have to emulate a machine down to the physical level of modeling all gates and actually emulating the electrons passing through the circuit. Which makes sense in some way, because the integrated circuits that make up the actual hardware are also black boxes to a certain extent. Only the input and output signals can be observed from the outside.
However, that is difficult to put in practice, because what exactly are these “externally observable effects”? It seems that this is somewhat of a chicken-and-egg problem. A definition that may shift as new tricks are discovered. I already mentioned 8088 MPH, which was the first to use the NTSC artifacting in a new way. Up to that point, emulators had always assumed that you could basically only observe 16 different artifact colours. It was known that there was more to NTSC artifacts than just these fixed 16 colours, but because nobody ever wrote any software that did anything with it, it was ignored in emulation, because it was not ‘externally observable’.
Another example is the C64 demo Vicious Sid. It has a “NO SID” part:
It exploits the fact that there is a considerable amount of crosstalk between video and audio in the C64’s circuit. So by carefully controlling the video signal, you can effectively play back controlled audio by means of this crosstalk.
So although it was known that this crosstalk exists, it was ignored by emulators, as it was just considered ‘noise’. However, Vicious Sid now does something ‘useful’ with this externally observable effect, so it should be emulated in order to run this demo correctly. And indeed, emulators were modified to make the video signal ‘bleed’ into the audio, like on a real machine.
This also indicates that there may be various other externally observable effects that are already known, but ignored in emulators so far, just waiting to be exploited by software in the future.
Getting back to 8088 MPH, aside from the 1024 colours, it also has some cycle-exact effects. These too cause a lot of problems with emulators. One reason is the waitstates that can be inserted on the data bus by hardware. CGA uses single-ported memory, so it cannot have both the video output circuit and the CPU access the video RAM at the same time. Therefore, it inserts waitstates on the bus, to block the CPU whenever the output circuit needs to access the video RAM.
This was a known externally observable effect, but no PC emulator ever bothered to emulate the hardware to this level, as far as I know. PC emulators tend to just emulate the different components in their own vacuum. In reality all components share the same bus, and therefore the components can influence each other. It is relevant that waitstates are actually inserted on the bus, and are actually adhered to by other components.
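To make the difference concrete, here is a minimal sketch of a shared bus on which the video circuit blocks the CPU. It is a toy model, not CGA's real timing: the `SharedBus` class, the 8-cycle window, the choice of which slots belong to the video circuit and the 4-cycle access cost are all assumptions made up for illustration.

```python
class SharedBus:
    """Toy bus: the video circuit owns video RAM on some cycles of a repeating window.

    The window length and the video slots are invented values,
    not measurements of real CGA hardware.
    """

    def __init__(self, window=8, video_slots=(0, 1, 2, 3)):
        if all(slot in video_slots for slot in range(window)):
            raise ValueError("the CPU would never get access to video RAM")
        self.window = window
        self.video_slots = frozenset(video_slots)

    def wait_states(self, cycle):
        """Cycles the CPU has to wait if it requests video RAM at `cycle`."""
        waits = 0
        while (cycle + waits) % self.window in self.video_slots:
            waits += 1
        return waits


def run_accesses(bus, start_cycle, accesses, access_cost=4):
    """Cycle at which a series of back-to-back video RAM accesses completes."""
    cycle = start_cycle
    for _ in range(accesses):
        cycle += bus.wait_states(cycle)  # the bus stalls the CPU first
        cycle += access_cost             # then the access itself runs
    return cycle


if __name__ == "__main__":
    bus = SharedBus()
    # A CPU emulated 'in its own vacuum' would need 2 * 4 = 8 cycles here.
    print(run_accesses(bus, start_cycle=0, accesses=2))
    # The same single access costs 8 or 4 cycles depending on when it starts.
    print(run_accesses(bus, 0, 1) - 0, run_accesses(bus, 4, 1) - 4)
```

The point of the sketch is only that the cost of an instruction is no longer a property of the CPU alone: it depends on what another component is doing on the bus at that moment.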
It is also relevant that although the different components may run on the same clock generator, they tend to have their own clock dividers internally, and this means that the relative phase of components to each other should also be taken into account. That is, there is a base clock of 14.31818 MHz on the motherboard. The CPU clock of 4.77 MHz is derived from that by dividing it by 3. Various other components run at other speeds, derived from that same base clock, such as 1.19 MHz for the PIT and 3.58 MHz for the NTSC colorburst and related timings.
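The arithmetic can be sketched in a few lines. The divider ratios (3, 4 and 12) follow from the frequencies quoted above. The `phase` parameter is a hypothetical simplification: it treats every divider as an independent counter on the base clock with its own starting offset, which is not a description of the actual circuit on the motherboard.

```python
from fractions import Fraction

# Base clock from the text: 14.31818 MHz.
BASE_HZ = Fraction(14_318_180)

# Divider ratios that yield the speeds quoted in the text.
DIVIDERS = {"cpu": 3, "colorburst": 4, "pit": 12}


def derived_hz(name):
    """Frequency of a derived clock, as an exact fraction of the base clock."""
    return BASE_HZ / DIVIDERS[name]


def tick_cycles(name, phase, base_cycles):
    """Base-clock cycle numbers on which the named clock ticks.

    `phase` is a hypothetical power-on offset: the first base cycle
    on which this divider happens to fire.
    """
    divider = DIVIDERS[name]
    return [c for c in range(base_cycles) if c % divider == phase % divider]


def cpu_delay_after_pit_tick(cpu_phase, pit_phase):
    """Base-clock cycles from a PIT tick to the next CPU tick."""
    return (cpu_phase - pit_phase) % DIVIDERS["cpu"]


if __name__ == "__main__":
    for name in DIVIDERS:
        print(f"{name}: {float(derived_hz(name)) / 1e6:.2f} MHz")
    # Same dividers, two different power-on phases of the PIT divider.
    print(cpu_delay_after_pit_tick(cpu_phase=0, pit_phase=0))
    print(cpu_delay_after_pit_tick(cpu_phase=0, pit_phase=1))
```

Even in this simplified model, the same code sees a PIT tick at a different distance from its own clock edge, depending purely on how the dividers happened to start.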
We have found during development of 8088 MPH that the IBM PC is not designed to always start in the exact same state. In other words, the dividers do not necessarily all start at the same cycle on the base clock, which means that they can be out of phase in various ways. The relative phase of mainly CPU, PIT and CGA circuit may change between different power-cycles. In 8088 MPH this leads to the externally observable effect that snow may not always be hidden in the border during the plasma effect. You can see this during the party capture:
The effect was designed to hide the snow in the border. And during development it did exactly that. However, when we made this capture at the party, the machine was apparently powered on in one of the states in which the waitstates were shifted to the right somewhat. There are two ‘columns of snow’ hidden in the border normally. But because of this phase shift, the second column of snow was now clearly visible on the left of the screen.
We did not change the code for the final version. But since we were now aware of the problem, we just power-cycled the machine until it was in one of the ‘good’ phase states before we did the capture (it is possible to detect in software which state the system is in; as far as we know, it is not possible to modify this state through software, so only a power-cycle can change it):
So in general I think this really is a thing that emulators MUST do: components should really interact with each other, and the state of the bus really is an externally observable effect. As is the clock phase.
For most other emulators this is apparent, because software on a C64, Amiga, Atari ST or various other platforms tends to have very strict requirements for timing anyway. More often than not, software will not work as intended if emulation is off by even a single cycle. For PCs it is not that crucial, but I think that at least for the PC/XT platforms, this exact timing should be an option. Not just for 8088 MPH, but for all the cool games and demos people could write in the future, if they have an emulator that enables them to experiment with developing this type of code.
Related to that is the emulation of the video interface. Many PC emulators opt for just emulating the screen one frame at a time, or per-scanline at best. While this generally ‘works’, because most software tries to avoid changing the video state mid-frame or mid-scanline, it is not how the hardware works. If you write software that changes the palette in the middle of a scanline, then that is exactly what you will see on real hardware.
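As a rough sketch of the difference, assume a scanline is a list of colour indices and a palette write is a hypothetical `(x, index, colour)` tuple recording the pixel position at which the CPU changed a palette entry. Neither function models real CGA registers; they only contrast the two rendering granularities.

```python
def render_per_scanline(indices, palette, writes):
    """Latch the palette once at the start of the line.

    Writes made during the line are ignored here; in this toy model
    they would only take effect from the next line on.
    """
    return [palette[i] for i in indices]


def render_per_pixel(indices, palette, writes):
    """Apply each (x, index, colour) write at the pixel where it happens."""
    current = list(palette)      # work on a copy of the palette
    pending = sorted(writes)     # process writes in order of pixel position
    next_write = 0
    line = []
    for x, colour_index in enumerate(indices):
        while next_write < len(pending) and pending[next_write][0] <= x:
            _, index, colour = pending[next_write]
            current[index] = colour
            next_write += 1
        line.append(current[colour_index])
    return line


if __name__ == "__main__":
    indices = [0] * 8                 # a line drawn entirely in colour 0
    palette = ["black", "blue"]
    writes = [(4, 0, "red")]          # colour 0 becomes red at pixel 4
    print(render_per_scanline(indices, palette, writes))
    print(render_per_pixel(indices, palette, writes))
```

The per-scanline renderer shows a solid black line, while the per-pixel renderer shows the line turning red halfway, which is what the text describes for real hardware.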
Because at the end of the day, let’s face it: that is how these machines work. You should emulate how the machine works. And this means it is more than the sum of its parts. Emulating only the individual components, while ignoring any interaction, is an insufficient approximation of the real machine.