Picking up where I left off in part 5, the subpixel-corrected polygons on Amiga:
The subpixel-correction in itself appeared to work. My analysis of the input terms for the blitter’s linedrawing appeared to be correct: you can specify the error-term in bltapt. Let us look at this formula again:
bltapt = (APTR) (2*Sdelta-Ldelta);
The Ldelta-term here represents Denominator/2, and can be replaced by a subpixel prestep, something like this:
initialNominator = fraction*2*Ldelta;
bltapt = (APTR) (2*Sdelta-initialNominator);
In other words:
initialNominator = fraction*Denominator;
bltapt = (APTR) (Nominator-initialNominator);
Where fraction would be the distance from the start of the line to the hotspot inside that pixel (so a value between 0 and 1). For completeness I’d like to point out that you have to calculate the Sdelta and Ldelta values from the coordinates at the higher resolution, rather than screen resolution, and you will need to also pre-step the x and y coordinates for your starting point before passing them on to the blitter. Important to note: the blitter can render lines in any direction (I already mentioned that in part 3, when I said I wanted to sort my lines to always render top-down, so that left and right edges would fit properly). This means that you need to perform the subpixel correction in the proper direction as well.
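As a minimal sketch of the prestep formula above, assuming the fraction is stored as a 4-bit value in sixteenths of a pixel: the helper name `initial_error` and its parameters are hypothetical and only illustrate the arithmetic. It keeps the factor 2 from the SPG-style formula. The hardware register width is not modelled here.

```c
#include <stdint.h>

/* Hypothetical helper: initial error term for a blitter line with a
   subpixel prestep.
   sdelta, ldelta: short and long deltas, measured at subpixel resolution.
   frac16:         prestep inside the first pixel, in 1/16ths (0..15). */
static int32_t initial_error(int32_t sdelta, int32_t ldelta, int32_t frac16)
{
    int32_t nominator   = 2 * sdelta;
    int32_t denominator = 2 * ldelta;

    /* initialNominator = fraction * Denominator, with fraction = frac16/16 */
    int32_t initial_nominator = (frac16 * denominator) / 16;

    return nominator - initial_nominator;
}
```

With frac16 = 8 (half a pixel) this reduces to the original 2*Sdelta-Ldelta term, which is a quick sanity check that the prestep really replaces the Denominator/2 part.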
As an aside: I already mentioned that the Amiga Hardware Reference Manual and the System Programmers Guide had slightly different formulas. For some reason the HRM scaled everything up by a factor of 2, compared to the SPG. I can see why the SPG uses 2*Sdelta and 2*Ldelta terms: this means you can just use Ldelta as your Denominator/2 value, without losing any precision. I am not sure why the HRM uses 4*dx and 4*dy terms. However, it clearly shows that applying a scale factor to all terms does not affect linedrawing.
Since I use 4-bit subpixel information, my terms are scaled up by 16 already. Therefore I chose not to use the scaled-up versions from SPG or HRM, but just use the dx and dy values as-is. I do not use a Denominator/2 term, since I replace it with the subpixel prestep, and my deltas already have higher precision, so for me there is no real advantage there.
But it is still broken…
So far, nothing new. I have just given a slightly more in-depth explanation of how the idea of subpixel-correction on blitter lines works. For line-drawing, this works well enough. However, as we’ve seen above, polygons are still buggy. So what is wrong here?
Well, this again has to do with the blitter rendering lines in different directions. As long as the blitter renders them top-down (the cases where abs(dx) < abs(dy)), things work as expected for polygon edges. However, for the cases where it renders them left-to-right or right-to-left (abs(dy) < abs(dx)), there is that problem again of making the edges meet properly. The line may not end exactly on the scanline you intended, and as a result, there is either a pixel too few or a pixel too many on the scanline, causing the filler to overshoot and fill the entire scanline.
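To illustrate why a single stray or missing pixel is so destructive, here is a simplified CPU model of an inclusive area fill on one scanline. It is only a sketch: the function name `fill_row`, the one-byte-per-pixel layout and the right-to-left scan direction are assumptions of this example, not a description of the actual hardware registers.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified model of an inclusive area fill on one scanline.
   row[] holds one pixel per element (0 or 1). Every set pixel toggles
   the fill state; pixels are written while the state is on.
   Returns the number of pixels that end up set. */
static size_t fill_row(uint8_t *row, size_t width)
{
    int filling = 0;
    size_t count = 0;

    for (size_t i = width; i-- > 0; ) {   /* scan right to left */
        int edge = row[i];
        row[i] = (uint8_t)(filling | edge);
        if (edge)
            filling = !filling;           /* edge pixel toggles the fill */
        count += row[i];
    }
    return count;
}
```

With two edge pixels on the row, only the span between them is filled. With one edge pixel, the fill state never switches off again and the fill runs on to the end of the row, which is exactly the overshoot described above.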
At first I tried to mess about with the line setup to try and fix this problem, but so far I have not come up with a working variation. So then I decided to make a hybrid routine instead: for the cases where abs(dy) < abs(dx), I use a CPU routine, which still draws it top-down, and puts pixels in all the right places. Luckily, these are generally the lines that require the fewest pixels (since we always draw top-down, abs(dy) is the number of pixels we draw), so the blitter still takes care of most of the workload. The CPU routine can also work in parallel with the blitter, so it is not even all that bad.
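A CPU routine of this kind could look roughly like the sketch below. It is not the actual routine, just one way to get exactly one pixel per scanline for an edge with 4 subpixel bits. The names `walk_edge` and `floor_div`, the choice of sampling at integer scanline positions, and the assumption of non-negative coordinates are all specific to this example.

```c
#include <stdint.h>

/* Floor division for a positive divisor (C's '/' truncates toward zero). */
static int32_t floor_div(int32_t a, int32_t b)
{
    int32_t q = a / b;
    return (a % b != 0 && a < 0) ? q - 1 : q;
}

/* Walk an edge top-down and emit exactly one x per scanline.
   Coordinates carry 4 subpixel bits (units of 1/16 pixel), y0 < y1,
   and all coordinates are assumed to be non-negative.
   Returns the number of scanlines written to xs[] (at most max). */
static int walk_edge(int32_t x0, int32_t y0, int32_t x1, int32_t y1,
                     int32_t *xs, int max)
{
    int32_t dx = x1 - x0;
    int32_t dy = y1 - y0;
    int32_t y_first = floor_div(y0 + 15, 16); /* first scanline at or below y0 */
    int32_t y_end   = floor_div(y1 + 15, 16); /* first scanline not covered */
    int n = 0;

    for (int32_t y = y_first; y < y_end && n < max; y++) {
        int32_t prestep = y * 16 - y0;                 /* subpixel distance to scanline */
        int32_t x = x0 + floor_div(prestep * dx, dy);  /* x at subpixel resolution */
        xs[n++] = floor_div(x, 16);                    /* pixel column */
    }
    return n;
}
```

Because the loop runs over scanlines rather than over the long axis, the number of pixels is abs(dy) in screen pixels, no matter how shallow the edge is.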
The only problem-case here is when the polygon is less than 2 pixels wide. If both edges draw at the same pixel, the filler will not work properly, as we’ve seen in part 3. Back then I proposed the solution of using XOR-mode when drawing pixels. This way, when two pixels are drawn on top of each other, the second pixel will turn the first pixel back off, so the filler will not do anything there.
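On the CPU side, the XOR idea comes down to something like the following sketch for a single bitplane. The helper name `xor_pixel` and its parameters are made up for illustration; the layout assumed is the usual one where the leftmost pixel of each byte is the most significant bit.

```c
#include <stddef.h>
#include <stdint.h>

/* Toggle one pixel in a 1-bit-per-pixel bitplane.
   The leftmost pixel of each byte is the most significant bit. */
static void xor_pixel(uint8_t *plane, size_t bytes_per_row,
                      unsigned x, unsigned y)
{
    plane[(size_t)y * bytes_per_row + (x >> 3)] ^= (uint8_t)(0x80u >> (x & 7u));
}
```

Plotting the same coordinate twice leaves the plane unchanged, so a scanline where both edges land on one pixel ends up empty and the filler has nothing to toggle on.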
This solution works perfectly for our hybrid subpixel-correct renderer, since we now render exactly 2 pixels on every scanline. So we use the blitter to draw in XOR-mode, and we also use a XOR operation to draw the pixels with the CPU. We do not need any other tricks, like throwing away the first pixel of a line. And there we have it then: a blitter-accelerated subpixel-correct polygon drawer on the custom hardware of a 1985 home computer:
I am getting 4-bit subpixel precision here, which is as good as early 3D accelerators from the mid-90s on PC. Quite bizarre actually. Is this just an undocumented feature? I don’t recall ever having seen subpixel-correct lines or polygons on a regular Amiga. But as usual, Amiga makes it possible!
On to some other old junk
Before I end this post, I would like to share some other small things that I have made in the meantime. Namely, on PC I had made CGA, EGA and VGA-optimized polygon fillers. But there are more early graphics standards. One of them is Hercules, which is actually the first graphics standard I ever used on PC. My first PC came with a Plantronics onboard adapter, which was compatible with both Hercules and CGA, and also had a special 16-colour mode. At first I only had a monochrome monitor, so Hercules was all I could use. It wasn’t even that bad, really. Sure, it was monochrome, but the resolution was 720×348 pixels, which was incredibly high at the time. CGA could only do 640×200, EGA did 640×350, and VGA did 640×480.
Anyway, I decided to give it a go. I tried to look at the ah=0 int10h setvideomode function to see which mode it would be… Shock! Horror! There *is* no mode for Hercules. Apparently Hercules does not have any BIOS API, so the only way to set a videomode is to manually reprogram all registers. Luckily I found the right register settings on the internet somewhere. And before long I could switch to graphics mode and back.
Then I had to figure out how to address each pixel in memory. Hercules is quite quirky that way. The scanlines are stored in a 4-way interleaved arrangement. Each scanline is just as you expect: 720 pixels packed into bytes, giving a total of 720/8 = 90 bytes. But the addressing of the scanlines is like this:
Y MOD 4 == 0 at B000:0000 + (Y/4)*90
Y MOD 4 == 1 at B000:2000 + (Y/4)*90
Y MOD 4 == 2 at B000:4000 + (Y/4)*90
Y MOD 4 == 3 at B000:6000 + (Y/4)*90
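The table above translates into a small address calculation. This is a minimal sketch: the names `herc_offset` and `herc_mask` are hypothetical, the offset is relative to segment B000, and the bit order (leftmost pixel in the most significant bit) is stated here as an assumption of the example.

```c
#include <stdint.h>

#define HERC_BYTES_PER_ROW 90u   /* 720 pixels / 8 pixels per byte */

/* Byte offset of pixel (x, y) within the first page at segment B000. */
static uint16_t herc_offset(unsigned x, unsigned y)
{
    return (uint16_t)((y & 3u) * 0x2000u            /* bank: Y MOD 4 */
                    + (y >> 2) * HERC_BYTES_PER_ROW /* row within the bank: Y/4 */
                    + (x >> 3));                    /* byte within the row */
}

/* Bit mask of pixel x inside its byte. */
static uint8_t herc_mask(unsigned x)
{
    return (uint8_t)(0x80u >> (x & 7u));
}
```

For example, the last pixel on screen, (719, 347), lands at offset 0x6000 + 86*90 + 89 = 32405, which still fits inside a single 32 KB page.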
So, now that the addressing is worked out, it’s time for the final details. Hercules uses a Motorola 6845 CRT controller, just like CGA (and EGA/VGA are near-100% compatible extensions of the 6845). The main difference is that monochrome adapters have their I/O ports based at 3B0h rather than at 3D0h for colour adapters (so that both can co-exist in the same system). Hercules comes with 64kb of memory, which means it supports 2 pages of memory. A single screen takes 720*348/8 = 31320 bytes of pixel data, and with the interleaved layout each page occupies 32kb of memory. The second page is at segment B800. This is the same segment as is normally used by CGA. Which means that you can use the second page, but only if you do not have a CGA-compatible card in your system as well (the first page is always available, so dual monitor setups are possible, as long as one is MDA/Hercules and the other is CGA/EGA/VGA-compatible).
Assuming that the Hercules card is the only one in the system, we can use double-buffering in video memory, just like on EGA and unchained VGA modes. Porting my polygon routine was quite straightforward from here on in. There was a slight problem however: the routine only supported flatshading, and Hercules has only two shades: black and white (or amber, green, or whichever other colour your monochrome display may use). So I decided to implement a simple dithering scheme, so that you could discern the individual faces:
Yes, it’s rather flickery, because the vsync does not appear to work correctly. I’m not sure if that’s DOSBox’s fault, or if the vsync bit on the 6845’s status register does not work on real Hercules hardware. But it will have to do.
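For reference, a dithering scheme for flatshaded faces on a monochrome display can be as simple as an ordered dither. The sketch below is one possible approach, not necessarily the scheme used here: it uses a standard 4x4 Bayer matrix, and the function name `dither_byte` and the 0..16 shade range are assumptions of this example.

```c
#include <stdint.h>

/* Standard 4x4 Bayer threshold matrix. */
static const uint8_t bayer4[4][4] = {
    {  0,  8,  2, 10 },
    { 12,  4, 14,  6 },
    {  3, 11,  1,  9 },
    { 15,  7, 13,  5 }
};

/* Returns the pattern for 8 horizontally adjacent pixels on scanline y.
   shade runs from 0 (all pixels off) to 16 (all pixels on).
   The leftmost pixel is the most significant bit. */
static uint8_t dither_byte(unsigned shade, unsigned y)
{
    unsigned bits = 0;

    for (unsigned x = 0; x < 8; x++) {
        bits <<= 1;
        if (bayer4[y & 3u][x & 3u] < shade)
            bits |= 1u;
    }
    return (uint8_t)bits;
}
```

Since the pattern repeats every 4 pixels, a byte-aligned span filler can write the same byte across a whole scanline and only needs to look up a new pattern when y changes. A shade of 8 gives the familiar checkerboard (0xAA on even scanlines, 0x55 on odd ones).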
PCjr and Tandy
Although I have now REALLY covered every single graphics card I ever owned, there was still one graphics standard that was reasonably popular in the early days: the enhanced 16-colour mode of IBM’s PCjr, and the clones made by Tandy. Okay, I have no support for the Plantronics mode on my first PC, but I no longer have that PC, and I don’t think DOSBox is compatible with it… It seems easy enough to add support for it though: It is like CGA, but with two extra even/odd bitplanes at segment BC00h. It combines the 2-bit pixels from B800 and BC00 into a 4-bit pixel.
Right, now onto PCjr/Tandy, because that mode IS supported by DOSBox. This is yet another 16-colour mode, and it does not work like EGA, nor like Plantronics. Instead, it uses a packed-pixel format like CGA, so now there are two 4-bit pixels packed into each byte. And where CGA has even/odd planes at segments B800 and BA00, PCjr/Tandy has 4 scanline-interleaved planes, much like Hercules, at segments B800, BA00, BC00 and BE00.
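The addressing works out much like the Hercules case. The sketch below assumes the 320x200 16-colour mode (so 160 bytes per scanline) and assumes that the left pixel of each pair sits in the high nibble. The names `tandy_offset` and `tandy_put` are hypothetical, and the offset is relative to segment B800.

```c
#include <stdint.h>

#define TANDY_BYTES_PER_ROW 160u   /* assumed: 320 pixels, 2 pixels per byte */

/* Byte offset of pixel (x, y) relative to segment B800. */
static uint16_t tandy_offset(unsigned x, unsigned y)
{
    return (uint16_t)((y & 3u) * 0x2000u             /* one of 4 interleaved banks */
                    + (y >> 2) * TANDY_BYTES_PER_ROW /* row within the bank */
                    + (x >> 1));                     /* two pixels per byte */
}

/* Merge a 4-bit colour into an existing byte.
   Assumed layout: even x in the high nibble, odd x in the low nibble. */
static uint8_t tandy_put(uint8_t old, unsigned x, uint8_t colour)
{
    if (x & 1u)
        return (uint8_t)((old & 0xF0u) | (colour & 0x0Fu));
    return (uint8_t)((old & 0x0Fu) | ((colour & 0x0Fu) << 4));
}
```

The read-modify-write in `tandy_put` is only needed at the left and right ends of a span. Everything in between can be written as whole bytes with the colour duplicated in both nibbles.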
So PCjr/Tandy does not lend itself very well to fast polygon filling. With just 2 pixels per byte, and no special trickery to fill multiple planes at a time, it is not going to be all that efficient. But I’ve implemented it anyway, just to complete the whole set of graphics adapters:
And well, that’s it for now. I am not sure what I am going to do next. As I already mentioned in part 1, I may explore the graphics capabilities (or lack thereof) of the Commodore 64, or I may evolve these simple polygon routines into a more complete engine, allowing some simple objects to be animated on screen.