Maximizing 7800 Sprites
7800 Sprite Capability Overview
The DLL/DL structure of the 7800 is seemingly fast and flexible: put a 4 or 5 byte structure on a display list, and let MARIA handle all of the rendering work. What could be easier, right? Indeed, MARIA is capable of drawing a lot more sprites than other contemporary consoles; one look at the 7800 Robotron port will make that apparent. Even better, MARIA has minimal cycle penalties for drawing very wide sprites, so it can display huge sprites in a way that isn't possible on other contemporary consoles.
And yet it's a bit more tricky in practice. A sprite usually takes up entries on 2 DLLs, since it will usually cross between DLLs rather than fit inside one. And it may just be 4 or 5 bytes, but those bytes often require calculations, and the DLLs can't be updated during the visible screen without causing glitches; instead the DLLs need to be updated during the shorter non-visible portion of the video frame. Add all of these up, and you'll soon realize that the 6502 is the bottleneck in the sprite display process. Benchmarks with general sprite routines top out between 20 and 30 sprites per frame, so what's a developer to do when they want to display more?
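The "two entries per sprite" point comes down to simple arithmetic. Here is a rough host-side sketch in C (not 7800 code) of that calculation. The 16-line zone height, the `SpritePlacement` struct, and `place_sprite` are all assumptions made up for illustration, and the sprite is assumed to be exactly one zone tall.

```c
#include <stdint.h>

#define ZONE_HEIGHT 16  /* assumed zone height in scanlines */

/* Hypothetical helper type: where a one-zone-tall sprite lands. */
typedef struct {
    uint8_t first_zone;   /* zone holding the sprite's top line */
    uint8_t line_in_zone; /* how far down that zone the sprite starts */
    uint8_t entries;      /* display list entries needed: 1 or 2 */
} SpritePlacement;

/* y is the sprite's top scanline, counted from the top of the display. */
SpritePlacement place_sprite(uint8_t y)
{
    SpritePlacement p;
    p.first_zone   = (uint8_t)(y / ZONE_HEIGHT);
    p.line_in_zone = (uint8_t)(y % ZONE_HEIGHT);
    /* Only a sprite that starts exactly on a zone boundary fits in one
       zone; any other position spills into the zone below. */
    p.entries = (p.line_in_zone == 0) ? 1 : 2;
    return p;
}
```

Under these assumptions only 1 in 16 vertical positions gets away with a single entry, which is why a general sprite routine ends up budgeting for two.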
Visible Display vs VBLANK
If your program is waiting for MSTAT to indicate the visible screen is over, you're wasting a whole lot of available CPU time. There are likely a lot of zones between the end of your visible display and VBLANK. The best approach is to set an NMI to flag when your active display is over, at which point you can start updating DLLs.
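As a rough illustration of the flag idea, here is a host-side C sketch (not 7800 code). The names `visible_done`, `nmi_handler`, and `display_update_due` are hypothetical, and how the interrupt actually gets triggered at the end of the active display is left out.

```c
#include <stdint.h>

/* Set by the interrupt handler, cleared by the main loop.
   volatile because it changes outside the normal program flow. */
static volatile uint8_t visible_done = 0;

/* Hypothetical handler, assumed to fire once the active display ends. */
void nmi_handler(void)
{
    visible_done = 1;
}

/* Main loop side: returns 1 exactly once per interrupt, 0 otherwise. */
int display_update_due(void)
{
    if (!visible_done) {
        return 0;
    }
    visible_done = 0;
    return 1;
}
```

The main loop can keep running game logic and check `display_update_due()` when it is ready, instead of spinning on a status register.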
Split Logic
The simplest thing a 7800 developer can do to improve performance in their game is to split out the game logic from the display logic. Enemy AI, player controls, display position calculations and everything else aside from adding bytes to the DLLs should be happening during the visible screen, where the most free time is available. When your program has finished all of the game logic, it should be ready to write out that DLL data as fast as possible once the visible screen is over.
Ensure you do this by splitting your main loop into 2 sections, the first labeled ";GAME LOGIC" and the other ";DISPLAY LOGIC". If anything in the DISPLAY LOGIC section can be precalculated in the GAME LOGIC section, even if it takes more time, do so.
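One way to picture the split is a staging buffer: the game logic phase does all the per-sprite work and leaves finished bytes behind, and the display logic phase does nothing but copy. The C sketch below is a host-side model, not 7800 code. It assumes 4-byte entries laid out as low address, palette/width, high address, horizontal position, and it ignores zones entirely; `Sprite`, `staged`, `game_logic`, and `display_logic` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SPRITES 32
#define ENTRY_SIZE  4   /* assumes 4-byte entries only */

/* Hypothetical game-side sprite record. */
typedef struct {
    uint16_t gfx;           /* address of the graphics data */
    uint8_t  x;             /* horizontal position */
    uint8_t  palette_width; /* palette/width byte, precomputed */
} Sprite;

static uint8_t staged[MAX_SPRITES * ENTRY_SIZE];
static size_t  staged_len = 0;

/* ;GAME LOGIC: runs during the visible screen, builds entries in RAM. */
void game_logic(const Sprite *sprites, size_t count)
{
    if (count > MAX_SPRITES) {
        count = MAX_SPRITES;
    }
    staged_len = 0;
    for (size_t i = 0; i < count; i++) {
        uint8_t *e = &staged[staged_len];
        e[0] = (uint8_t)(sprites[i].gfx & 0xFF); /* low address byte */
        e[1] = sprites[i].palette_width;
        e[2] = (uint8_t)(sprites[i].gfx >> 8);   /* high address byte */
        e[3] = sprites[i].x;
        staged_len += ENTRY_SIZE;
    }
}

/* ;DISPLAY LOGIC: runs after the visible screen, copies and nothing else. */
size_t display_logic(uint8_t *dl)
{
    memcpy(dl, staged, staged_len);
    return staged_len;
}
```

The staging pass costs extra time and RAM overall, but that cost lands in the visible portion of the frame where time is plentiful.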
Characters, not Sprites
Where possible, use an indirect DLL entry pointing to memory, and add it to your DLL before the main loop. This is "baking in" the object: adding an object to the beginning of your DLL and never updating its DLL entry during the main loop. Instead, you just change the bytes the DLL entry points at, which is likely to be a big win. This applies to score information, backgrounds, and anything else that doesn't need the flexible positioning a sprite requires.
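A score display is the obvious example of baking in. The C sketch below is a host-side model, not 7800 code: it only shows the part that runs every frame, which is rewriting the character bytes. The entry that points at `score_chars` is assumed to have been written once before the main loop. `SCORE_DIGITS`, `TILE_ZERO`, and the idea that digit tiles are numbered consecutively are assumptions for illustration.

```c
#include <stdint.h>

#define SCORE_DIGITS 6
#define TILE_ZERO    0x10  /* hypothetical tile index of the '0' glyph */

/* Character map in RAM that the baked-in indirect entry points at. */
static uint8_t score_chars[SCORE_DIGITS];

/* Per-frame work: rewrite the map bytes, leave the entry itself alone. */
void write_score(uint32_t score)
{
    for (int i = SCORE_DIGITS - 1; i >= 0; i--) {
        score_chars[i] = (uint8_t)(TILE_ZERO + (score % 10));
        score /= 10;
    }
}
```

Nothing here needs to wait for the non-visible part of the frame in the way a DLL rebuild does, which is where the win comes from.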
Just keep in mind that this advice needs to be tempered a bit. MARIA takes more DMA time to display characters than sprites. Any scheme that involves multiple layers of characters will almost certainly fail, since MARIA doesn't have enough DMA time for multiple layers of full-screen characters.
Sprites, not Characters
Even better than using "baked-in" characters is the use of "baked-in" sprites. Use these instead of characters for background details, as MARIA can chew through these faster, leaving more CPU time for your DLL preparations during the visible screen.
Gracefully Losing the Battle
Despite your best efforts, you may run into an occasional dropped frame. MARIA needs to keep marching to the beat of the display clock, and won't wait if your GAME LOGIC code runs long. While you should do your best to avoid this, it's also good to ensure that prematurely displayed DLLs won't cause screen glitching if your program hasn't completed its DISPLAY LOGIC section when MARIA starts looking at your DLL structures.
Avoiding glitches generally means that your DLL structures are terminated with zero entries. One approach to doing so is adding a terminating zero for the next DLL structure before adding to the current one. IMO this is counter-productive, and wastes a lot of cycles on each DLL update.
Instead, you likely have a routine that ensures all DLL structures are terminated prior to display. If you modify the routine to ensure that all DLLs are doubly terminated, most glitches caused by partial updates will be prevented or be minimal.
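Here is one plausible reading of double termination, as a host-side C sketch rather than 7800 code. It assumes a list ends at the first entry whose second byte is zero, that all entries are 4 bytes, and that the buffer has room for two spare slots; `terminate_twice` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRY_SIZE 4  /* assumes 4-byte entries only */

/* used = number of bytes of real entries already in this list.
   The buffer must have at least 2 * ENTRY_SIZE bytes free after them. */
void terminate_twice(uint8_t *dl, size_t used)
{
    dl[used + 1] = 0;              /* the normal terminator */
    dl[used + ENTRY_SIZE + 1] = 0; /* spare terminator, one slot further on */
}
```

If a later update overwrites the first terminator with a new entry but gets cut off before terminating the list again, the spare terminator still stops the list one slot later, so at worst one stale entry is drawn rather than a run of garbage.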