[Scummvm-devel] Framebuffer

Mon Oct 23 13:44:35 CEST 2006

Hi folks,

I was wondering about the following recently: How about migrating  
OSystem's graphic API from the current "blit" based  
(copyRectToScreen) API to a framebuffer approach? This is still a  
rather fuzzy idea, I am afraid, and might lead to nothing... Maybe  
it's beneficial, maybe not, judge yourself. But first let me explain:

OSYSTEM FRAMEBUFFER API
The idea is to ditch copyRectToScreen, grabRawScreen, clearScreen,  
and *maybe* updateScreen. Instead, add the following (or a variation  
thereof; I am open to better suggestions, this is merely an example  
to help explain my thoughts):

   const Surface *OSystem::getScreenSurface();
   bool OSystem::lockScreen();   // comparable to SDL_LockSurface
   void OSystem::unlockScreen();          // think SDL_UnlockSurface

A variant would be merging lockScreen and getScreenSurface, i.e. have
   const Surface *OSystem::lockScreen();

Games then could simply lock the screen surface, modify its display  
contents (blacking the screen, making a copy, applying effects,  
scrolling, etc.), then unlock it.
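
To make the lock/modify/unlock flow a bit more concrete, here is a minimal sketch of the merged lockScreen variant. Everything in it is an assumption for illustration only: the Surface fields (w, h, pitch, pixels), the ToyBackend class, and the clearScreenViaLock helper are hypothetical and not existing OSystem code. Marking the screen dirty on unlock is just one of the syncing options discussed in the next paragraph.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the Surface type; the field names are assumptions.
struct Surface {
    int w, h, pitch;   // pitch = bytes per row
    uint8_t *pixels;   // 8-bit paletted pixel data
};

// Toy backend implementing the merged "lockScreen returns the surface" variant.
class ToyBackend {
public:
    ToyBackend(int w, int h) : _buf(w * h, 0), _locked(false), _dirty(false) {
        _surface.w = w;
        _surface.h = h;
        _surface.pitch = w;
        _surface.pixels = _buf.data();
    }

    // Returns the screen surface, or nullptr if the screen is already locked.
    const Surface *lockScreen() {
        if (_locked)
            return nullptr;
        _locked = true;
        return &_surface;
    }

    // Releases the lock and marks the whole screen as dirty; a real backend
    // could instead sync to the display right here.
    void unlockScreen() {
        assert(_locked);
        _locked = false;
        _dirty = true;
    }

    bool isDirty() const { return _dirty; }

private:
    std::vector<uint8_t> _buf;
    Surface _surface;
    bool _locked, _dirty;
};

// Engine-side replacement for clearScreen: lock, black out every row, unlock.
bool clearScreenViaLock(ToyBackend &sys) {
    const Surface *s = sys.lockScreen();
    if (!s)
        return false;
    for (int y = 0; y < s->h; ++y)
        std::memset(s->pixels + y * s->pitch, 0, s->w);
    sys.unlockScreen();
    return true;
}
```

Note that a const Surface pointer still allows writing through pixels in this sketch, because const applies to the struct fields and not to the memory they point to.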

Syncing to the real screen could either happen instantaneously when
unlockScreen is called (in particular if we are dealing with a "real"
framebuffer). Alternatively, the backend could simply mark the screen
as dirty and update it at the next suitable time. Or we keep  
updateScreen. I am not quite sure yet which approach to use, but IMO  
this is at this time a minor issue.

One obvious issue here is that dirty rect handling becomes harder, if
not impossible, since the backend only knows that the framebuffer was
modified, but not which parts. This could be mitigated by an
additional API like
   Surface *getScreenSubsection(Rect r)
which returns a surface for only a subsection of the screen -- so in  
a way we are re-introducing dirty rects through the backdoor (though  
the backend is free to ignore this "hinting"; also note that this  
method would have a default implementation that simply calls  
getScreenSurface).
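
A rough sketch of how such a hinting method with a default implementation might look. The names BackendBase and HintingBackend, the Rect and Surface layouts, and the choice to return the sub-surface by value (as a view sharing the full screen's pixels) are all assumptions made for this example; the signature above returns a pointer, and ownership of that pointer is left open here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical types; right/bottom are exclusive, pixels are 8-bit.
struct Rect { int left, top, right, bottom; };
struct Surface { int w, h, pitch; uint8_t *pixels; };

class BackendBase {
public:
    virtual ~BackendBase() {}
    virtual Surface *getScreenSurface() = 0;

    // Default implementation: build a view onto the full screen surface.
    // No dirty information is kept, so the "hint" is effectively ignored.
    virtual Surface getScreenSubsection(const Rect &r) {
        Surface *full = getScreenSurface();
        assert(r.left >= 0 && r.top >= 0);
        assert(r.left <= r.right && r.top <= r.bottom);
        assert(r.right <= full->w && r.bottom <= full->h);
        Surface sub;
        sub.w = r.right - r.left;
        sub.h = r.bottom - r.top;
        sub.pitch = full->pitch;  // rows are still spaced like the full screen
        sub.pixels = full->pixels + r.top * full->pitch + r.left;
        return sub;
    }
};

// A backend that chooses to use the hint: it records each requested
// subsection as a dirty rect before handing out the view.
class HintingBackend : public BackendBase {
public:
    HintingBackend(int w, int h) : _buf(w * h, 0) {
        _screen = {w, h, w, _buf.data()};
    }

    Surface *getScreenSurface() override { return &_screen; }

    Surface getScreenSubsection(const Rect &r) override {
        _dirty.push_back(r);
        return BackendBase::getScreenSubsection(r);
    }

    const std::vector<Rect> &dirtyRects() const { return _dirty; }

private:
    std::vector<uint8_t> _buf;
    Surface _screen;
    std::vector<Rect> _dirty;
};
```

An engine that only touches a small area would ask for that area, and a backend without dirty rect support would simply inherit the default.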

However, many engines do full screen updates anyway, so no matter
what we change, they don't benefit from dirty rect code. Also,
if palette rotation is used (many SCUMM games do that), dirty rects  
are useless anyway. And while the SDL backend contains code to auto- 
compute dirty rects, this code is fundamentally flawed, and also  
doesn't work well with palette shifting.

However, at least on my system, the scaler is the bottleneck, not the  
area of the screen surface that is changed -- which by modern  
standards is usually tiny anyway (320x200, hah!). So, several  
emulators I know use the following approach (which, BTW, is also  
mentioned in patch #1574256). Namely, you keep a copy of the previous  
screen state. Then you modify your scaler to compare old and new  
state pixel-by-pixel, and only rescale those parts where the screen  
content actually changed. The drawback is an additional read-and- 
compare. However, the auto-dirty-rect code in the SDL backend already  
is doing that anyway, so nothing is lost compared to that.  
Apparently, this approach works well for some existing emulators out  
there. Not sure how well it really works, but then this idea is  
separate from the frame buffer. Oh, and before I forget it: You can  
hook this in *after* the 8bit-to-16bit conversion to be able to cope
with palette changes nicely.
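
The compare-and-rescale idea can be sketched roughly as follows. This is not the code from the patch or from any emulator; DiffScaler2x is a hypothetical class, and plain 2x pixel doubling stands in for a real scaler. It works on 16-bit pixels, i.e. after the 8bit-to-16bit conversion, so a palette change simply shows up as changed pixels.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Keeps a copy of the previous frame and rescales only the source pixels
// whose 16-bit value changed since the last call.
class DiffScaler2x {
public:
    DiffScaler2x(int w, int h)
        : _w(w), _h(h), _prev(w * h, 0), _out(4 * w * h, 0), _first(true) {}

    // Scales src (w*h pixels) into the internal 2w x 2h output buffer.
    // Returns the number of source pixels that actually had to be rescaled.
    int scale(const std::vector<uint16_t> &src) {
        assert(src.size() == _prev.size());
        int rescaled = 0;
        for (int y = 0; y < _h; ++y) {
            for (int x = 0; x < _w; ++x) {
                const std::size_t i = std::size_t(y) * _w + x;
                // The extra read-and-compare: skip unchanged pixels,
                // except on the very first frame.
                if (!_first && src[i] == _prev[i])
                    continue;
                const uint16_t c = src[i];
                const std::size_t o = std::size_t(2 * y) * (2 * _w) + 2 * x;
                _out[o] = c;
                _out[o + 1] = c;
                _out[o + 2 * _w] = c;
                _out[o + 2 * _w + 1] = c;
                _prev[i] = c;  // remember the new state for the next frame
                ++rescaled;
            }
        }
        _first = false;
        return rescaled;
    }

    const std::vector<uint16_t> &output() const { return _out; }

private:
    int _w, _h;
    std::vector<uint16_t> _prev;  // previous screen state
    std::vector<uint16_t> _out;   // scaled output
    bool _first;
};
```

Pixel doubling depends only on the pixel itself. A scaler that looks at neighbouring pixels would also have to rescale the neighbours of every changed pixel, which this sketch does not attempt.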

Still, all this is hypothetical. A framebuffer approach, with or w/o  
the modified scaling mechanism mentioned above, might lead to slowdowns; or
to speed improvements. I believe the only way to find out is to make  
a test implementation and benchmark it.

Let me try to roughly sum up the above:

PRO:
* Actually may conserve memory whenever we currently use  
grabRawScreen. Also some engines may be able to get rid of some  
screen buffers
* At least in the SDL backend, the graphics code actually gets a bit  
simpler with this code
* I believe some other backends already operate on a framebuffer-like
system; they might be able to simplify their code, too
* On some systems, this could lead to faster operations

CON:
* Makes dirty rect handling more complicated if not impossible
* On some systems, this might actually slow down operations

Now, what do you, our porters, say? I'd like to hear whether this move
would complicate or simplify your code -- or neither (besides the
fact that you'd have work to make the transition, of course). And
would it likely cause a speed penalty, an improvement, or again,
nothing?

Likewise, any comments from engine authors... How would this affect  
you? Talking about the SCUMM engine, I can say that it would be very  
easy to adapt it.

Cheers,
Max