[Scummvm-devel] Minimal screen updates and possible OSystem improvements

Tue Nov 10 18:58:10 CET 2009

Vladimir Menshakov wrote:
> Here is another real-world example, showing how the TeenAgent engine could be ported quickly to the new API.
>
> TeenAgent has several layers: 
> 1) background
> 2) animation slots 0-3
> 3) custom animation slots 0-3
> 4) actor
> 5) foreground slots (different numbers per scene)
> 6) foreground overlay. 1
>
> All animations and sprites use colorkey 0 for transparent pixels.
>
> So the render cycle in the unoptimized engine looks like this:
> 1) copyRectToScreen(background)
> 2) render all animations ( using lockScreen, as OSystem does not have colorkey blit). 
> 3) render messages (using lockScreen again)
> 4) updateScreen(), wait and goto 1. (We have 1-2 fullscreen memory copies per frame, depending on the backend.)
>
> How could we optimize it? Let's add a generic class:
>
> ScenePlanner {
>   void add(rect) {
>     for (i = dirty.begin(); i != dirty.end(); ++i) {
>       if (i->contains(rect))
>         return;
>       if (i->intersect(rect)) {
>         i->extend(rect); // simplification, just for concept illustration
>         return;
>       }
>     }
>     dirty.push_back(rect);
>   }
> };
>
> So the optimized scene would look like this:
> 1) for each rectangle from planner call copyRectToScreen().
> 2) render all animations (LockRect(animation_frame, Read | Write) for colorkey blit), adding rectangles to planner.
> 3) render messages, adding bounding boxes to planner
> 4) wait and goto 1. 
>
> The backend would collect the dirty rectangles and blit them with hardware or software to a backbuffer or directly to video memory - we could tune it for each platform.
>
> No backbuffer in the engine, no extra surfaces allocated in the engine, and the custom tricky logic removed from each engine. Less code, fewer bugs.
> PROFIT 
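
For reference, the planner quoted above can be written out as compilable
C++. This is only a minimal sketch under my own assumptions: the Rect
type here is a stand-in (not ScummVM's own rectangle class), and unlike
the quoted simplification it keeps merging until no stored rectangles
overlap the new one.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Stand-in rectangle (an assumption for this sketch), half-open:
// covers [left, right) x [top, bottom).
struct Rect {
    int left, top, right, bottom;

    bool contains(const Rect &r) const {
        return left <= r.left && top <= r.top && right >= r.right && bottom >= r.bottom;
    }
    bool intersects(const Rect &r) const {
        return left < r.right && r.left < right && top < r.bottom && r.top < bottom;
    }
    void extend(const Rect &r) {
        left = std::min(left, r.left);
        top = std::min(top, r.top);
        right = std::max(right, r.right);
        bottom = std::max(bottom, r.bottom);
    }
};

class ScenePlanner {
public:
    void add(Rect rect) {
        // Absorb every stored rectangle that overlaps the new one. The grown
        // rectangle may overlap further entries, so the scan restarts after
        // each merge.
        for (std::size_t i = 0; i < _dirty.size();) {
            if (_dirty[i].contains(rect))
                return; // already covered, nothing to add
            if (_dirty[i].intersects(rect)) {
                rect.extend(_dirty[i]);
                _dirty.erase(_dirty.begin() + i);
                i = 0;
            } else {
                ++i;
            }
        }
        _dirty.push_back(rect);
    }

    const std::vector<Rect> &dirty() const { return _dirty; }
    void clear() { _dirty.clear(); }

private:
    std::vector<Rect> _dirty;
};
```

Merging into bounding boxes trades some overdraw for a short list; a
real implementation might prefer to split rectangles instead.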

At least for KYRA we can't get rid of the various backbuffers, because
scripts sometimes do their own blitting on them, so there is no (easy)
way to emulate the same behavior with only one screen buffer. The only
possible gain for KYRA, if lockScreen/unlockScreen no longer forced a
full screen redraw, would be dropping its separate front buffer, which
is currently a memory buffer holding the same data as the backend
screen.

Anyway, so far I don't know of any backend that has speed problems with
KYRA graphics-wise. There were some problems on PSP, because PSP did a
forced redraw on OSystem::updateScreen, even though the documentation
specifies that the backend should ignore updateScreen calls when the
screen wasn't changed.

Also, I doubt your suggestion would solve the dirty rect problem in
general. Some engines might have no idea where the scripts etc. change
the screen, so they would still need to lock the whole screen and thus
force a whole update.

I'm not familiar with the Groovie code, but let me explain how I
understand its graphics code to work. (This will be an example of an
engine which currently doesn't do any dirty rect handling and for which
OSystem API extensions won't give any benefit.)

There are two different framebuffers, "foreground" and "background".
Both can be copied to the OSystem's screen buffer via
GraphicsMan::updateScreen, each time with a full redraw of the whole
screen. (The code for displaying the foreground buffer on the screen in
VDXPlayer seems to be commented out, though.)

Groovie seems to play almost exclusively video data. As far as I can
tell, the VDXPlayer at least seems to be based on graphics data which
contains various 4x4 image blocks. Frames of those files can be drawn to
either the foreground or the background.

There are also opcodes which modify those buffers: "o_copybgtofg" copies
*all* of the background to the foreground, and "o_copyrecttobg" copies a
part of the foreground to the background. And there's
GraphicsMan::mergeFgAndBg, which seems to copy data from the background
to the foreground depending on the pixel value of the foreground.
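
To illustrate the kind of merge I mean, here is a minimal sketch. It is
*not* Groovie's actual code: the function name, the 8-bit indexed pixel
format and the key value passed in are all assumptions of mine. It
copies a background pixel into the foreground wherever the foreground
holds the key value.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not Groovie code): wherever the foreground pixel
// equals 'key', replace it with the background pixel at the same position.
// Both buffers are assumed to be 8-bit indexed and of equal size.
void mergeWhereKey(std::vector<std::uint8_t> &fg,
                   const std::vector<std::uint8_t> &bg,
                   std::uint8_t key) {
    const std::size_t n = fg.size() < bg.size() ? fg.size() : bg.size();
    for (std::size_t i = 0; i < n; ++i) {
        if (fg[i] == key)
            fg[i] = bg[i];
    }
}
```

The point is that such a merge touches pixels scattered over the whole
buffer, so the engine cannot easily name a small changed area.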

As far as I can see, at least one of those buffers must be present in
the engine; the other *could* be kept exclusively in the OSystem's
screen buffer. Since both foreground and background can be modified
without any change to the real screen output, that sounds a bit risky
though: one buffer might be updated but not displayed at all (or only
in a later state).

Thus I would *guess* that Groovie is one of the engines which wouldn't
benefit from your API improvement proposal, since it would still need
some internal way of keeping track of which screen parts really changed.

Of course I'm happy to stand corrected on this (including my thoughts
about how Groovie's display part works ;-).

Actually I *like* the idea of making it possible to lock only parts of
the front buffer (or even to flag the access as read-only). That *might*
lower the memory footprint (and increase performance) under certain
circumstances, for example when the engine currently keeps a separate
front buffer which only duplicates the OSystem's screen buffer for easy
access. When this is the case (as in KYRA, for example), a real screen
update usually works like this:

1) engine draws things on its front buffer (meaning it copies data from
various buffers, maybe its back buffer, to the front buffer)
2) copy the changes in the engine's front buffer to the OSystem's front
buffer (this might be a full screen update in engines which do not
support dirty rect handling!)
3) call updateScreen

It's quite easy to see that, once we have fast area-specific locking of
the game screen, the first step might be left out and the front buffer
from OSystem itself could be used instead. This would reduce bandwidth
and CPU usage, since fewer copies would need to be done.
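
A rough sketch of what drawing through such an area lock could look
like follows. Everything here is hypothetical: MockScreen, lockRect and
LockedArea are names I made up for illustration and are not part of
OSystem, which only offers lockScreen/unlockScreen for the whole screen.
The sprite is blitted with a colorkey directly into the locked area, so
no engine-side front buffer and no second copy are involved.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in rectangle, half-open: [left, right) x [top, bottom).
struct Rect {
    int left, top, right, bottom;
};

// View onto a locked part of the screen (hypothetical type).
struct LockedArea {
    std::uint8_t *pixels; // top-left pixel of the locked area
    int pitch;            // bytes per screen row
    int w, h;             // size of the locked area in pixels
};

// Hypothetical backend screen, NOT the real OSystem interface.
class MockScreen {
public:
    MockScreen(int w, int h) : _w(w), _pixels(static_cast<std::size_t>(w) * h, 0) {}

    // Assumed API extension: lock only 'r' and remember it as dirty.
    LockedArea lockRect(const Rect &r) {
        _dirty.push_back(r);
        return {&_pixels[static_cast<std::size_t>(r.top) * _w + r.left], _w,
                r.right - r.left, r.bottom - r.top};
    }

    std::uint8_t pixel(int x, int y) const {
        return _pixels[static_cast<std::size_t>(y) * _w + x];
    }
    const std::vector<Rect> &dirty() const { return _dirty; }

private:
    int _w;
    std::vector<std::uint8_t> _pixels;
    std::vector<Rect> _dirty;
};

// Colorkey blit straight into the locked area; source pixels equal to
// 'key' are treated as transparent and skipped.
void blitColorkey(const LockedArea &dst, const std::uint8_t *src,
                  int srcPitch, std::uint8_t key) {
    for (int y = 0; y < dst.h; ++y) {
        for (int x = 0; x < dst.w; ++x) {
            const std::uint8_t c = src[y * srcPitch + x];
            if (c != key)
                dst.pixels[y * dst.pitch + x] = c;
        }
    }
}
```

The backend then knows exactly one small dirty area per lock and can
decide for itself how to push it to the display.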

Of course that wouldn't mean the engine would magically lock only the
specific parts of the screen if it previously did the second step as a
full screen update. So even with this API extension, it would only
*help* engine authors to implement faster dirty rect aware screen
updates (in the end the engine itself needs to determine somehow which
parts of the screen need to be flagged as changed!), but *not* magically
enable dirty rect handling in all engines.

IMHO most engines that currently don't support dirty rect updates have
the problem that they don't keep track of which parts of the screen
changed and instead just copy their full image data to the front buffer.
The reason *why* they don't keep track of this is IMHO not the OSystem
API, but rather how the engine works internally/is written.
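
Just to show that the tracking has to live on the engine side: one
brute-force option (my own assumption, not something any engine
mentioned here does) would be to diff the new frame against the previous
one and report the bounding box of the changed pixels. The names and the
8-bit pixel format below are made up for the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in rectangle, half-open: [left, right) x [top, bottom).
struct Rect {
    int left, top, right, bottom;
    bool isEmpty() const { return left >= right || top >= bottom; }
};

// Hypothetical helper: returns the bounding box of all pixels that differ
// between two w*h 8-bit frames, or an empty rectangle if nothing changed.
Rect changedBounds(const std::vector<std::uint8_t> &prev,
                   const std::vector<std::uint8_t> &cur, int w, int h) {
    Rect r = {w, h, 0, 0}; // starts out empty
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            const std::size_t i = static_cast<std::size_t>(y) * w + x;
            if (prev[i] != cur[i]) {
                if (x < r.left) r.left = x;
                if (y < r.top) r.top = y;
                if (x + 1 > r.right) r.right = x + 1;
                if (y + 1 > r.bottom) r.bottom = y + 1;
            }
        }
    }
    return r;
}
```

This costs a full read of both frames and needs a second buffer, so it
only pays off where the backend copy is much more expensive than the
comparison.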

// Johannes