[Scummvm-devel] Wintermute profiling
Willem Jan Palenstijn
wjp at usecode.org
Sun Aug 25 12:38:21 CEST 2013
On Sun, Aug 25, 2013 at 10:18:45AM +0200, Tobia Tesan wrote:
> Hi wjp, and wow!
> On 24/08/2013 17:57, Willem Jan Palenstijn wrote:
> > By far the most expensive bits in these two screens turn out to be the loops
> > over the _renderQueue in drawSurface and the functions it calls.
> Yes, more or less the same I found myself, but I find blit() to be
> (well, obviously) very costly also.
That's strange, as there's hardly anything to actually redraw in those two
screens.
Maybe that is caused by the bug fixed by
which can severely grow the redraw area unnecessarily.
Or are you talking about different scenes maybe?
> > One thing I only explicitly realized due to this profiling is that calling
> > _renderQueue.size() (_renderQueue is a Common::List) is very expensive as it
> > doesn't maintain the current size of the list and iterates through it entirely.
> Interesting. This could be fixed quite easily, perhaps.
> Although, when I profiled with MSVC it did not look particularly hot (I
> remember somearray.size() being slightly bigger than expected, but that
> could simply be because of the sheer amount of calls).
> Could this perhaps be a matter of optimizations?
I would expect it to be inlined when optimizations are enabled, yes, so it
won't immediately show up in a basic function-based profile. (Hence the more
fine-grained manual instrumentation I mentioned earlier.)
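To make the size() cost concrete, here is a minimal sketch. It is not the
actual Common::List code: CountingDemoList and both of its size functions are
hypothetical names, and the list is a toy singly linked one. It only shows the
difference between walking every node on each call and keeping a counter that
is updated on insertion.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical toy list for illustration only; NOT the real Common::List.
template <typename T>
class CountingDemoList {
	struct Node {
		T value;
		Node *next;
	};

	Node *_head = nullptr;
	Node *_tail = nullptr;
	std::size_t _cachedSize = 0; // maintained on every insertion

public:
	CountingDemoList() = default;
	CountingDemoList(const CountingDemoList &) = delete;
	CountingDemoList &operator=(const CountingDemoList &) = delete;

	~CountingDemoList() {
		while (_head) {
			Node *next = _head->next;
			delete _head;
			_head = next;
		}
	}

	void push_back(const T &v) {
		Node *n = new Node{v, nullptr};
		if (_tail)
			_tail->next = n;
		else
			_head = n;
		_tail = n;
		++_cachedSize;
	}

	// O(n): visits every node, like the behaviour described above.
	std::size_t sizeByWalking() const {
		std::size_t n = 0;
		for (const Node *p = _head; p; p = p->next)
			++n;
		return n;
	}

	// O(1): returns the counter kept up to date by push_back().
	std::size_t sizeCached() const {
		assert(_cachedSize == sizeByWalking()); // sanity check for the demo
		return _cachedSize;
	}
};
```

If a loop condition calls the walking variant on every iteration, the loop as
a whole becomes quadratic in the list length. Reading the size once before the
loop, or using iterators, avoids that even without changing the list class.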
> > It would be interesting to see what other problematic screens there are in
> > Wintermute games, and how this interacts with t0by's dirty rect improvements.
> > All these things combined should lead to very nice results I think.
> Let me get done with UITiledImage, which is the only thing that, I've
> found, hinders the effectiveness of said rect changes, reprofile the
> whole deal and optimize where needed (I may have been misled by said
> UITiledImage when working on the rect system) and then we can start
> having fun.
How does it hinder the effectiveness exactly? Too many rectangles causing the
new rectangle merge/split code to be slow? Or something more subtle?
Would it work to just batch the dirty rectangle updates? I.e., on startBatch
start a new temporary rectangle and just naively (and cheaply) extend that with
new rectangles. Then when endBatch is called, send the new combined rectangle
for the entire batch to the main multi-dirty-rect system?
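A rough sketch of the batching idea, under some assumptions: Rect,
DirtyRectBatcher, addDirtyRect and submitted() are hypothetical names made up
for this example, not existing Wintermute code, and the std::vector merely
stands in for the real multi-dirty-rect system. Only startBatch and endBatch
come from the suggestion above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical rectangle type; right and bottom are exclusive.
struct Rect {
	int left, top, right, bottom;

	bool isEmpty() const { return left >= right || top >= bottom; }

	// Grow this rectangle to the bounding box of itself and r.
	void extend(const Rect &r) {
		left = std::min(left, r.left);
		top = std::min(top, r.top);
		right = std::max(right, r.right);
		bottom = std::max(bottom, r.bottom);
	}
};

// Hypothetical batcher sitting in front of the dirty rect system.
class DirtyRectBatcher {
	std::vector<Rect> _submitted; // stand-in for the multi-dirty-rect system
	Rect _batch{0, 0, 0, 0};      // bounding box of the current batch
	bool _inBatch = false;

public:
	void startBatch() {
		assert(!_inBatch); // nested batches are not handled in this sketch
		_inBatch = true;
		_batch = Rect{0, 0, 0, 0};
	}

	void addDirtyRect(const Rect &r) {
		if (r.isEmpty())
			return;
		if (!_inBatch) {
			_submitted.push_back(r); // outside a batch: pass straight through
			return;
		}
		if (_batch.isEmpty())
			_batch = r;
		else
			_batch.extend(r); // cheap and naive: just grow the bounding box
	}

	void endBatch() {
		assert(_inBatch);
		_inBatch = false;
		if (!_batch.isEmpty())
			_submitted.push_back(_batch); // one combined rect for the batch
	}

	const std::vector<Rect> &submitted() const { return _submitted; }
};
```

The trade-off is that the combined rectangle can cover area that was never
dirty, so this only pays off when the rectangles in a batch sit close together,
as the tiles of a single image presumably do.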