[Scummvm-devel] PS2 : stack madness

sunmax at libero.it sunmax at libero.it
Wed Apr 1 23:06:43 CEST 2009


Hi there ScummVM Team!

I apologize for the delays in following up with some of the mails,
here come all the answers (well, some at least ;-).

As everybody knows, the PS2 backend is in pretty good shape for 0.13.1,
but it's living on a prayer because of a stack corruption which makes
it very likely to fail. Luckily, so far only the COMI GMM is affected,
but (as is often the case with this kind of issue) if you change
things around in the code, or enable/disable a different number of game
engines, it could crash in many more places. So maybe with 0.14.0 we
are not going to be as lucky. All my ScummVM time has therefore gone
(and is still going) into that, because we are living with a time bomb now.

I was able to gather some extra information in the last few days,
and here it comes. Feel free to brainstorm and suggest the craziest
workarounds - it's possible that I missed a couple ;-)


@ Max // hashmap

The PS2 GCC 3.2.2 cross-compiler ICEs when trying to compile some code.

My original suspect (sorry Max!) was the hashmap, because the compiler's
debug message led me to think that it was ICEing on that.

So, thanks to Max's suggestions, I reverted to the older hashmap,
and -tada!- gcc is still ICEing there. Which pointed me to a more
specific spot: it's not actually every _defaultVal constructor
which is causing the ICE but a single special one:

It's specifically the _drawFunctions template instantiated by the
ThemeParser ctr.

Common::HashMap<Common::String, DrawingFunctionCallback,
 Common::IgnoreCase_Hash, Common::IgnoreCase_EqualTo> _drawFunctions;

GCC 3.2.2 does not like the "DrawingFunctionCallback" part of it.

It's a:

typedef void (Graphics::VectorRenderer::*DrawingFunctionCallback)
 (const Common::Rect &, const Graphics::DrawStep &);
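
To make the shape of that type concrete, here is a minimal standalone
sketch of a name -> member-function-pointer table. Everything in it is
an assumption for illustration: Renderer, Rect, DrawStep, CallbackHolder,
callByName and makeDrawFunctions are hypothetical stand-ins, and std::map
replaces Common::HashMap. Wrapping the pointer in a small struct with an
explicit default ctr is one thing that might be worth trying against the
ICE, but it is untested on PS2 GCC 3.2.2.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for Common::Rect and Graphics::DrawStep.
struct Rect { int w, h; };
struct DrawStep { int scale; };

// Hypothetical stand-in for Graphics::VectorRenderer.
class Renderer {
public:
    int lastArea;
    Renderer() : lastArea(0) {}
    void drawSquare(const Rect &r, const DrawStep &s) { lastArea = r.w * r.h * s.scale; }
    void drawNothing(const Rect &, const DrawStep &) { lastArea = 0; }
};

// Same shape as DrawingFunctionCallback: a pointer to a member function.
typedef void (Renderer::*DrawCallback)(const Rect &, const DrawStep &);

// Assumed workaround: store the pointer inside a class type, so the
// container's default value comes from an explicit constructor instead
// of a default-constructed pointer-to-member.
struct CallbackHolder {
    DrawCallback fn;
    CallbackHolder() : fn(0) {}
    explicit CallbackHolder(DrawCallback f) : fn(f) {}
};

typedef std::map<std::string, CallbackHolder> DrawFunctionMap;

// Builds a small table with two named drawing functions.
DrawFunctionMap makeDrawFunctions() {
    DrawFunctionMap m;
    m["square"] = CallbackHolder(&Renderer::drawSquare);
    m["void"] = CallbackHolder(&Renderer::drawNothing);
    return m;
}

// Looks up `name` and invokes it on `r`; returns false for unknown names.
bool callByName(const DrawFunctionMap &m, Renderer &r, const std::string &name,
                const Rect &rect, const DrawStep &step) {
    DrawFunctionMap::const_iterator it = m.find(name);
    if (it == m.end() || it->second.fn == 0)
        return false;
    (r.*(it->second.fn))(rect, step);
    return true;
}
```

The lookup side only changes by one extra ".fn", so if the wrapper does
make the compiler happy the rest of the parser code would barely notice.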

Everything before

 _parser = new ThemeParser(this);    [in ThemeEngine ctr]

is rock solid. I was never able to trigger a stack corruption crash
before that line, but I can get one (by moving stuff around) in the
ctr on the following line (ThemeEval).

Since that's the portion that GCC is ICEing on, and since we are hacking
it on PS2 to work by commenting out the _defaultVal ctr, I am inclined
to think that bogus code is produced when compiling that.

That's unlucky, since it's a core part of ScummVM, but it would explain
why everything works fine when we backport the current PS2 code base to
0.12.x (because it's still using the older Theme engine).

Max, to prove my theory: is there any way to run ScummVM without -any-
theme? I mean not only disabling the "fancy themes" like on DS, but a
straight shot to the game?

Does the GMM always require a theme to be loaded?

If this is true (I am still not 100% sure, but it seems to explain a
few things) the cleanest solution would be to upgrade PS2 toolchains
and core libs, which I won't be able to get in place before the end
of the Summer. So all other suggestions are welcome!

Regarding PS2 GCC and stack I found this one too:

 http://lists.topica.com/lists/ps2dev/read/message.html?sort=a&mid=1714412808

It's a report for GCC 2.95 and stack bigger than 32KB - but it might
affect GCC 3.2.2 too :-(

 /*

I believe your problem is likely to do with stack size, not actual program size.

GCC 2.x (not sure about 3.x) has a problem with the stack in that the
stack increment is handled by a signed 16bit number, so a stack larger
that 32KB can cause major problems, especially with any single item on
the stack being 32Kb or larger (but in other circumstances too).

The net result of this is that you get stack corruption, and thus all
sorts of unstable results. (just moving a few variables around, can
cause your program to exception in different places).

The only real work around is to ensure that any large blocks of data
(like large structs or arrays) are put on the heap, not the stack.

I'm not sure if this is fixed in newer versions of the compiler.

Hiryu

 */
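
A small sketch of what that report describes, under stated assumptions:
asSigned16 mimics how a 16-bit signed immediate (as used by MIPS addiu
to adjust the stack pointer) would see a frame size, and checksumOnHeap
shows the suggested workaround of keeping a big block off the stack.
All names and the 64 KB size are hypothetical, picked only to cross the
32 KB limit. Nothing here reproduces the actual compiler bug.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reinterprets the low 16 bits of `value` as a signed 16-bit number,
// i.e. what a signed 16-bit immediate field would hold.
int asSigned16(unsigned int value) {
    unsigned int low = value & 0xFFFFu;
    return (low & 0x8000u) ? static_cast<int>(low) - 0x10000
                           : static_cast<int>(low);
}

// True if a frame of `bytes` survives the round trip through a signed
// 16-bit immediate unchanged (so a single stack increment can encode it).
bool frameFitsInImmediate(unsigned int bytes) {
    return asSigned16(bytes) == static_cast<int>(bytes);
}

// Hypothetical buffer size, deliberately above 32 KB.
const std::size_t kBufSize = 64 * 1024;

// The risky variant would be a local array in the stack frame:
//     unsigned char buf[kBufSize];
// The variant below does the same work with the block on the heap.
unsigned int checksumOnHeap(unsigned char fill) {
    std::vector<unsigned char> buf(kBufSize, fill);
    unsigned int sum = 0;
    for (std::size_t i = 0; i < buf.size(); ++i)
        sum += buf[i];
    return sum;
}
```

With this, frameFitsInImmediate(32767) holds while 32768 already wraps
to a negative value, which matches the "32KB or larger" wording in the
report.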

Is there anything new in ScummVM 0.13.x which could cause a massive
usage of the stack (compared to 0.12.x)?

If this is the case maybe ThemeEngine is just the first victim...

This would explain why the MADE workaround, which moves the palette
resources to the heap, fixed it for the PS2 backend.

All insight welcome!



@ Bertrand // dlmalloc

>> I used dlmalloc (Doug Lea's classical) with success

That was one of my early tests too. The "newlib"-centric PS2
toolchain uses dlmalloc as its default allocator; I compiled ScummVM
using that and the (heap) integrity checks look fine. It still crashes
on the stack though.

I even enabled all the debug in the allocating functions in the ps2dev
libc but with no luck...

>> Searching for SYSV MIPS ABI on the web could give you code to do a
>> proper stacktrace, sadly I don't have such code anymore.

I could not find it. I'd love to get that!

Do you remember something more about it?



@ Andre // stack corruption

>> totally depends on the situation. Do you get the problem at the
>> same spot?

Given the same scummvm.elf binary, yes.

It will always crash with the same instruction pointer.

If you compile a different set of engines, or move the initialization
of things around, it might crash as early as the ThemeEval ctr. But it's
always reproducible.

Good sign?


>> threading is a bitch, and such problems often result in corrupt 
>> stacks without memory protections, but debugging that is almost
>> impossible. Try to shut down additional threads.

I did with the audio thread - no change, it just starts crashing in
another place (all the time the same, always GUI / Theme related).

I can give the Timer thread a shot (ie, shut it down ;-).

Do you think it's worth a try?


>> COMI is 640x480, maybe there's a spot where you write way beyond an
>> allocated buffer. Just for testing: Keep the drawing functions 
>> as-is and multiply the size of your gfx buffers by 4. Does it still
>> crash?

That was my first test (because we hit something similar in the past).

But it did not help...

...as you can see from the previous part of this mail, it always
happens in a Theme*/GUI function - but not necessarily while
COMI is running anymore :-(
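
A finer-grained variant of the "multiply by 4" test would be to pad the
buffer with a known byte pattern and check it after drawing. The sketch
below is only an illustration of that idea: GuardedBuffer, fillPixels
and the 0xA5 canary value are hypothetical, not anything that exists in
the PS2 backend.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical canary value written into the guard area.
const unsigned char kCanary = 0xA5;

// A pixel buffer of `size` bytes followed by `guard` canary bytes.
class GuardedBuffer {
public:
    GuardedBuffer(std::size_t size, std::size_t guard)
        : _size(size), _data(size + guard, kCanary) {
        for (std::size_t i = 0; i < size; ++i)
            _data[i] = 0;  // clear the payload, leave the guard untouched
    }

    unsigned char *pixels() { return &_data[0]; }
    std::size_t size() const { return _size; }

    // Number of guard bytes that no longer hold the canary value.
    std::size_t overrunBytes() const {
        std::size_t n = 0;
        for (std::size_t i = _size; i < _data.size(); ++i)
            if (_data[i] != kCanary)
                ++n;
        return n;
    }

private:
    std::size_t _size;
    std::vector<unsigned char> _data;
};

// Stand-in for a drawing routine: writes `count` bytes of `color`,
// which may be more than the payload the caller really owns.
void fillPixels(unsigned char *dst, std::size_t count, unsigned char color) {
    for (std::size_t i = 0; i < count; ++i)
        dst[i] = color;
}
```

Calling overrunBytes() after each suspicious drawing call would tell
which one writes past the end, as long as the overrun stays inside the
guard area.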

Are there any buffers that need some changes between 0.12.x and 0.13.x
on the backend side? Maybe the PS2 backend did not comply with that...

>> Last but not least: You prolly noticed that I set up this buildbot
>> thingy. Are you interested in setting up the PS2 toolchain on that
>> Linux server?

Yes, I am aware of it and it's on my TODO list.

 - does it require the backend to be "configure"-friendly, or would
   the current hacked Makefile.ps2 work with it?

 - what's the complexity of getting the PS2 toolchain on it?

I am not really lazy, but my first priority is the user base and
providing them with a crash-free ScummVM experience. If the current
PS2 gcc and toolchains are buggy, I would rather focus on that,
so that we have somewhat more recent ones asap to integrate on
the buildbot server and produce more robust ELFs.

Anyway, if it's a minor task and you think it might help in any way,
just let me know how I can assist with that.

Ciao,
 -max




