[Scummvm-tracker] [ScummVM :: Bugs] #14638: SWORD1: Crashes due to data races (multi-threading)
ScummVM :: Bugs
trac at scummvm.org
Fri Sep 22 17:48:17 UTC 2023
#14638: SWORD1: Crashes due to data races (multi-threading)
---------------------------+----------------------------
Reporter: PushmePullyu | Owner: (none)
Type: defect | Status: new
Priority: normal | Component: Engine: Sword1
Version: | Keywords:
Game: Broken Sword 1 |
---------------------------+----------------------------
Using Fedora Linux Workstation 37 x86_64 and the GOG.com/Windows/English
version of Broken Sword @ !9de51fda8a871422c580481ef6f172b64cc12751
While testing the menu rewrite (see PR !#5332 at
[https://github.com/scummvm/scummvm/pull/5332]) I observed these crashes:
This sometimes happens when moving the mouse to the inventory/manipulating
it and simultaneously pressing ESC to open the main menu, but it is not
consistent
(line numbers will differ due to added debugging code):
ERROR: SDL_BlitSurface failed: Surfaces must not be locked during blit!
{{{
#0 0x00007ffff7e08620 in SDL_GetError () at /lib64/libSDL2-2.0.so.0
#1 0x00000000006d4951 in SurfaceSdlGraphicsManager::drawMouse()
(this=0x166c1a0) at backends/graphics/surfacesdl/surfacesdl-
graphics.cpp:2394
#2 0x00000000006d0cb1 in SurfaceSdlGraphicsManager::internUpdateScreen()
(this=0x166c1a0) at backends/graphics/surfacesdl/surfacesdl-
graphics.cpp:1401
#3 0x00000000006cfe39 in SurfaceSdlGraphicsManager::updateScreen()
(this=0x166c1a0) at backends/graphics/surfacesdl/surfacesdl-
graphics.cpp:1142
#4 0x00000000006af286 in ModularGraphicsBackend::updateScreen()
(this=0x14b2560) at backends/modular-backend.cpp:187
#5 0x00000000005c015a in Sword1::SwordEngine::pollInput(unsigned int)
(this=0x27c8070, delay=0) at engines/sword1/sword1.cpp:910
#6 0x00000000005bff75 in Sword1::SwordEngine::waitForFade()
(this=0x27c8070) at engines/sword1/sword1.cpp:866
#7 0x00000000005c1063 in Sword1::Control::getPlayerOptions()
(this=0x2b16960) at engines/sword1/control.cpp:244
#8 0x00000000005bfdc8 in Sword1::SwordEngine::mainLoop() (this=0x27c8070)
at engines/sword1/sword1.cpp:834
#9 0x00000000005bf5ac in Sword1::SwordEngine::go() (this=0x27c8070) at
engines/sword1/sword1.cpp:683
#10 0x00000000005c0608 in Sword1::SwordEngine::run() (this=0x27c8070) at
./engines/sword1/sword1.h:127
#11 0x0000000000447fdf in runGame(Plugin const*, Plugin const*, OSystem&,
Common::String const&) (plugin=0x15f3620, enginePlugin=0x158e410,
system=..., debugLevels=...) at base/main.cpp:324
#12 0x0000000000449f71 in scummvm_main(int, char const* const*) (argc=1,
argv=0x7fffffffde78) at base/main.cpp:764
#13 0x0000000000445370 in main(int, char**) (argc=1, argv=0x7fffffffde78)
at backends/platform/sdl/posix/posix-main.cpp:44
}}}
In cases where the above procedure did not lead to an immediate crash, it
sometimes caused crashes later (on quitting the game or when restoring),
e.g.:
{{{
Thread 1 "scummvm" received signal SIGSEGV, Segmentation fault.
in Sword1::MemMan::flush (engines/sword1/memman.cpp):
free(_memListFreeEnd->data);
}}}
I can not reproduce these bugs consistently, but suspect they are caused
by unsynchronized access from multiple threads to MemMan and other
objects:
The game sets up a timer in engines/sword1/sword1.cpp:640:
{{{
Common::Error SwordEngine::go() {
...
installTimerRoutines();
...
}
}}}
This causes regular calls to vblCallback (which can happen from another
thread):
engines/sword1/sword1.cpp:988:
{{{
void SwordEngine::installTimerRoutines() {
debug(2, "SwordEngine::installTimerRoutines(): Installing
timers...");
getTimerManager()->installTimerProc(&vblCallback, 1000000 /
TIMER_RATE, this, "AILTimer");
}
}}}
The callback receives a pointer to the SwordEngine and uses it to call
several methods e.g. Sword1::SwordEngine::updateTopMenu().
This can eventually cause access to MemMan or different surfaces. However,
the main thread will also use these (e.g. from getPlayerOptions()).
Since there is no synchronization, this can lead to corrupt MemMan data
structures or attempts to simultaneously blit to the same surface.
To make it easier to catch this in a debugger, I addded a delayMillis(300)
call to MemMan::alloc. This backtrace then shows both threads running
MemMan::alloc():
{{{
(gdb) i threads
Id Target Id Frame
* 1 Thread 0x7ffff671a880 (LWP 317181) "scummvm"
Sword1::MemMan::alloc (this=0x1b633a0, bsMem=0x1b71ba8, pSize=2164,
pCond=2) at engines/sword1/memman.cpp:41
...
21 Thread 0x7fffd8bfa6c0 (LWP 317240) "SDLTimer"
0x00007ffff72f5045 in __GI___clock_nanosleep (clock_id=clock_id at entry=0,
flags=flags at entry=0, req=req at entry=0x7fffd8bf9340,
rem=rem at entry=0x7fffd8bf9350) at
../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
...
(gdb) bt
#0 Sword1::MemMan::alloc(Sword1::MemHandle*, unsigned int, unsigned
short) (this=0x1b633a0, bsMem=0x1b71ba8, pSize=2164, pCond=2) at
engines/sword1/memman.cpp:41
#1 0x00000000005d8431 in Sword1::ResMan::resOpen(unsigned int)
(this=0x1ff3020, id=67305483) at engines/sword1/resman.cpp:257
#2 0x00000000005d81b0 in Sword1::ResMan::openFetchRes(unsigned int)
(this=0x1ff3020, id=67305483) at engines/sword1/resman.cpp:202
#3 0x00000000005e5ca2 in Sword1::Screen::showFrame(unsigned short,
unsigned short, unsigned int, unsigned int, unsigned char const*, signed
char)
(this=0x1988030, x=0, y=0, resId=67305483, frameNo=1,
fadeMask=0x1287f40 <Sword1::Menu::_fadeEffectTop>
"\001\a\005\003\002\004\006", fadeStatus=0 '\000')
at engines/sword1/screen.cpp:1385
#4 0x00000000005d2cea in Sword1::MenuIcon::draw(unsigned char const*,
signed char)
(this=0x19e35e0, fadeMask=0x1287f40 <Sword1::Menu::_fadeEffectTop>
"\001\a\005\003\002\004\006", fadeStatus=0 '\000') at
engines/sword1/menu.cpp:88
#5 0x00000000005d361c in Sword1::Menu::refresh(unsigned char)
(this=0x1c21410, menuType=0 '\000') at engines/sword1/menu.cpp:238
#6 0x00000000005d097d in Sword1::Logic::fnWipeHands(Sword1::Object*, int,
int, int, int, int, int, int) (this=0x1b9da40, cpt=0x0, id=0, c=0, d=0,
e=0, f=0, z=0, x=0)
at engines/sword1/logic.cpp:1127
#7 0x00000000005c0f92 in Sword1::Control::getPlayerOptions()
(this=0x28e6ea0) at engines/sword1/control.cpp:232
#8 0x00000000005bfdc8 in Sword1::SwordEngine::mainLoop() (this=0x2848ab0)
at engines/sword1/sword1.cpp:834
#9 0x00000000005bf5ac in Sword1::SwordEngine::go() (this=0x2848ab0) at
engines/sword1/sword1.cpp:683
#10 0x00000000005c0608 in Sword1::SwordEngine::run() (this=0x2848ab0) at
./engines/sword1/sword1.h:127
#11 0x0000000000447fdf in runGame(Plugin const*, Plugin const*, OSystem&,
Common::String const&) (plugin=0x15f4620, enginePlugin=0x2d30990,
system=..., debugLevels=...) at base/main.cpp:324
#12 0x0000000000449f71 in scummvm_main(int, char const* const*) (argc=1,
argv=0x7fffffffde78) at base/main.cpp:764
#13 0x0000000000445370 in main(int, char**) (argc=1, argv=0x7fffffffde78)
at backends/platform/sdl/posix/posix-main.cpp:44
(gdb) thread 21
[Switching to thread 21 (Thread 0x7fffd8bfa6c0 (LWP 317240))]
#0 0x00007ffff72f5045 in __GI___clock_nanosleep
(clock_id=clock_id at entry=0, flags=flags at entry=0,
req=req at entry=0x7fffd8bf9340, rem=rem at entry=0x7fffd8bf9350)
at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
48 r = INTERNAL_SYSCALL_CANCEL (clock_nanosleep_time64, clock_id,
flags, req,
(gdb) bt
#0 0x00007ffff72f5045 in __GI___clock_nanosleep
(clock_id=clock_id at entry=0, flags=flags at entry=0,
req=req at entry=0x7fffd8bf9340, rem=rem at entry=0x7fffd8bf9350)
at ../sysdeps/unix/sysv/linux/clock_nanosleep.c:48
#1 0x00007ffff72f97f7 in __GI___nanosleep (req=req at entry=0x7fffd8bf9340,
rem=rem at entry=0x7fffd8bf9350) at ../sysdeps/unix/sysv/linux/nanosleep.c:25
#2 0x00007ffff7f09a01 in SDL_Delay_REAL (ms=<optimized out>) at
/usr/src/debug/SDL2-2.26.3-1.fc37.x86_64/src/timer/unix/SDL_systimer.c:219
#3 0x00000000004409cd in OSystem_SDL::delayMillis(unsigned int)
(this=0x14b3550, msecs=300) at backends/platform/sdl/sdl.cpp:733
#4 0x00000000005ed332 in Sword1::MemMan::alloc(Sword1::MemHandle*,
unsigned int, unsigned short) (this=0x1b633a0, bsMem=0x1b71ba8,
pSize=2164, pCond=2) at engines/sword1/memman.cpp:41
#5 0x00000000005d8431 in Sword1::ResMan::resOpen(unsigned int)
(this=0x1ff3020, id=67305483) at engines/sword1/resman.cpp:257
#6 0x00000000005d81b0 in Sword1::ResMan::openFetchRes(unsigned int)
(this=0x1ff3020, id=67305483) at engines/sword1/resman.cpp:202
#7 0x00000000005e5ca2 in Sword1::Screen::showFrame(unsigned short,
unsigned short, unsigned int, unsigned int, unsigned char const*, signed
char)
(this=0x1988030, x=0, y=0, resId=67305483, frameNo=1,
fadeMask=0x1287f40 <Sword1::Menu::_fadeEffectTop>
"\001\a\005\003\002\004\006", fadeStatus=0 '\000')
at engines/sword1/screen.cpp:1385
#8 0x00000000005d2cea in Sword1::MenuIcon::draw(unsigned char const*,
signed char)
(this=0x19e35e0, fadeMask=0x1287f40 <Sword1::Menu::_fadeEffectTop>
"\001\a\005\003\002\004\006", fadeStatus=0 '\000') at
engines/sword1/menu.cpp:88
#9 0x00000000005d361c in Sword1::Menu::refresh(unsigned char)
(this=0x1c21410, menuType=0 '\000') at engines/sword1/menu.cpp:238
#10 0x00000000005c02d8 in Sword1::SwordEngine::updateTopMenu()
(this=0x2848ab0) at engines/sword1/sword1.cpp:936
#11 0x00000000005c0412 in Sword1::vblCallback(void*) (refCon=0x2848ab0) at
engines/sword1/sword1.cpp:969
#12 0x00000000006e75eb in DefaultTimerManager::handler() (this=0x18875d0)
at backends/timer/default/default-timer.cpp:110
#13 0x00000000006d7f61 in timer_handler(Uint32, void*) (interval=10,
param=0x18875d0) at backends/timer/sdl/sdl-timer.cpp:32
#14 0x00007ffff7e5e595 in SDL_TimerThread (_data=0x7ffff7f981e0
<SDL_timer_data>) at
/usr/src/debug/SDL2-2.26.3-1.fc37.x86_64/src/timer/SDL_timer.c:168
#15 0x00007ffff7f07687 in SDL_RunThread (thread=0x18433d0) at
/usr/src/debug/SDL2-2.26.3-1.fc37.x86_64/src/thread/SDL_thread.c:305
#16 RunThread (data=0x18433d0) at
/usr/src/debug/SDL2-2.26.3-1.fc37.x86_64/src/thread/pthread/SDL_systhread.c:77
#17 0x00007ffff72ae15d in start_thread (arg=<optimized out>) at
pthread_create.c:442
#18 0x00007ffff732fc00 in clone3 () at
../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
}}}
The installTimerProc() call was introduced in commit
!e21e0ed089c60378cccb5dadda3327d679524871, so earlier versions should be
unaffected. This also means the menu rewrite is not the cause.
--
Ticket URL: <https://bugs.scummvm.org/ticket/14638>
ScummVM :: Bugs <https://bugs.scummvm.org>
ScummVM
More information about the Scummvm-tracker
mailing list