[Scummvm-devel] Broken Sword 2 problem

yotam barnoy yotambarnoy at gmail.com
Fri Sep 3 09:59:25 CEST 2010


On Thu, Sep 2, 2010 at 7:32 PM, Norbert Lange <lange at chello.at> wrote:
>
> I tried similar things when I wrote the improvements for unaligned access
> in endian.h, "aligned" is only used for specifying a minimum and thus
> making alignment bigger, it will still be greater (a multiple) of the
> type-size.

That's unfortunate :(

> I dont know of another way to get gcc to emit unaligned access
> instructions and I spent quite some time fiddling with it, ironically MS`
> compilers have a beautiful "unaligned" declaration.. which doesnt mean
> much since the only architecture that could benefit from it is the
> deceased Itanium.

Yeah hopefully I got a couple of the big shot GCC folks interested in
making it easier to emit unaligned instructions, especially since one
essentially has to break one thing (aliasing) to make another
(unaligned instructions) work.

> I`d wonder though if the case for architectures that dont need aligning
> (line 169-185) have the very same aliasing issues thanks to casting the
> pointer, only the fallback-case is safe by using char-type accesses.

I would say that the answer is yes. It's undefined and there's no
difference between uint32 and a struct containing uint32 in that
instance. However, strict aliasing essentially makes expanding a
pointer to uint32 * impossible, and since it's used in many
applications for efficiency, I think gcc screens for uint32 * and does
it properly even though it's technically undefined.

>
> Im not sure if I understood the C99 specs right tho - as far as I
> understand, only the effective memory-accesses should count, uint32 should
> be the same whether accessed through a struct or directly (which makes
> sense). On one hand, as long you keep using the same amount of bits for
> reading/writing all accesses should use the same uint16/uint32 type and
> the compiler should be aware of aliasing issues.... on the other hand
> apparently this dint safe the psp-toolchain from producing "wrong" code.
> Im still not sure if this is a compiler bug?!

I've read just about anything I could find on the web about this now,
and I think nobody really fully understands the C99 specs, including
the committee itself (they disagree on some examples). As far as I can
tell, strict aliasing may be a bonus for compilers, allowing some
nifty optimizations, but it really doesn't fit well with the C
paradigm. To summarize my findings for those who aren't in the know on
this topic:

- Aliasing means 2 pointers point at the same thing in memory. C
pointers give compilers headaches, because the compiler never knows if
it can optimize a read/write away or to put it elsewhere (like outside
a loop so it's not repeated) or if another pointer points to the same
place in memory, which could make the optimization incorrect.
- To solve this issue and help compilers do a better job, C99 decided
on some aliasing rules, so compilers have guidance regarding when
things alias and when they don't.
- gcc has allowed you to use -fstrict-aliasing to use similar
guidelines to C99 in the past, but in recent versions they made
-fstrict-aliasing *default* for -O2 and -O3, presumably because the
really good optimizations need strict aliasing rules.
- These rules basically make every cast of a pointer dangerous. Every
time you cast a pointer, an angel loses its wings and you might be
creating a bug that *may* only crop up years later (as is the case
with this BS2 bug which has been lingering there for a long time).
This is all very unfortunate as pointers and casting pointers are
basic features of C and are used widely. It means that a whole bunch
of software has become buggy without even knowing it.
- When judging whether a pointer cast is ok, what you really want to
look at is the lvalue ie. what am I assigning *to* not what I'm
assigning from.
- An assignment from a pointer type to the same pointer type is safe
(e.g. uint32* to another uint32* pointer). The compiler will realize
that the 2 pointers are aliasing.
- Any assignment using a pointer that's not the same as the original
pointer/data value is dangerous. *Even* a modified pointer type e.g.
unsigned to signed is dangerous if you're going via double pointers.
Usually though, changing a pointer from unsigned to signed and such
things is not dangerous.
- The one exception to the rule is when the lvalue is a char/unsigned
char pointer. This pointer is always assumed to alias, so the compiler
will never optimize it away, which is both good for the sanity of the
programmer and bad for the compiler's optimization.
- Void * doesn't seem to help much. It obviously can't be an lvalue on
its own (can't be assigned to), and when used as in for example in
READ_UINT32, the compiler 'skips' the void * and sees what was there
previously and compares the new pointer to that. Thus, our struct
aliases with a char array (or whatever else is there when the function
is inlined), NOT with a void *. The char array doesn't help in this
case because remember, it's all about the lvalue, and the char array
is the rvalue here.
In an efficient memcpy that copies 32bits at a time you would have the
same problem. If the compiler doesn't inline the memcpy, you could
luck out, but if it does inline it, the compiler will try to optimize
it with whatever's there (a uint16, char etc) and end up getting
confused by the aliasing.
- gcc allows a trick whereby if you declare a union of the original
type and the new type, it realizes that there's aliasing going on.
This is *not* supported and is even discouraged by C99. I found this
code which seems interesting though super ugly:
#define CAST(type, x) (((union {typeof(x) src; type dst;}*)&(x))->dst)
and then CAST(unsigned short, a) = 4;
- I personally am not crazy about the union solution and since it's a
gcc hack unrecognized by the standard, I'm just as happy with
attribute((__may_alias__)), which can be easily put in a #define.
- The restrict keyword doesn't exist yet in C++, but g++ has
__restrict__. In C99, restrict allows you to qualify that even
pointers that can alias (ie. same type) will not alias. This allows
the compiler to do optimizations even on pointers that according to
the rules will alias. Supposedly, doing this can sometimes give a huge
performance boost and sometimes only a modest one. Reports are of
speed increases between 10% and 500%, with the lower percentages
dominating. In my opinion, all the aliasing rules should have been
thrown out and only restrict should matter. If we have to worry about
the compiler's optimizations, let us do it voluntarily.
- Linus's view: http://lkml.org/lkml/2003/2/26/158
http://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg01647.html
- I have to say that after reading and reading and seeing how nobody
really gets this concept and how it kills the C paradigm, in my
opinion Linus is basically right.

>
> PS. If you are using a recent psp-toolchain then __builtin_bswap32 is
> natively supported and preferable to the now used inline-assembly, my
> patch for the toolchain got included sometime last year. taking out
> "defined(__psp__)" at line 73 should be enough

I don't know if I have the latest toolchain or who might be building
so I'll leave it for now, but thanks.

Later,
Yotam

>
>>
>> Hopefully __may_alias__ will do the trick.
>
> yeah, should do.
>
> Norbert
>
>> Yotam
>>
>>
>> On Wed, Sep 1, 2010 at 2:37 PM, Willem Jan Palenstijn <wjp at usecode.org>
>> wrote:
>>> On Tue, Aug 31, 2010 at 06:26:41PM +0300, yotam barnoy wrote:
>>>> For the (hopefully) last episode in this bug saga, see commit 52473.
>>>
>>> Wow, very nice catch! Well done!
>>>
>>> If I understand your commit message correctly, the issue is that some
>>> of the
>>> READ_*_UINT* functions break strict aliasing rules? It would probably
>>> be good
>>> to try to fix that instead of using -fno-strict-aliasing, if possible.
>>>
>>>
>>> The subtleties of these strict aliasing rules still elude me, but it
>>> might be
>>> the case that the issue is that the two Unaligned32 structs are local
>>> to their
>>> functions, and might be considered different.
>>>
>>> I'm trying with this code:
>>>
>>> inline __attribute__((__always_inline__)) int read_value(const void*
>>> ptr)
>>> {
>>>        struct Unaligned32 { int val; } __attribute__ ((__packed__));
>>>        return ((const Unaligned32 *)ptr)->val;
>>> }
>>>
>>> inline __attribute__((__always_inline__)) void write_value(void* ptr,
>>> int value)
>>> {
>>>        struct Unaligned32 { int val; } __attribute__ ((__packed__));
>>>        ((Unaligned32 *)ptr)->val = value;
>>> }
>>>
>>> int test(char* x) {
>>>        int y = read_value(x);
>>>        write_value(x, y+1);
>>>        return read_value(x);
>>> }
>>>
>>> When compiled with g++ 4.5.1 on x86_64 with -O3 and strict aliasing, it
>>> returns
>>> y, and not y+1.
>>>
>>> When I move the definition of Unaligned32 out of the functions, it
>>> returns y+1:
>>>
>>> struct Unaligned32 { int val; } __attribute__ ((__packed__));
>>> inline __attribute__((__always_inline__)) int read_value(const void*
>>> ptr)
>>> {
>>>        return ((const Unaligned32 *)ptr)->val;
>>> }
>>> inline __attribute__((__always_inline__)) void write_value(void* ptr,
>>> int value)
>>> {
>>>        ((Unaligned32 *)ptr)->val = value;
>>> }
>>>
>>> int test(char* x) {
>>>        int y = read_value(x);
>>>        write_value(x, y+1);
>>>        return read_value(x);
>>> }
>>>
>>>
>>> Does it behave the same for you on your platform?
>>>
>>> -Willem Jan
>>>
>>
>> ------------------------------------------------------------------------------
>> This SF.net Dev2Dev email is sponsored by:
>>
>> Show off your parallel programming skills.
>> Enter the Intel(R) Threading Challenge 2010.
>> http://p.sf.net/sfu/intel-thread-sfd
>> _______________________________________________
>> Scummvm-devel mailing list
>> Scummvm-devel at lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scummvm-devel
>
> ------------------------------------------------------------------------------
> This SF.net Dev2Dev email is sponsored by:
>
> Show off your parallel programming skills.
> Enter the Intel(R) Threading Challenge 2010.
> http://p.sf.net/sfu/intel-thread-sfd
> _______________________________________________
> Scummvm-devel mailing list
> Scummvm-devel at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/scummvm-devel
>




More information about the Scummvm-devel mailing list