[Scummvm-devel] Improving translation support in ScummVM

Thu Jun 24 11:42:09 CEST 2010

On Thu, Jun 24, 2010 at 8:04 AM, Eugene Sandulenko <sev at scummvm.org> wrote:
>> What could be done on the other hand is to
>> extract the language string from the .po file in tools/po2c and store
>> it in the same structure as we also do currently with the charset.
>> That will require some perl knowledge (and I also propose to update
>> to the development po2c script, which also uses a code formatting
>> more similar to ours), but I guess that's not too hard to achieve.
> If you need some help with perl, I can do it.

Well, if you want to work on that, that is fine with me. I just
thought I would suggest it to Thierry, since he seemed interested in
ideas about this one.
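
To make the po2c idea a bit more concrete, here is a rough sketch of
pulling the language and charset out of a .po header. It is Python
rather than Perl and is not the actual po2c code: the helper names and
the sample data are made up, and it assumes the header is the first
entry in the file and carries a "Language:" field, which not every .po
file has.

```python
import re

def parse_po_header(po_text):
    """Return the .po header fields as a dict (hypothetical helper)."""
    lines = [line.strip() for line in po_text.splitlines()]
    try:
        # Assumption: the first empty msgid is the header entry.
        start = lines.index('msgid ""')
    except ValueError:
        return {}
    chunks = []
    for line in lines[start + 1:]:
        if line.startswith("msgstr "):
            line = line[len("msgstr "):]
        # Stop at the first line that is not a quoted string.
        if len(line) < 2 or not (line.startswith('"') and line.endswith('"')):
            break
        chunks.append(line[1:-1])
    # Header lines end in a literal backslash-n inside the quotes.
    header = "".join(chunks).replace("\\n", "\n")
    fields = {}
    for entry in header.split("\n"):
        name, sep, value = entry.partition(":")
        if sep:
            fields[name.strip()] = value.strip()
    return fields

def language_and_charset(po_text):
    """Return (language, charset); either may be None if missing."""
    fields = parse_po_header(po_text)
    match = re.search(r"charset=([\w-]+)", fields.get("Content-Type", ""))
    charset = match.group(1) if match else None
    return fields.get("Language"), charset

# Made-up sample header, for illustration only.
SAMPLE = r'''# French translation
msgid ""
msgstr ""
"Project-Id-Version: ScummVM\n"
"Language: Francais\n"
"Content-Type: text/plain; charset=iso-8859-1\n"

msgid "Quit"
msgstr "Quitter"
'''

print(language_and_charset(SAMPLE))
```

The point is only that both values live in the same header entry, so
whatever already extracts the charset could pick up the language in
the same pass.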

>
>> 1) Some of the locale strings use %d etc. to add numbers, strings
>> etc. to the output. While that might be fine, the major problem here
>> is: the order is fixed, so it might be hard for translators to take
>> grammar or the like into account when doing translations for those
>> messages.
> As I've mentioned, those belong to _t(). Feel free to remove that
> feature altogether. Originally this was implemented for the sake of
> completeness, but I do not believe that we need to translate debug
> output. Rather, we need to implement FR #2616463: "ALL: Display errors
> in graphic display before exiting"
>

Fine then.
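
Just to illustrate what I meant by the fixed order in 1), here is a
small sketch. It uses Python purely for illustration (our code uses
C-style format strings), and the function names and messages are made
up.

```python
def render_fixed(template, *args):
    """C-style formatting: arguments are consumed strictly in order."""
    return template % args

def render_indexed(template, *args):
    """Index-based formatting: the template picks the argument order."""
    return template.format(*args)

# The original message and its arguments.
english = render_fixed("Saved %s in slot %d", "Monkey Island", 3)

# A translation that wants the slot number first cannot express that
# with plain %s/%d, because the arguments still arrive as (name, slot).
try:
    render_fixed("Slot %d: saved %s", "Monkey Island", 3)
    fixed_order_ok = True
except TypeError:
    fixed_order_ok = False

# With explicit indices the translator can reorder freely.
reordered = render_indexed("Slot {1}: {0} saved", "Monkey Island", 3)

print(english)
print(fixed_order_ok)
print(reordered)
```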

>
>> 3) The whole charset idea. Personally I'm not a fan of letting every
>> translation use its own charset, which I stated on one of the patch
>> tracker items back in the days too, but either nobody cared or nobody
>> noticed that (and/or I forgot about the replies I got ;-)... Anyway
>> my major problem with that is:
> UTF-8 will kill small devices. Also if somebody will decide to
> implement it, I wish him all success with rewriting our font management
> code which these days works with single byte encodings only.
>

I guess the time our font management worked with multi-byte encodings
was before my ScummVM days... Anyway, it's obvious that we would need
to implement a lot here (I never claimed we wouldn't). I merely
pointed out what I think could be improved when someone has time to
work on it.

>> First of all our OSystem API's setWindowCaption is fixed to LATIN-1,
>> i.e. game titles will still not be translatable properly, so the user
>> might even be confused why his window title looks off, even though he
>> setup a correctly looking name in ScummVM.
> This was already discussed in the past. It will be too much to
> translate game names. Let's keep them in English as they are now.
>

Well, we don't actually have to translate all game names, and that
wasn't my idea here either. It might be enough to just use the
translated titles of games that have already been translated. And of
course, even without that, my statement still holds true: the user
can't enter a translated game name on his own and then expect
everything to work fine :-).
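
A quick sketch of what a LATIN-1-only caption means in practice. This
only models the charset conversion and not setWindowCaption itself;
the helper name is made up and Python is used just for illustration.

```python
def to_latin1_caption(title):
    """Return (caption_bytes, lossless) for a LATIN-1-only caption."""
    try:
        return title.encode("latin-1"), True
    except UnicodeEncodeError:
        # Characters outside LATIN-1 are replaced with '?'.
        return title.encode("latin-1", errors="replace"), False

# Western European titles fit into LATIN-1 without loss.
print(to_latin1_caption("Día del Tentáculo"))

# A Cyrillic title does not survive the conversion.
print(to_latin1_caption("Полный газ"))
```

So a user who types a title in a language outside LATIN-1 sees it fine
in the launcher, while the window title cannot show it.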

>> Also our input charset is not defined at all AFAIK, thus we can't
>> even assume the user can properly enter save names with our GUI in
>> his language. Which is also a bad thing for people who want "proper"
>> language support.
> Again, proper support will be a lot of work.

I never claimed it's not a lot of work. I just said it's an annoying
limitation of our translation support, which is really just
translation support rather than native language support. But maybe I
am going too far with the idea of native language support in ScummVM.

>> 4) I think we still only support left-to-right languages, while of
>> course that might be fine, it might feel like kind of a
>> discrimination to people with a mother language, which is written
>> right-to-left.
> Oh yeah. What about CJK? Traditional Chinese which is top to bottom?
> </sarcasm>. Again, if somebody will implement it in a clean way, I will
> be all for inclusion, especially if it will be possible to turn it off
> at compilation time. But I believe that it is /too/ much work.

I actually think that supporting right-to-left languages is (much!)
easier than supporting top-to-bottom languages... I agree with you
that top-to-bottom languages are probably too much work ;-).

About CJK support: if we used UTF-8 and had a proper input charset
definition (like UTF-8 ;-), I don't see what would be so hard to
implement then....
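
To show the one thing that definitely changes with UTF-8, here is a
minimal sketch of the decoding step: with a single-byte charset every
byte is a glyph index, while with UTF-8 the bytes have to be decoded
into code points first. It only covers decoding, not fonts or
rendering, and the function names are made up (Python for illustration
only).

```python
def single_byte_glyphs(data):
    """One glyph index per byte, as with a single-byte charset."""
    return list(data)

def utf8_code_points(data):
    """Decode UTF-8 first, then return one code point per character."""
    return [ord(ch) for ch in data.decode("utf-8")]

text = "日本語"
encoded = text.encode("utf-8")

# Nine bytes on the wire, but only three characters to draw.
print(len(encoded))
print([hex(cp) for cp in utf8_code_points(encoded)])
```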

// Johannes