[Scummvm-devel] Decompiler: Documentation feedback

Mon Aug 9 17:58:22 CEST 2010

(Sorry if I missed something - there's a lot of stuff in there, so something might have slipped by the radar.)

> -----Original Message-----
> From: Max Horn [mailto:max at quendi.de]
> Sent: Monday, August 09, 2010 2:30 PM
> Cc: 'ScummVM devel'
> Subject: Re: [Scummvm-devel] Decompiler: Documentation feedback
> 
> LaTeX style remarks:
> * Consider using pdflatex instead of latex+dvips+ps2pdf.
> * It would be nice to have \usepackage{hyperref} in there; that way
> references become hyperlinks, and the resulting pdf (if using pdflatex)
> gets a TOC. Very handy. You may have to put that after variorref,
> though.

hyperref does not play nice with the DVI->PS->PDF route (linebreaks don't work), so that's why it's not there. It would be easy to add that back in if switching to pdflatex (which probably won't be any problem at all, I'm just used to using the "long" route).

The main advantage over not going straight to PDF is that if you use graphics, going to DVI forces you to use EPS (vector graphics), while pdflatex allows bitmaps, which (being raster images) don't scale well. Of course, seeing that there aren't any graphics in there at this point, I guess it doesn't hurt to change it, so I can do that later today.

> * What do you need the "url" package for? I took it out and things
> still compiled fine (note that \url is also provided by hyperref).
> * Likewise for \usepackage{bbold}

That's because I reused the preamble from an earlier document and never got around to removing the stuff I didn't need for this document. It wouldn't surprise me if there's more unused stuff in there than just those two things; I'll look into that.

> * Please don't use:
>   \verb+ _functions+
> Instead, define your own macro to mark things as code, e.g.
>   \newcommand{\code}[1]{\texttt{#1}}
> If you also want the argument to be not hyphenated, then that can also
> be achieved cleanly, w/o abusing \verb.

If having small bits of raw, inline code isn't an appropriate use of \verb, what *is* an appropriate use of \verb? \texttt will certainly get the same visual layout, but now you have to think about escaping as well, and I've been taught that's why \verb is a better choice for these things.

> 
> Section 2.1 "Adding a new engine"
> =================================
> Some more info here would be nice. You just say that one has to invoke
> the ENGINE macro. You don't mention (although I admit it is fairly
> obvious) that one also has to #include ones own header. I added:
>   #include "scummv5/engine.h"

Good point. I'll add that.

> Next, you say one needs to implement certain methods, but you don't say
> in what (sub)class one has to do that. Taken both together, you may
> want to suggest as first step that one should pick an engine name
> ("scummv5" in my name), and then create a subdir for that, including an
> engine.h file which contains a subclass of class Engine.

Well, considering it's in the section about *engines*, it feels like a natural extension that the methods are in the *engine* class. :) I'll clarify it.

> It would be nice if the getDisassembler & getCodeGenerator entries
> linked to the sections where the relevant classes are described.

Pure oversight on my part - I did mean to link them.

> Sidenote: you currently name the subclasses like v6::Engine (which is a
> subclass of ::Engine), personally, I'd prefer ScummV6Engine or so, even
> if it resides in its own namespace. Some old compilers, and more
> importantly, developers (of any age ;) are confused by classes with
> identical names in different namespaces. "Hey, you have a bug in class
> Engine" -- "Which one do you mean" ?

Yes, I can see that could be a problem. I'll put this on my TODO list.

> 
> Back to the docs. It might be a good idea to create a simple (fake)
> decompiler as a running example. Of course that's even more work, but
> would be helpful. As a somewhat less expensive solution, you could at
> least point out what the corresponding files / classes for the Kyra /
> ScummV6 engine are, so that one at least knows where to quickly find
> some real world examples. (Again, this is not *that* difficult, but if
> one argues that one can find out about things by reading the code, then
> one can quickly argue that documentation and comments are pointless in
> general... ;)

Well, you have to draw the line *somewhere*.

> 
> At the end of this section, you refer to the "Kyra2" code generator.
> This is the "name" of this engine, but the class is named Kyra::Engine.
> I found that confusing...

Yes, that's one of the things I'll rename.

> 
> Section 2.2 "Game information"
> ==============================
> First line: "it's -> "it is". This and things like "there's", "you'll",
> etc. are colloquial and should be avoided in a formal text like that.

My plan for the document was for it to be slightly less formal, since I believe that would  make it easier to understand - so that's why I've allowed myself to use contractions. Changing it is no problem, though.

> 
> 
> I would rephrase the first sentence a bit:
> "For some engines, it may not be enough to know the engine; some
> instructions may differ in behavior between different games or variants
> of a game, for example between talkie or non-talkie versions, or
> between versions for different platforms."
> 

Yeah, that would be better. Changing.

> Non-docs-remark: This is something that affects many engines in one way
> or another, in particular the SCUMM engine. E.g. for v3: You have to
> specify whether the data format is blocked or unblocked (so I *hope* it
> will be possible to auto-detect that), whether it is Indy3-256 or
> Zak256 (FM-TOWNS version).

I don't know enough about v3 to say one way or the other about that. If it can't be autodetected from the file itself, but can be restricted to specific platforms, then the platform switch (see below) would be usable there; otherwise you would either need another subclass of Engine. You could add an engine-specific switch, but (and I admit I may be missing something here...) Boost.ProgramOptions does not appear to provide a good way of doing so, unless you're just going to send a string (or an array of strings) into the engine/disassembler/code generator and let it deal with them however it wants.

> Your code currently has a "talkie" switch, but for platform variants,
> you suggest using multiple engine subclasses. Why? Why not require the
> talkie variant to use a subclass, too? Or why not allow specifying the
> platform variant with a switch? The current situation seems a bit
> arbitrary. Personally, I think allowing to specify an engine "variant"
> or "subengine" or so with a generic switch might be easier (and would
> clutter the list of engines less) than requiring lots of subclasses.
> However, we probably should wait with change to this until we determine
> which engines would be affected and how... :)

Adding support for providing the platform via the command-line is one of the things I plan to do in this last week. However, *right now*, that switch is not available, so if you were writing a new engine now, you'd have to resort to a different way of achieving platform detection - and if you can't detect it from the file, you'd have to use a second Engine for that.

> Some random remarks
> ===================
> * I noticed that Scumm::v6::Disassembler::doDisassemble generates no
> error when an unknown blockName is encountered (that should be
> resolved). In fact, there seems to be little error checking in general,
> nor a "framework" for doing that. Of course I could just invoke error
> there, but for unknown instructions an exception is used... So there is
> no clear "error handling strategy".

Correct.

> 
> * Is there any particular reason why the Parameter type distinguishes
> between 8, 16 and 32 bit values? Naively, it would seem kInt and kUInt
> are sufficient for all needs. Mind you, I am not necessarily suggesting
> to abolish those, I am just wondering whether it is important for a
> disassembler to use those correctly -- i.e. does anything outside of my
> own engine depend on these distinctions in any way? This is not clear
> to me from the docs, and in the code I only see getUnsigned & getSigned
> methods returning an int, so...

The only thing outside your engine that depends on the parameter type is the CFG output, and that's only to provide a special case for kString (to escape the string for dot output). They're pretty much there for completeness, in case you want to use that metadata somewhere in your engine-specific code. 

When it comes to actually getting the value back out, though, Boost.Variant requires me to get it by specifying the exact type I stored (There are other ways, but that becomes much more complicated to both read and write, so I decided against those...). To keep the interface simple, I only distinguish between signed and unsigned (so you don't need a getInt, getByte, getWord, getDWord, etc.); the value itself doesn't change because you use more space to store it.

Michael