[Scummvm-devel] Decompiler: Documentation feedback

Mon Aug 9 18:55:05 CEST 2010

Am 09.08.2010 um 17:58 schrieb Michael Madsen:

> (Sorry if I missed something - there's a lot of stuff in there, so something might have slipped by the radar.)
> 
>> -----Original Message-----
>> From: Max Horn [mailto:max at quendi.de]
>> Sent: Monday, August 09, 2010 2:30 PM
>> Cc: 'ScummVM devel'
>> Subject: Re: [Scummvm-devel] Decompiler: Documentation feedback
>> 
>> LaTeX style remarks:
>> * Consider using pdflatex instead of latex+dvips+ps2pdf.
>> * It would be nice to have \usepackage{hyperref} in there; that way
>> references become hyperlinks, and the resulting pdf (if using pdflatex)
>> gets a TOC. Very handy. You may have to put that after variorref,
>> though.
> 
> hyperref does not play nice with the DVI->PS->PDF route (linebreaks don't work), so that's why it's not there. It would be easy to add that back in if switching to pdflatex (which probably won't be any problem at all, I'm just used to using the "long" route).

Well, I already suggested (in private communication) to switch to pdflatex ;), soo... :)

> 
> The main advantage over not going straight to PDF is that if you use graphics, going to DVI forces you to use EPS (vector graphics), while pdflatex allows bitmaps, which (being raster images) don't scale well. Of course, seeing that there aren't any graphics in there at this point, I guess it doesn't hurt to change it, so I can do that later today.
I am not sure I get your point here. Are you saying you prefer to go via DVI, because it prevents you from using raster/pixel graphics? That seems like a rather weak reason (you could simply not rasterized graphics with pdflatex, too ;).

[...]

>> * Please don't use:
>>  \verb+ _functions+
>> Instead, define your own macro to mark things as code, e.g.
>>  \newcommand{\code}[1]{\texttt{#1}}
>> If you also want the argument to be not hyphenated, then that can also
>> be achieved cleanly, w/o abusing \verb.
> 
> If having small bits of raw, inline code isn't an appropriate use of \verb, what *is* an appropriate use of \verb? \texttt will certainly get the same visual layout, but now you have to think about escaping as well, and I've been taught that's why \verb is a better choice for these things.

If you were quoting code involving special characters (e.g. HTML tags like "<table>") , and not just identifiers, I would understand the use of \verb (although I still would consider a custom, \verb like macro better -- just like I consider using hyperref's \url, which is a \verb-like macro, better for URls than \verb). But for identifiers only, essentially the only things you get from using \verb is (a) it switches to the typewriter font, and (b) it disables hyphenation.

Most importantly, using a macro also serves to separate the semantics from the presentation/style. If we wanted to switch to printing code in an orange calligraphic font, we'd then have to modify only one place, namely that macro. Right now, we'd have to track down all uses of \verb, and decide whether it is used for quoting code, or something else.
Moreover, \verb has many subtle and dangerous effects. E.g. it is a so-called fragile command, meaning you can't use it in section headers and certain other places.

[...]

>> 
>> Back to the docs. It might be a good idea to create a simple (fake)
>> decompiler as a running example. Of course that's even more work, but
>> would be helpful. As a somewhat less expensive solution, you could at
>> least point out what the corresponding files / classes for the Kyra /
>> ScummV6 engine are, so that one at least knows where to quickly find
>> some real world examples. (Again, this is not *that* difficult, but if
>> one argues that one can find out about things by reading the code, then
>> one can quickly argue that documentation and comments are pointless in
>> general... ;)
> 
> Well, you have to draw the line *somewhere*.

Sure. 

[...]

> 
>> Non-docs-remark: This is something that affects many engines in one way
>> or another, in particular the SCUMM engine. E.g. for v3: You have to
>> specify whether the data format is blocked or unblocked (so I *hope* it
>> will be possible to auto-detect that), whether it is Indy3-256 or
>> Zak256 (FM-TOWNS version).
> 
> I don't know enough about v3 to say one way or the other about that. If it can't be autodetected from the file itself,

Those differences affect only certain opcodes and in general are very hard to detect, or impossible in some cases; an automatic "guesser" would be possible, but if it guesses wrong, the user should be able to override it.

> but can be restricted to specific platforms, then the platform switch (see below) would be usable there; otherwise you would either need another subclass of Engine.

Which would be identical to the first subclass, except for a handful of opcodes :).

Not that it would be too difficult, at least for the disassembler i am not using the SimpleDisassembler, which has the nice advantage that I can put the opcode handling in a separate method, which is called from a loop in doDisassemble(); I can then overload that function in a subclass, handle only the "special" opcodes there, and otherwise invoke the parent class' implementation.

I guess it would be possible to modify SimpleDisassembler a bit to achieve a similar thing there, if one wanted to.

> You could add an engine-specific switch, but (and I admit I may be missing something here...) Boost.ProgramOptions does not appear to provide a good way of doing so, unless you're just going to send a string (or an array of strings) into the engine/disassembler/code generator and let it deal with them however it wants.

Hm, that's of course not that pretty either. But then having a "talkie" command line switch in a decompiler seems not so pretty to me, either -- no offense intended, I just mean that both feel quite hackish to me. Now I read that you also are thinking about adding a platform option.

Hrm. How about this alternative suggestion then: Every engine class can (optionally) provide a list of "variants" or "subengines". I.e. add a new method which returns a (possibly empty) list of strings. Such as "pc", "amiga", "talkie", "amiga-talkie", "pc-demo". The "-l" option would print next to each engine the list of subengines (each could have a custom description text). The user could *optionally* specify a subengine via a new command line flag.

This way, you can combine both the platform selector and the talkie bit, as well as cover the SCUMM cases I have in mind. Of course, this starts to get ugly if you have lots of independent flags that can occur in multiple combinations: If e.g. a game comes in 
 | {pc, amiga, mac} x {talkie, non-talkie} x {demo, non-demo} | = 12 variants, it would be nicer to have three separate flags, instead of 12 combined "subengine" names (also for the user, who would dislike having to remember if it was "demo-pc-talkie" or "talkie-pc-demo" or any of the other 4 possible orderings).

For SCUMM, this hypothetical issue does not exist. How about Kyra, I wonder?

> 
>> 
>> * Is there any particular reason why the Parameter type distinguishes
>> between 8, 16 and 32 bit values? Naively, it would seem kInt and kUInt
>> are sufficient for all needs. Mind you, I am not necessarily suggesting
>> to abolish those, I am just wondering whether it is important for a
>> disassembler to use those correctly -- i.e. does anything outside of my
>> own engine depend on these distinctions in any way? This is not clear
>> to me from the docs, and in the code I only see getUnsigned & getSigned
>> methods returning an int, so...
> 
> The only thing outside your engine that depends on the parameter type is the CFG output, and that's only to provide a special case for kString (to escape the string for dot output). They're pretty much there for completeness, in case you want to use that metadata somewhere in your engine-specific code. 

OK. Would be nice if the docs mentioned just that (either the LaTeX docs, or the doxygen comment of the class).

Keep up the good work!
Max