[Scummvm-cvs-logs] SF.net SVN: scummvm:[51944] tools/branches/gsoc2010-decompiler/decompiler/ doc

Mon Aug 9 21:41:22 CEST 2010

Revision: 51944
          http://scummvm.svn.sourceforge.net/scummvm/?rev=51944&view=rev
Author:   pidgeot
Date:     2010-08-09 19:41:21 +0000 (Mon, 09 Aug 2010)

Log Message:
-----------
DECOMPILER: Replace \verb in documentation

Modified Paths:
--------------
    tools/branches/gsoc2010-decompiler/decompiler/doc/cfg.tex
    tools/branches/gsoc2010-decompiler/decompiler/doc/codegen.tex
    tools/branches/gsoc2010-decompiler/decompiler/doc/disassembler.tex
    tools/branches/gsoc2010-decompiler/decompiler/doc/engine.tex
    tools/branches/gsoc2010-decompiler/decompiler/doc/overview.tex
    tools/branches/gsoc2010-decompiler/decompiler/doc/preamble.tex

Modified: tools/branches/gsoc2010-decompiler/decompiler/doc/cfg.tex
===================================================================

--- tools/branches/gsoc2010-decompiler/decompiler/doc/cfg.tex	2010-08-09 19:29:15 UTC (rev 51943)
+++ tools/branches/gsoc2010-decompiler/decompiler/doc/cfg.tex	2010-08-09 19:41:21 UTC (rev 51944)
@@ -12,19 +12,19 @@
 
 Calls to in-script functions are not represented with edges in the graph. This is done to keep functions separate from one another, so if your engine uses a jump as part of calling functions, you need to make sure you have given that particular jump the type kCall.
 
-The first step is handled in the constructor, while the next three steps are handled by the \verb+createGroups()+ method. The last step is handled by the \verb+analyze+ method.
+The first step is handled in the constructor, while the next three steps are handled by the \code{createGroups()} method. The last step is handled by the \code{analyze} method.
 
 \subsection{Function detection}
 \label{sec:autofunc}
-Prior to grouping, the control flow can be used to detect new functions. This detection is automatically activated if the Engine method \verb+detectMoreFuncs+ returns true.
+Prior to grouping, the control flow can be used to detect new functions. This detection is automatically activated if the Engine method \code{detectMoreFuncs} returns true.
 
 When function detection is enabled, unreachable blocks of code will be treated as functions, unless the presumed entry point is located within the range of another function. The end point of the function will then be the last instruction reachable from the entry point.
 
-Functions detected this way will be given the name \verb+auto_+. You can use this as a prefix to the actual name to signify that the function may not actually be a function, or you can ignore it and just replace it with the name you would normally use.
+Functions detected this way will be given the name \code{auto\_}. You can use this as a prefix to the actual name to signify that the function may not actually be a function, or you can ignore it and just replace it with the name you would normally use.
 
-You can also have the function detection determine the end point of an existing function. To do so, \verb+_endIt+ must be the same as \verb+_startIt+ for that function. In this case, only the end point will be changed within the function; the name will stay the same.
+You can also have the function detection determine the end point of an existing function. To do so, \code{\_endIt} must be the same as \code{\_startIt} for that function. In this case, only the end point will be changed within the function; the name will stay the same.
 
-Note that the control flow analysis has no way of determining an appropriate name, number of arguments, return value, or metadata. You will have to fill that in yourself using \verb+postCFG+ in the engine.
+Note that the control flow analysis has no way of determining an appropriate name, number of arguments, return value, or metadata. You will have to fill that in yourself using \code{postCFG} in the engine.
 
 If this step is not enabled, and no functions have been defined before the control flow analysis is started, there will still be added a single function covering the entire script. This is done to avoid having a special case in the code, and it will not affect the output of your script in any way.
 
@@ -55,15 +55,15 @@
 \subsection{Construct detection}
 After the groups have been created, we then analyze the groups to find the various kind of control flow constructs. The constructs are detected in multiple steps, with one construct per step, in the following order:
 \begin{itemize}
-\item \verb+do-while+
-\item \verb+while+
-\item \verb+break+
-\item \verb+continue+
-\item \verb+if+
-\item \verb+else+
+\item \code{do-while}
+\item \code{while}
+\item \code{break}
+\item \code{continue}
+\item \code{if}
+\item \code{else}
 \end{itemize}
 
-Each of these five constructs are marked using a \verb+GroupType+ member of the \verb+Group+ type, while \verb+else+ blocks are flagged using two booleans, \verb+_startElse+ and \verb+_endElse+. If \verb+_startElse+ is true, then an \verb+else+ block starts with this group, and should be output prior to the code in this group. If \verb+_endElse+ is true, an \verb+else+ block ends with this group, and the end should be output after the code in this group.
+Each of these five constructs are marked using a \code{GroupType} member of the \code{Group} type, while \code{else} blocks are flagged using two booleans, \code{\_startElse} and \code{\_endElse}. If \code{\_startElse} is true, then an \code{else} block starts with this group, and should be output prior to the code in this group. If \code{\_endElse} is true, an \code{else} block ends with this group, and the end should be output after the code in this group.
 
 Once a group has been flagged as being some construct, it is skipped for the other constructs.
 
@@ -73,24 +73,24 @@
 Group must end with a conditional jump (i.e., have two outgoing edges). Jump must go to an earlier place in the code.
 
 \paragraph{While detection}
-Group must end with conditional jump. Block must have an ingoing edge from some group later in the code, unless that edge comes from a do-while condition (in which case it is assumed to be an \verb+if+ instead).
+Group must end with conditional jump. Block must have an ingoing edge from some group later in the code, unless that edge comes from a do-while condition (in which case it is assumed to be an \code{if} instead).
 
 \paragraph{Break detection}
-Unconditional jump to some place later in the code. That place must either be the group immediately after a \verb+do-while+ condition, or the jump target of a \verb+while+ condition. Additionally, the jump is verified to go to the appropriate loop (so it does not exit multiple loops at once).
+Unconditional jump to some place later in the code. That place must either be the group immediately after a \code{do-while} condition, or the jump target of a \code{while} condition. Additionally, the jump is verified to go to the appropriate loop (so it does not exit multiple loops at once).
 
 \paragraph{Continue detection}
-Unconditional jump to a \verb+while+ or \verb+do-while+ condition, unless it is targeting a \verb+while+ condition which jumps to the next sequential group (in which case it is merely the end of the \verb+while+-loop). Just as with \verb+break+s, the jump is verified to go to the appropriate loop.
+Unconditional jump to a \code{while} or \code{do-while} condition, unless it is targeting a \code{while} condition which jumps to the next sequential group (in which case it is merely the end of the \code{while}-loop). Just as with \code{break}s, the jump is verified to go to the appropriate loop.
 
 \paragraph{If detection}
-All unflagged conditional jumps are flagged as \verb+if+s.
+All unflagged conditional jumps are flagged as \code{if}s.
 
 \paragraph{Else detection}
-All \verb+if+s are processed to see if they may have an \verb+else+ attached. If the jump target of an \verb+if+ is immediately preceded by an unconditional jump, which is neither a break or a continue, and that jump goes to later in the code, this signifies a possible \verb+else+ block, starting with the jump target of the if condition and ending with the group immediately before the target of the jump immediately before the jump target of the if condition. To avoid false positives, this block is then validated to not cross block boundaries. If the check passes, the \verb+else+ block data is added to the graph.
+All \code{if}s are processed to see if they may have an \code{else} attached. If the jump target of an \code{if} is immediately preceded by an unconditional jump, which is neither a break or a continue, and that jump goes to later in the code, this signifies a possible \code{else} block, starting with the jump target of the if condition and ending with the group immediately before the target of the jump immediately before the jump target of the if condition. To avoid false positives, this block is then validated to not cross block boundaries. If the check passes, the \code{else} block data is added to the graph.
 
 \subsection{Graph output}
-The graph can be output in DOT format by using the \verb+-g+ switch. In the graph, arrows on edges will be hollow if the edge is a jump, and filled if the edge represents the usual sequential order.
+The graph can be output in DOT format by using the \code{-g} switch. In the graph, arrows on edges will be hollow if the edge is a jump, and filled if the edge represents the usual sequential order.
 
 \subsection{Limitations}
-Currently, only unconditional jumps are supported for \verb+break+ and \verb+continue+; however, for code of the form \verb+if (...) break;+ or \verb+if (...) continue;+, it is a pretty straight-forward optimization to use the \verb+break+/\verb+continue+ jump as the conditional jump for the \verb+if+ condition check. Since \verb+if+s are found last, it should be possible to simply check unmarked conditional jumps as well and see if they meet the other criteria for a \verb+break+/\verb+continue+, although there might be some false positives for an if that stretches to the end of the loop it is placed in.
+Currently, only unconditional jumps are supported for \code{break} and \code{continue}; however, for code of the form \code{if (...) break;} or \code{if (...) continue;}, it is a pretty straight-forward optimization to use the \code{break}/\code{continue} jump as the conditional jump for the \code{if} condition check. Since \code{if}s are found last, it should be possible to simply check unmarked conditional jumps as well and see if they meet the other criteria for a \code{break}/\code{continue}, although there might be some false positives for an if that stretches to the end of the loop it is placed in.
 
-It is currently assumed that all conditional jumps in \verb+if+ condition checks go to a later place in the code. If optimized continue statements are used in a while (as described above), this will cause the analysis to be incorrect.
+It is currently assumed that all conditional jumps in \code{if} condition checks go to a later place in the code. If optimized continue statements are used in a while (as described above), this will cause the analysis to be incorrect.
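
Note on the construct detection described in cfg.tex above: the ordering of the passes (do-while first, then while, with if last) can be illustrated with a small stand-alone sketch. This is not the decompiler's code. The names SketchGroup, GroupKind and detectConstructs are hypothetical, groups are reduced to an index in sequential order plus an optional jump target, and the break, continue and else passes are left out for brevity.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the GroupType values mentioned in the text.
enum GroupKind { kNormal, kDoWhileCond, kWhileCond, kIfCond };

// Hypothetical, heavily simplified group: position in the vector is the
// sequential order of the code.
struct SketchGroup {
    bool condJump;  // true if the group ends with a conditional jump
    int target;     // index of the group jumped to, or -1 if there is no jump
    GroupKind kind; // construct detected so far
};

// One pass per construct, in the order given in the documentation.
// A group that has been flagged is skipped by the later passes.
void detectConstructs(std::vector<SketchGroup> &groups) {
    const int n = static_cast<int>(groups.size());

    // Pass 1: do-while. Conditional jump going to an earlier place.
    for (int i = 0; i < n; ++i) {
        SketchGroup &g = groups[i];
        assert(g.target < n); // sketch precondition: targets are valid indices
        if (g.kind == kNormal && g.condJump && g.target >= 0 && g.target < i)
            g.kind = kDoWhileCond;
    }

    // Pass 2: while. Conditional jump with an ingoing edge from a later
    // group, ignoring edges that come from a do-while condition.
    for (int i = 0; i < n; ++i) {
        SketchGroup &g = groups[i];
        if (g.kind != kNormal || !g.condJump)
            continue;
        for (int j = i + 1; j < n; ++j) {
            if (groups[j].target == i && groups[j].kind != kDoWhileCond) {
                g.kind = kWhileCond;
                break;
            }
        }
    }

    // Pass 3: if. Every conditional jump that is still unflagged.
    for (SketchGroup &g : groups) {
        if (g.kind == kNormal && g.condJump)
            g.kind = kIfCond;
    }
}
```

Running the passes in this order is what lets the final pass be a catch-all: anything that could have been a loop condition has already been claimed by the time the if pass runs.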

Modified: tools/branches/gsoc2010-decompiler/decompiler/doc/codegen.tex
===================================================================
--- tools/branches/gsoc2010-decompiler/decompiler/doc/codegen.tex	2010-08-09 19:29:15 UTC (rev 51943)
+++ tools/branches/gsoc2010-decompiler/decompiler/doc/codegen.tex	2010-08-09 19:41:21 UTC (rev 51944)
@@ -12,29 +12,29 @@
 This process is repeated for each function.
 
 \subsection{Function signature}
-For each function in the script, the \verb+constructFuncSignature+ method is called. By default, this will return the empty string, and this will cause the code to be output "freely", i.e. without anything surrounding it. If a non-empty string is returned, a \verb+}+ will be added after all of the code in the function.
+For each function in the script, the \code{constructFuncSignature} method is called. By default, this will return the empty string, and this will cause the code to be output "freely", i.e. without anything surrounding it. If a non-empty string is returned, a \code{\}} will be added after all of the code in the function.
 
 If your engine uses methods, you will want to override this method to output your own signature.
 
-\emph{Note:} You must currently include a \verb+{+ at the end of your signature.
+\emph{Note:} You must currently include a \code{\{} at the end of your signature.
 
 \subsection{Group processing}
-During processing of a group, the instructions in the group are processed one at a time. Certain kinds of instructions can be handled by generic code, while others must be handled by engine-specific code in the \verb+processInst+ method of your subclass.
+During processing of a group, the instructions in the group are processed one at a time. Certain kinds of instructions can be handled by generic code, while others must be handled by engine-specific code in the \code{processInst} method of your subclass.
 
-If you need to access information about the group currently being processed, use the member variable \verb+_curGroup+.
+If you need to access information about the group currently being processed, use the member variable \code{\_curGroup}.
 
 \subsection{The stack and stack entries}
 When generating the code, a stack is used to represent the state of the system. When data is pushed on the stack, a stack entry describing how that data was created is added; when data is popped, a stack entry describing the popped data is removed.
 
-To manipulate the stack, use the \verb+push+ and \verb+pop+ methods to push or pop stack entries. Unlike the STL stack, \verb+pop+ returns the value being popped from the stack, so you do not have to first get the top element and then pop it afterwards, but you can still call the \verb+peek+ method if you just want to look at the topmost element without removing it. Additionally, it has an \verb+empty+ method to check if the stack is empty.
+To manipulate the stack, use the \code{push} and \code{pop} methods to push or pop stack entries. Unlike the STL stack, \code{pop} returns the value being popped from the stack, so you do not have to first get the top element and then pop it afterwards, but you can still call the \code{peek} method if you just want to look at the topmost element without removing it. Additionally, it has an \code{empty} method to check if the stack is empty.
 
-Some engines require you to look further down the stack than just the topmost element. You can use the \verb+peekPos+ method to retrieve an element at an arbitrary position in the stack. This method takes an integer containing the number of stack entries to skip, i.e. passing the value 0 will give you the topmost element, while passing the value 2 will give you the third value on the stack.
+Some engines require you to look further down the stack than just the topmost element. You can use the \code{peekPos} method to retrieve an element at an arbitrary position in the stack. This method takes an integer containing the number of stack entries to skip, i.e. passing the value 0 will give you the topmost element, while passing the value 2 will give you the third value on the stack.
 
-\emph{Note:} \verb+peekPos+ accesses the underlying STL container (\verb+std::deque+) using the \verb+at+ function, which will throw an exception if the stack does not contain enough elements.
+\emph{Note:} \code{peekPos} accesses the underlying STL container (\code{std::deque}) using the \code{at} function, which will throw an exception if the stack does not contain enough elements.
 
-When working with entries, you should use the \verb+EntryPtr+ type. This wraps the entry in a \verb+boost::intrusive_ptr+ to free the associated memory when it is no longer referenced.
+When working with entries, you should use the \code{EntryPtr} type. This wraps the entry in a \code{boost::intrusive\_ptr} to free the associated memory when it is no longer referenced.
 
-Some stack entries contain references to an arbitrary number of stack entries. This is handled using an STL \verb+deque+, typedef'ed as \verb+EntryList+.
+Some stack entries contain references to an arbitrary number of stack entries. This is handled using an STL \code{deque}, typedef'ed as \code{EntryList}.
 
 Stack entries can be categorized into 9 different types:
 
@@ -42,7 +42,7 @@
 Integers can use up to 32-bits, and be signed or unsigned. When creating an integer, you must specify its value and whether or not it is signed. This also contains additional methods to extract the value and signedness of the value, which may be of use in some situations.
 
 \paragraph{Variables (VarEntry)}
-Variables are stored as a simple string. Subclasses of \verb+CodeGenerator+ must implement their own logic to determine a suitable variable name when given a reference.
+Variables are stored as a simple string. Subclasses of \code{CodeGenerator} must implement their own logic to determine a suitable variable name when given a reference.
 
 \paragraph{Binary operations (BinaryOpEntry)}
 Binary operations stores the two stack entries used as operands, and a string containing the operator. Parenthesis are automatically added around all binary operations to preserve the proper evaluation order.
@@ -51,7 +51,7 @@
 Just like binary operations, except only a single operand is stored. Note: Currently, the operator will always be output on the left side of the operand.
 
 \paragraph{Duplicated entries (DupEntry)}
-Stores an index to distinguish between multiple duplicated entries. This index is automatically assigned and determined when calling the \verb+dup+ function to duplicate a stack entry.
+Stores an index to distinguish between multiple duplicated entries. This index is automatically assigned and determined when calling the \code{dup} function to duplicate a stack entry.
 
 \paragraph{Array entries (ArrayEntry)}
 Array entries are stored as a simple string containing the name of the array, and an EntryList of stack entries used as the indices, with the first element in the EntryList being output as the first index.
@@ -65,16 +65,16 @@
 \paragraph{Function calls (CallEntry)}
 Function calls have the same underlying storage types as an array entry, but the output is formatted like a function call instead of an array access.
 
-Each entry type knows how to output itself to an \verb+std::ostream+ supplied as a parameter to the \verb+print+ function, and the common base class \verb+StackEntry+ also overloads the \verb+<<+ operator so any stack entry can be streamed directly to an output stream using that function.
+Each entry type knows how to output itself to an \code{std::ostream} supplied as a parameter to the \code{print} function, and the common base class \code{StackEntry} also overloads the \code{<<} operator so any stack entry can be streamed directly to an output stream using that function.
 
 \subsection{Outputting code}
-When processing certain kinds of instructions, you will probably want to create a line of code as part of the output. To do that, call \verb+addOutputLine+ with a string containing the code you wish to output as an argument. This will then be associated with the group being processed.
+When processing certain kinds of instructions, you will probably want to create a line of code as part of the output. To do that, call \code{addOutputLine} with a string containing the code you wish to output as an argument. This will then be associated with the group being processed.
 
-If your line of code deals with control flow, you will probably want to do something about the indentation. You can supply two extra boolean arguments to \verb+addOutputLine+ to state that the indentation should be decreased before outputting this line, and/or that the indentation should be increased for lines output after this line. If you leave out these arguments, no extra indentation is added.
+If your line of code deals with control flow, you will probably want to do something about the indentation. You can supply two extra boolean arguments to \code{addOutputLine} to state that the indentation should be decreased before outputting this line, and/or that the indentation should be increased for lines output after this line. If you leave out these arguments, no extra indentation is added.
 
 Note: This indent handling is currently considered a temporary solution until there is time to implement something better. It may be replaced with a different form of indentation handling at a later time.
 
-You will usually need to output assignments at some point. For that, you can use the \verb+writeAssignment+ method to generate an assignment statement. \verb+writeAssignment+ takes two parameters, the first being the stack entry representing the left-hand side of the assignment operator, and the second being the stack entry representing the right-hand side of the operator.
+You will usually need to output assignments at some point. For that, you can use the \code{writeAssignment} method to generate an assignment statement. \code{writeAssignment} takes two parameters, the first being the stack entry representing the left-hand side of the assignment operator, and the second being the stack entry representing the right-hand side of the operator.
 
 \subsection{Default instruction handling and instruction metadata}
 When disassembling, you can store metadata for a given instruction to be used during code generation.
@@ -85,39 +85,39 @@
 The topmost stack entry is popped, and two duplicated copies are pushed to the stack. If the entry being duplicated was not already a duplicate, an assignment will be output to assign the original stack entry to a special dup variable, to show that the original entry is not being recalculated.
 
 \paragraph{kUnaryOp}
-The topmost stack entry is popped, and a \verb+UnaryOpEntry+ is created and pushed to the stack, using the codegen metadata as the operator, and the previously popped entry as the operand. Note: currently, a \verb+UnaryOpEntry+ only supports placing the operator on the left side of the operand.
+The topmost stack entry is popped, and a \code{UnaryOpEntry} is created and pushed to the stack, using the codegen metadata as the operator, and the previously popped entry as the operand. Note: currently, a \code{UnaryOpEntry} only supports placing the operator on the left side of the operand.
 
 \paragraph{kBinaryOp and kComparison}
-The two topmost stack entries are popped, and a BinaryOpEntry is created and pushed to the stack, using the codegen metadata as the operator and the previously popped entries as the operands. The order of the operands is determined by the value of the field \verb+_binOrder+, as described in Section~\vref{sec:argOrder}.
+The two topmost stack entries are popped, and a BinaryOpEntry is created and pushed to the stack, using the codegen metadata as the operator and the previously popped entries as the operands. The order of the operands is determined by the value of the field \code{\_binOrder}, as described in Section~\vref{sec:argOrder}.
 
 \paragraph{kCondJump and kCondJumpRel}
-The instruction is sent for processing in the engine-specific \verb+processInst+ method, so you can add any information provided by the specific opcode. The information on the stack is then read by the default code, and an if, while or do-while condition is output using the topmost stack entry.
+The instruction is sent for processing in the engine-specific \code{processInst} method, so you can add any information provided by the specific opcode. The information on the stack is then read by the default code, and an if, while or do-while condition is output using the topmost stack entry.
 
 \paragraph{kJump and kJumpRel}
 If the current group has been detected as a break or a continue, a break or continue statement is output. Otherwise, the jump is analyzed and output unless it is a jump back to the condition of a while-loop that ends there, or it is determined that the jump is unnecessary due to an else block following immediately after.
 
 \paragraph{kReturn}
-This simply adds a line \verb+return;+ to the output.
+This simply adds a line \code{return;} to the output.
 
-\emph{Note:} The default handling does not currently allow specifying a return value as part of the statement, as in \verb+return 0;+.
+\emph{Note:} The default handling does not currently allow specifying a return value as part of the statement, as in \code{return 0;}.
 
 \paragraph{kSpecial}
-The metadata is treated similar to parameter specifications in \verb+SimpleDisassembler+ (see Section~\vref{sec:simpledisasm}). If the specification string starts with the character \verb+r+, this signifies that the call returns a value, and processing starts at the next character.
-For each character in the metadata string, \verb+processSpecialMetadata+ is called with the instruction being processed, and the current metadata character to be handled. The default implementation only understands the character \verb+p+, which pops an argument from the stack and adds it to the argument list.
+The metadata is treated similar to parameter specifications in \code{SimpleDisassembler} (see Section~\vref{sec:simpledisasm}). If the specification string starts with the character \code{r}, this signifies that the call returns a value, and processing starts at the next character.
+For each character in the metadata string, \code{processSpecialMetadata} is called with the instruction being processed, and the current metadata character to be handled. The default implementation only understands the character \code{p}, which pops an argument from the stack and adds it to the argument list.
 Once the metadata string has been processed fully, then an entry representing the function call is pushed to the stack if the call returns a value. Otherwise, the call is added to the output.
 
-You can override the \verb+processSpecialMetadata+ method to add your own specification characters, just like you would override \verb+readParameter+ in \verb+SimpleDisassembler+. Use the \verb+addArg+ method to add arguments.
+You can override the \code{processSpecialMetadata} method to add your own specification characters, just like you would override \code{readParameter} in \code{SimpleDisassembler}. Use the \code{addArg} method to add arguments.
 
-Due to the conflict with the specification of a return value, it is recommended that you do not adopt \verb+r+ as a metadata character.
+Due to the conflict with the specification of a return value, it is recommended that you do not adopt \code{r} as a metadata character.
 
 \paragraph{Other types}
-No default handling exists for types other than those mentioned above. These instructions will be sent to the \verb+processInst+ method of your subclass, where you must handle them appropriately. This includes types like \verb+kLoad+ and \verb+kStore+.
+No default handling exists for types other than those mentioned above. These instructions will be sent to the \code{processInst} method of your subclass, where you must handle them appropriately. This includes types like \code{kLoad} and \code{kStore}.
 
-Note that this also includes \verb+kCall+. Although many engines might want to handle this in a manner similar to kSpecial opcodes, this is left to the engine-specific code so they can fully make sense of the metadata they choose to add to the function.
+Note that this also includes \code{kCall}. Although many engines might want to handle this in a manner similar to kSpecial opcodes, this is left to the engine-specific code so they can fully make sense of the metadata they choose to add to the function.
 
 \subsection{Order of arguments}
 \label{sec:argOrder}
-The generic handling of binary operators (kBinaryOp, kComparison) and magic functions (kSpecial) can be configured to display their arguments using FIFO or LIFO - respectively, the first and the last entry to be pushed onto the stack is used as the first (leftmost) argument. This is set as part of the constructor for the \verb+CodeGenerator+ class, using the enumeration values \verb+kFIFO+ and \verb+kLIFO+.
+The generic handling of binary operators (kBinaryOp, kComparison) and magic functions (kSpecial) can be configured to display their arguments using FIFO or LIFO - respectively, the first and the last entry to be pushed onto the stack is used as the first (leftmost) argument. This is set as part of the constructor for the \code{CodeGenerator} class, using the enumeration values \code{kFIFO} and \code{kLIFO}.
 
 To provide an example, consider the following sequence of instructions:
 
@@ -129,8 +129,8 @@
 \end{lstlisting}
 \end{bytecode}
 
-This can mean two different things, either \verb+a - b+ or \verb+b - a+, depending on the order in which the operands should be evaluated. For the former, choose FIFO ordering, for the latter, choose LIFO.
+This can mean two different things, either \code{a - b} or \code{b - a}, depending on the order in which the operands should be evaluated. For the former, choose FIFO ordering, for the latter, choose LIFO.
 
-For arguments to function calls, the same principle applies. You can use the \verb+addArg+ method to add an argument to the call currently being processed, using the chosen ordering.
+For arguments to function calls, the same principle applies. You can use the \code{addArg} method to add an argument to the call currently being processed, using the chosen ordering.
 
 In general, you might not know which ordering is more correct for function arguments; unless you have reason to believe otherwise, simply use the same ordering as for binary operators.
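
Note on the stack handling and argument ordering described in codegen.tex above: the following stand-alone sketch shows a stack whose pop returns the removed value, a peekPos that relies on std::deque::at for bounds checking, and how FIFO and LIFO ordering change the output of a binary operation. It is an illustration under assumptions, not the decompiler's implementation: SketchStack and makeBinaryOp are hypothetical names, entries are plain strings rather than EntryPtr values, the ArgOrder enum is defined locally (only the value names kFIFO and kLIFO come from the documentation), and the instruction sequence is assumed to push a, then push b, then subtract.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <stdexcept>
#include <string>

// Local stand-in for the ordering enumeration named in the text.
enum ArgOrder { kFIFO, kLIFO };

// Hypothetical simplified stack; index 0 of the deque is the topmost entry.
class SketchStack {
public:
    void push(const std::string &entry) { _items.push_front(entry); }

    // Unlike std::stack::pop, this returns the entry that was removed.
    std::string pop() {
        assert(!_items.empty());
        std::string top = _items.front();
        _items.pop_front();
        return top;
    }

    const std::string &peek() const { return _items.front(); }

    // skip = 0 is the topmost entry, skip = 2 the third entry.
    // std::deque::at throws std::out_of_range if the stack is too small.
    const std::string &peekPos(std::size_t skip) const { return _items.at(skip); }

    bool empty() const { return _items.empty(); }

private:
    std::deque<std::string> _items;
};

// Pops two operands and builds a parenthesized binary expression.
// FIFO: the entry pushed first becomes the left operand.
// LIFO: the entry pushed last becomes the left operand.
std::string makeBinaryOp(SketchStack &stack, const std::string &op, ArgOrder order) {
    std::string pushedLast = stack.pop();
    std::string pushedFirst = stack.pop();
    if (order == kFIFO)
        return "(" + pushedFirst + " " + op + " " + pushedLast + ")";
    return "(" + pushedLast + " " + op + " " + pushedFirst + ")";
}
```

With a pushed before b, kFIFO yields (a - b) and kLIFO yields (b - a), matching the two readings described in the text.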

Modified: tools/branches/gsoc2010-decompiler/decompiler/doc/disassembler.tex
===================================================================
--- tools/branches/gsoc2010-decompiler/decompiler/doc/disassembler.tex	2010-08-09 19:29:15 UTC (rev 51943)
+++ tools/branches/gsoc2010-decompiler/decompiler/doc/disassembler.tex	2010-08-09 19:41:21 UTC (rev 51944)
@@ -3,7 +3,7 @@
 The purpose of the disassembler is to read instructions from a script file and convert them to a common, machine-readable form for further analysis.
 
 \subsection{Instructions}
-Instructions are represented using the \verb+Instruction+ struct.
+Instructions are represented using the \code{Instruction} struct.
 
 \begin{C++}
 \begin{lstlisting}
@@ -21,13 +21,13 @@
 
 Each member of this struct has a specific purpose:
 \begin{itemize}
-\item \verb+_opcode+ is used to store the numeric opcode associated with the instruction. This is not used by the decompiler itself, but is for your reference during later parts of the decompilation process. Note that this field is declared as a 32-bit integer; if you need more than 4 bytes for your opcodes, you will need to figure out which bytes you want to store if you want to use this field.
-\item \verb+_address+ stores the absolute memory address where this instruction would be loaded into memory.
-\item \verb+_stackChange+ stores the net change of executing this instruction - for example, if the instruction pushes a byte on to the stack, this should be set to 1. This is used to determine when each statement ends. The count can be in any unit you wish - bytes, words, bits - as long as the same unit is used for all instructions. This means that if your stack only works with 16-bit elements, pushing an 8-bit value and pushing a 16-bit value should have the same net effect on the stack.
-\item \verb+_name+ contains the name of the instruction. This is mainly for use during code generation.
-\item \verb+_type+ represent the type of instruction. See Section~\vref{sec:insttype} for details.
-\item \verb+_params+ contains the parameters given to the instruction - for example, if you have the instruction \verb+PUSH 1+, there would be one parameter, with the value of 1. See Section~\vref{sec:parameter} for details on the Parameter type.
-\item \verb+_codeGenData+ stores metadata to be used during code generation. For details, see Section~\vref{sec:codegen}.
+\item \code{\_opcode} is used to store the numeric opcode associated with the instruction. This is not used by the decompiler itself, but is for your reference during later parts of the decompilation process. Note that this field is declared as a 32-bit integer; if you need more than 4 bytes for your opcodes, you will need to figure out which bytes you want to store if you want to use this field.
+\item \code{\_address} stores the absolute memory address where this instruction would be loaded into memory.
+\item \code{\_stackChange} stores the net change of executing this instruction - for example, if the instruction pushes a byte on to the stack, this should be set to 1. This is used to determine when each statement ends. The count can be in any unit you wish - bytes, words, bits - as long as the same unit is used for all instructions. This means that if your stack only works with 16-bit elements, pushing an 8-bit value and pushing a 16-bit value should have the same net effect on the stack.
+\item \code{\_name} contains the name of the instruction. This is mainly for use during code generation.
+\item \code{\_type} represents the type of instruction. See Section~\vref{sec:insttype} for details.
+\item \code{\_params} contains the parameters given to the instruction - for example, if you have the instruction \code{PUSH 1}, there would be one parameter, with the value of 1. See Section~\vref{sec:parameter} for details on the Parameter type.
+\item \code{\_codeGenData} stores metadata to be used during code generation. For details, see Section~\vref{sec:codegen}.
 \end{itemize}
 
 If some instructions do not have a fixed effect on the stack--that is, the instruction name alone does not determine the effect on the stack--set the field to some easily recognizable value when doing the disassembly. You can then determine the correct value in a post-processing step after the code flow analysis.
@@ -38,22 +38,22 @@
 
 This is particularly important during code flow analysis; since this part is engine-independent, the analysis must have some way of distinguishing the different types of instructions. Additionally, this information can be used during code generation to generalize the recognition of constructs--for example, the code generated for addition and the code generated for multiplication will generally be identical, with the exception of that single arithmetic instruction doing the work.
 
-Most of the types are self-explanatory, with the possible exception of \verb+kSpecial+. \verb+kSpecial+ should be used for all "magic functions"--opcodes that perform some function specific to the engine, like playing a sound, drawing a graphic, or saving the game.
+Most of the types are self-explanatory, with the possible exception of \code{kSpecial}. \code{kSpecial} should be used for all "magic functions"--opcodes that perform some function specific to the engine, like playing a sound, drawing a graphic, or saving the game.
 
 \subsection{Parameters}
 \label{sec:parameter}
-Parameters are stored using a tagged union - one field (\verb+_type+) tells you which data type is being stored, and another field (\verb+_value+) stores the actual value.
+Parameters are stored using a tagged union - one field (\code{\_type}) tells you which data type is being stored, and another field (\code{\_value}) stores the actual value.
 
-Three convenience methods are provided to extract the value, \verb+getSigned+, \verb+getUnsigned+ and \verb+getString+. Please note: if an incorrect method is called, an exception is thrown.
+Three convenience methods are provided to extract the value: \code{getSigned}, \code{getUnsigned} and \code{getString}. Please note: if you call a method that does not match the stored type, an exception is thrown.
 
 Although there are only 3 get methods, there are 7 different parameter types. This additional distinction is intended for you to use as you see fit, in case it is useful as metadata somewhere in your engine-specific code.
 
-If you need to store different types than those already allowed, add the new type to the list of type parameters for the \verb+_value+ field and add another enumeration value to \verb+ParamType+. You should make the new type \emph{output streamable}--that is, allow it to be used like \verb+std::cout << value+. This allows the value to be output directly to an output stream regardless of its type.
+If you need to store different types than those already allowed, add the new type to the list of type parameters for the \code{\_value} field and add another enumeration value to \code{ParamType}. You should make the new type \emph{output streamable}--that is, allow it to be used like \code{std::cout << value}. This allows the value to be output directly to an output stream regardless of its type.
 
-Note: When storing 8 or 16-bit unsigned values in the \verb+_value+ field, cast them to an \verb+uint32+ when doing the assignment, or you will not be able to extract the value using \verb+getUnsigned+. This is a limitation caused by the automatic type conversion algorithm used by C++.
+Note: When storing 8- or 16-bit unsigned values in the \code{\_value} field, cast them to a \code{uint32} when doing the assignment, or you will not be able to extract the value using \code{getUnsigned}. This is a limitation caused by the automatic type conversion algorithm used by C++.
 
 \subsection{The Disassembler class}
-All disassemblers must inherit, directly or indirectly, from the \verb+Disassembler+ class. This is an abstract class providing an interface for disassemblers.
+All disassemblers must inherit, directly or indirectly, from the \code{Disassembler} class. This is an abstract class providing an interface for disassemblers.
 
 \begin{C++}
 \begin{lstlisting}
@@ -78,25 +78,25 @@
 \end{lstlisting}
 \end{C++}
 
-\verb+_f+ represents the file you will be reading from. The file is opened using the \verb+open+ function.
+\code{\_f} represents the file you will be reading from. The file is opened using the \code{open} function.
 
-\verb+_insts+ is an \verb+std::vector+ storing the instructions. Whenever you have read an instruction fully, add it here.
+\code{\_insts} is a \code{std::vector} storing the instructions. Whenever you have read an instruction fully, add it here.
 
-\verb+_addressBase+ is provided as a convenience if your engine does not consider the first instruction to be located at address 0. Assign the expected base address to this field, and make sure that the addresses you assign to the instructions are relative to this base address. This is mainly useful if your engine supports jumps or other references to absolute addresses in the script; if only relative addresses are used, the base address will not be relevant.
+\code{\_addressBase} is provided as a convenience if your engine does not consider the first instruction to be located at address 0. Assign the expected base address to this field, and make sure that the addresses you assign to the instructions are relative to this base address. This is mainly useful if your engine supports jumps or other references to absolute addresses in the script; if only relative addresses are used, the base address will not be relevant.
 
-\verb+_disassemblyDone+ is used to represent whether or not disassembly has already been performed. This is set automatically by \verb+disassemble+ and  \verb+dumpDisassembly+.
+\code{\_disassemblyDone} is used to represent whether or not disassembly has already been performed. This is set automatically by \code{disassemble} and \code{dumpDisassembly}.
 
-\verb+doDisassemble+ is the method used to perform the actual disassembly, so this method must be implemented by all disassemblers.
+\code{doDisassemble} is the method used to perform the actual disassembly, so this method must be implemented by all disassemblers.
 
-\verb+disassemble+ simply calls the \verb+doDisassemble+ method to perform the disassembly (if necessary), and returns \verb+_insts+ to the calling method.
+\code{disassemble} simply calls the \code{doDisassemble} method to perform the disassembly (if necessary), and returns \code{\_insts} to the calling method.
 
-Finally, \verb+dumpDisassembly+ is used to output the instructions in a human-readable format to a file or stdout, performing a disassembly first if required, and then calls \verb+doDumpDisassembly+ to perform the actual output. A default implementation is provided for \verb+doDumpDisassembly+, but you can override it if the standard output format is not suitable for your particular engine.
+Finally, \code{dumpDisassembly} outputs the instructions in a human-readable format to a file or stdout. It performs a disassembly first if required, and then calls \code{doDumpDisassembly} to perform the actual output. A default implementation is provided for \code{doDumpDisassembly}, but you can override it if the standard output format is not suitable for your particular engine.
 
 \subsection{The SimpleDisassembler class}
 \label{sec:simpledisasm}
-To simplify the development of disassemblers, another base class is provided for instruction sets where instructions are of the format \verb+opcode [params]+, with opcode and parameters stored in distinct bytes.
+To simplify the development of disassemblers, another base class is provided for instruction sets where instructions are of the format \code{opcode [params]}, with opcode and parameters stored in distinct bytes.
 
-\verb+SimpleDisassembler+ defines a number of macros which you can use for writing your disassembler, and provides a framework for reading instruction parameters.
+\code{SimpleDisassembler} defines a number of macros which you can use for writing your disassembler, and provides a framework for reading instruction parameters.
 
 Following is a guide on how to implement a disassembler using this class as its base class. The instruction set used for this example is described in Table~\vref{tbl:simple_disasm_example}. While not a very useful instruction set, it covers many different aspects.
 
@@ -105,23 +105,23 @@
 \begin{tabular}{c | c | c}
 Instruction & Parameters & Description \\
 \hline
-\verb+PUSH+ (0x00) & uint8 & Pushes byte onto the stack.\\
-\verb+POP+ (0x01) & &  Pops a byte from the stack. \\
-\verb+PUSH2+ (0x02) & int16 & Pushes value onto the stack.\\
-\verb+POP2+ (0x03) & &  Pops two bytes from the stack. \\
-\verb+PRINT+ (0x80) & C string & Prints string to standard output. \\
-\verb+HALT+ (0xFF 0x00) & & Stops the machine.
+\code{PUSH} (0x00) & uint8 & Pushes byte onto the stack.\\
+\code{POP} (0x01) & &  Pops a byte from the stack. \\
+\code{PUSH2} (0x02) & int16 & Pushes value onto the stack.\\
+\code{POP2} (0x03) & &  Pops two bytes from the stack. \\
+\code{PRINT} (0x80) & C string & Prints string to standard output. \\
+\code{HALT} (0xFF 0x00) & & Stops the machine.
 \end{tabular}
 \caption{Instruction set used in the SimpleDisassembler example.}
 \label{tbl:simple_disasm_example}
 \end{table}
 
-For the purpose of this example, our instruction set will use little-endian values, and uses byte elements for the stack (so \verb+POP+ changes the stack pointer by 1 and \verb+POP2+ changes it by 2).
+For the purpose of this example, our instruction set uses little-endian values and byte elements for the stack (so \code{POP} changes the stack pointer by 1 and \code{POP2} changes it by 2).
 
 \subsubsection{Opcode recognition}
-The first thing to do in the \verb+doDisassemble+ method is to read past any header which may be present in your script file. We will assume that our bytecode files do not have a header.
+The first thing to do in the \code{doDisassemble} method is to read past any header which may be present in your script file. We will assume that our bytecode files do not have a header.
 
-You must place your opcodes between two macros, \verb+START_OPCODES+ and \verb+END_OPCODES+. These two macros define the looping required to read one byte at a time.
+You must place your opcodes between two macros, \code{START\_OPCODES} and \code{END\_OPCODES}. These two macros define the looping required to read one byte at a time.
 
 \begin{C++}
 \begin{lstlisting}
@@ -130,7 +130,7 @@
 \end{lstlisting}
 \end{C++}
 
-To define an opcode, use the \verb+OPCODE+ macro. This macro takes 5 parameters: the opcode value, the name of the instruction, the type of instruction, the net effect on the stack, and a string describing the parameters that are part of the instruction. We will start by implementing the \verb+POP+ and \verb+POP2+ opcodes:
+To define an opcode, use the \code{OPCODE} macro. This macro takes 5 parameters: the opcode value, the name of the instruction, the type of instruction, the net effect on the stack, and a string describing the parameters that are part of the instruction. We will start by implementing the \code{POP} and \code{POP2} opcodes:
 
 \begin{C++}
 \begin{lstlisting}
@@ -141,10 +141,10 @@
 \end{lstlisting}
 \end{C++}
 
-The \verb+OPCODE+ macro automatically stores the full opcode in the \verb+_opcode+ field of the generated \verb+Instruction+.
+The \code{OPCODE} macro automatically stores the full opcode in the \code{\_opcode} field of the generated \code{Instruction}.
 
 \subsubsection{Parameter reading}
-\verb+PUSH+, \verb+PUSH2+ and \verb+PRINT+ all take parameters as part of the instruction. To read these, you must specify them as part of the parameter string, using one character per parameter. The types understood by default are specified in Table~\vref{tbl:paramtypes}.
+\code{PUSH}, \code{PUSH2} and \code{PRINT} all take parameters as part of the instruction. To read these, you must specify them as part of the parameter string, using one character per parameter. The types understood by default are specified in Table~\vref{tbl:paramtypes}.
 
 \begin{table}[!hpbt]
 \centering
@@ -168,7 +168,7 @@
 
 To help you remember these meanings, little-endian values are encoded using lower case ("small letters", i.e. little), while big-endian values are encoded using upper case ("big" letters). The exception here is a single byte, since endianness has no effect for individual bytes. Here, the mnemonic is that an unsigned byte ("B") has a larger maximum value. For the other letters, "s" was used because it is the first letter in "short", which is usually a 16-bit signed value in C. Similarly, "i" is short for "int". "w" and "d" come from the terms "word" and "dword", which are terms for 16-bit and 32-bit unsigned types on an x86 platform.
 
-Note that strings are not supported by default. To add reading of a string type, you can override the \verb+readParameter+ function to add your own types:
+Note that strings are not supported by default. To add reading of a string type, you can override the \code{readParameter} function to add your own types:
 
 \begin{C++}
 \begin{lstlisting}
@@ -180,7 +180,7 @@
 		std::stringstream s;
 		while ((cmd = _f.readByte()) != 0) {
 			s << cmd;
 			_address++;
 		}
 		s << '"';
 		p->_type = kString;
@@ -194,7 +194,7 @@
 \end{lstlisting}
 \end{C++}
 
-Note that you will have to increment the \verb+_address+ variable manually when you read a byte. This variable is used to determine the address of the instruction, and must be kept in sync with your progress reading the file.
+Note that you will have to increment the \code{\_address} variable manually when you read a byte. This variable is used to determine the address of the instruction, and must be kept in sync with your progress reading the file.
 
 Now, we can add all three opcodes to the list:
 
@@ -211,11 +211,11 @@
 \end{C++}
 
 \subsubsection{Multi-byte opcodes}
-There is only one opcode left to add, \verb+HALT+. This one is a bit trickier, because it uses multiple bytes for the opcode - and the \verb+OPCODE+ macro only works for one byte at a time.
+There is only one opcode left to add, \code{HALT}. This one is a bit trickier, because it uses multiple bytes for the opcode - and the \code{OPCODE} macro only works for one byte at a time.
 
-To solve this, you can define \emph{subopcodes}. By defining 0xFF as the start of a multi-byte opcode, we can then specify 0x00 as representing a \verb+HALT+ instruction when it follows 0xFF.
+To solve this, you can define \emph{subopcodes}. By defining 0xFF as the start of a multi-byte opcode, we can then specify 0x00 as representing a \code{HALT} instruction when it follows 0xFF.
 
-Defining 0xFF is easily done using the \verb+START_SUBOPCODE+ macro. After that, specify the opcodes for this following byte, and finish the subopcode declarations with \verb+END_SUBOPCODE+.
+Defining 0xFF is easily done using the \code{START\_SUBOPCODE} macro. After that, specify the opcodes for this following byte, and finish the subopcode declarations with \code{END\_SUBOPCODE}.
 
 \begin{C++}
 \begin{lstlisting}
@@ -232,21 +232,21 @@
 \end{lstlisting}
 \end{C++}
 
-Subopcodes can be nested if the instruction set requires it. For subopcodes, the \verb+_opcode+ field stores the bytes in the order they appear in the file - i.e., the HALT instruction would have the opcode value 0xFF00. If the opcodes are longer than 4 bytes, only the last 4 bytes will be stored.
+Subopcodes can be nested if the instruction set requires it. For subopcodes, the \code{\_opcode} field stores the bytes in the order they appear in the file - i.e., the HALT instruction would have the opcode value 0xFF00. If the opcodes are longer than 4 bytes, only the last 4 bytes will be stored.
 
-If all opcodes in a group of subopcodes share a prefix, you can use the \verb+START_SUBOPCODE_WITH_PREFIX+ macro instead of \verb+START_SUBOPCODE+. This macro takes an additional string parameter containing the full prefix to use for the opcodes associated with this subopcode. The prefix is not propagated if you nest subopcodes, only the nearest prefix is used.
+If all opcodes in a group of subopcodes share a prefix, you can use the \code{START\_SUBOPCODE\_WITH\_PREFIX} macro instead of \code{START\_SUBOPCODE}. This macro takes an additional string parameter containing the full prefix to use for the opcodes associated with this subopcode. The prefix is not propagated if you nest subopcodes; only the nearest prefix is used.
 
 \subsubsection{Code generation metadata}
-For each opcode, you will need to replicate its semantics during code generation. To assist you in generalizing your code, you can use the \verb+OPCODE_MD+ macro to add metadata to the instruction, which is then available during code generation.
+For each opcode, you will need to replicate its semantics during code generation. To assist you in generalizing your code, you can use the \code{OPCODE\_MD} macro to add metadata to the instruction, which is then available during code generation.
 
 For example, if you have an opcode for addition, you can store the addition operator as a string in the metadata field, and have that put to use during code generation to avoid having to check the opcode for each instruction of that type.
 
-The arguments for the \verb+OPCODE_MD+ are the same as those for \verb+OPCODE+, but with an extra parameter at the end for the metadata.
+The arguments for the \code{OPCODE\_MD} macro are the same as those for \code{OPCODE}, but with an extra parameter at the end for the metadata.
 
 \begin{C++}
 \begin{lstlisting}
 START_OPCODES;
 	OPCODE_MD(0x14, "add", kBinaryOp, -1, "", "+");
 END_OPCODES;
 \end{lstlisting}
 \end{C++}
@@ -256,7 +256,7 @@
 \subsubsection{Advanced opcode handling}
 If you have one or two opcodes that do not quite fit into the framework provided, you can define your own specialized handling for these opcodes.
 
-Instead of using the \verb+OPCODE+ macro, put your code between \verb+OPCODE_BASE+ and \verb+OPCODE_END+. For example, if your opcode has the value 0x40, you would use this:
+Instead of using the \code{OPCODE} macro, put your code between \code{OPCODE\_BASE} and \code{OPCODE\_END}. For example, if your opcode has the value 0x40, you would use this:
 
 \begin{C++}
 \begin{lstlisting}
@@ -266,6 +266,6 @@
 \end{lstlisting}
 \end{C++}
 
-\verb+OPCODE_BASE+ automatically keeps track of the current opcode value. You can access \verb+full_opcode+ to get the current full opcode. Alternatively, you can use the \verb+OPCODE_BODY+ macro to use the standard behavior for opcodes, and then follow that with the additional code you want. The \verb+OPCODE_BODY+ macro takes the same arguments as the \verb+OPCODE_MD+ macro.
+\code{OPCODE\_BASE} automatically keeps track of the current opcode value. You can access \code{full\_opcode} to get the current full opcode. Alternatively, you can use the \code{OPCODE\_BODY} macro to use the standard behavior for opcodes, and then follow that with the additional code you want. The \code{OPCODE\_BODY} macro takes the same arguments as the \code{OPCODE\_MD} macro.
 
-For your convenience, a few additional macros are available: \verb+ADD_INST+, which adds an empty instruction to the vector, and \verb+LAST_INST+ which retrieves the last instruction in the vector. Additionally, you can use \verb+INC_ADDR+ as a shorthand for incrementing the address variable by 1, but note that you should \emph{not} increment the address for the opcode itself - this is handled by the other macros.
+For your convenience, a few additional macros are available: \code{ADD\_INST}, which adds an empty instruction to the vector, and \code{LAST\_INST} which retrieves the last instruction in the vector. Additionally, you can use \code{INC\_ADDR} as a shorthand for incrementing the address variable by 1, but note that you should \emph{not} increment the address for the opcode itself - this is handled by the other macros.
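The tagged-union parameter storage described in the Parameters subsection, and the reason for the explicit \code{uint32} cast, can be illustrated with a minimal sketch. It assumes C++17 and uses std::variant as a stand-in; the variant type used by the decompiler itself is not shown in this diff, and the trimmed-down Parameter struct, the three-value ParamType enum, and the two helper functions below are hypothetical names introduced only for illustration.

```cpp
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical stand-in for the Parameter type described in the text.
// The real type has seven ParamType values and may use another variant class.
enum ParamType { kSigned, kUnsigned, kString };

struct Parameter {
	ParamType _type;
	std::variant<int32_t, uint32_t, std::string> _value;

	// Each getter throws std::bad_variant_access if another type is stored.
	int32_t getSigned() const { return std::get<int32_t>(_value); }
	uint32_t getUnsigned() const { return std::get<uint32_t>(_value); }
	const std::string &getString() const { return std::get<std::string>(_value); }
};

// Stores an 8-bit unsigned value with the explicit cast the note asks for.
inline Parameter makeUnsignedByte(uint8_t raw) {
	Parameter p;
	p._type = kUnsigned;
	p._value = static_cast<uint32_t>(raw);
	return p;
}

// Returns true if getUnsigned() throws for a byte stored without the cast.
inline bool uncastByteFailsGetUnsigned(uint8_t raw) {
	Parameter p;
	p._type = kUnsigned;
	p._value = raw; // integral promotion selects the int32_t alternative
	try {
		p.getUnsigned();
	} catch (const std::bad_variant_access &) {
		return true;
	}
	return false;
}
```

In this sketch, assigning a uint8_t directly makes overload resolution prefer the signed 32-bit alternative, because promotion to int ranks above conversion to unsigned int. This is the kind of automatic type conversion the note warns about, and the explicit cast in makeUnsignedByte avoids it.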

Modified: tools/branches/gsoc2010-decompiler/decompiler/doc/engine.tex
===================================================================
--- tools/branches/gsoc2010-decompiler/decompiler/doc/engine.tex	2010-08-09 19:29:15 UTC (rev 51943)
+++ tools/branches/gsoc2010-decompiler/decompiler/doc/engine.tex	2010-08-09 19:41:21 UTC (rev 51944)
@@ -1,27 +1,27 @@
 \section{Engine}
-The \verb+Engine+ class represent a single engine. It works as a factory for the engine-specific classes required for each step of the process.
+The \code{Engine} class represents a single engine. It works as a factory for the engine-specific classes required for each step of the process.
 
 As a minimum, engines must provide a disassembler and a code generator. All other steps are optional, but you can implement them for additional processing.
 
 If you need to store metadata about the script, you can add the necessary fields to your engine class and store the information there, as the same instance will be used throughout the decompilation process.
 
 \subsection{Adding a new engine}
-In order to make the decompiler use the code you write to decompile code for some engine, it must be registered in the program. To do so, include the header file for your engine in \verb+decompiler.cpp+, and use the \verb+ENGINE+ macro defined there to register your engine with the program.
+In order to make the decompiler use the code you write to decompile code for some engine, it must be registered in the program. To do so, include the header file for your engine in \code{decompiler.cpp}, and use the \code{ENGINE} macro defined there to register your engine with the program.
 
-This macro takes 3 parameters: the engine ID, a description of the engine, and the name of the \verb+Engine+ subclass used to create the classes used for the various steps of the process. The ID is entered by the user to signify the engine where the script originates from, and the description is a descriptive text which will be shown when the user requests a list of the supported engines. In general, you should place the files for your engine in a folder with the same name as the engine ID you use.
+This macro takes 3 parameters: the engine ID, a description of the engine, and the name of the \code{Engine} subclass used to create the classes used for the various steps of the process. The ID is entered by the user to signify the engine where the script originates from, and the description is a descriptive text which will be shown when the user requests a list of the supported engines. In general, you should place the files for your engine in a folder with the same name as the engine ID you use.
 
-The methods you need to implement in your \verb+Engine+ subclass are:
+The methods you need to implement in your \code{Engine} subclass are:
 \begin{itemize}
-\item \verb+getDisassembler+, which takes a reference to the instruction vector to use for storage and creates a disassembler object and returns it. For more on disassemblers, see Section~\vref{sec:disassembler}.
-\item \verb+getCodeGenerator+, which takes a reference to the \verb+std::ostream+ to output the code to and creates a code generator object and returns it. For more on code generators, see Section~\vref{sec:codegen}.
-\item \verb+getDestAddress+, which takes a const iterator to a jump instruction as a parameter and returns the address the instruction will jump to if the jump is taken. Unless you do differently in your engine-specific code, this function will only receive jumps as input, so if you can take a shortcut based on that, you are allowed to do that.
+\item \code{getDisassembler}, which takes a reference to the instruction vector to use for storage and creates a disassembler object and returns it. For more on disassemblers, see Section~\vref{sec:disassembler}.
+\item \code{getCodeGenerator}, which takes a reference to the \code{std::ostream} to output the code to and creates a code generator object and returns it. For more on code generators, see Section~\vref{sec:codegen}.
+\item \code{getDestAddress}, which takes a const iterator to a jump instruction as a parameter and returns the address the instruction will jump to if the jump is taken. Unless you do differently in your engine-specific code, this function will only receive jumps as input, so if you can take a shortcut based on that, you are allowed to do that.
 \end{itemize}
 
 Additional methods you can override are:
 \begin{itemize}
-\item \verb+supportsCodeFlow+ and \verb+supportsCodeGen+, which can be used to stop the decompiler from going any further after disassembly or code flow analysis, respectively. This is helpful when working on a brand new engine, so you can take one step at a time without having to remember to use the right command-line switch. If you do not override these methods, the decompiler will go through all steps.
-\item \verb+detectMoreFuncs+ allows you to tell the control flow analysis to automatically detect functions based on reachability. See Section~\vref{sec:autofunc} for details. By default, this is turned off; engines must opt-in to this feature.
-\item \verb+postCFG+, which is a post-processing step called after control flow analysis. If you override \verb+detectMoreFuncs+ to return true, you must also override this function to process any newly found functions. A default implementation which does nothing is already provided in case you do not need to do any post-processing.
+\item \code{supportsCodeFlow} and \code{supportsCodeGen}, which can be used to stop the decompiler from going any further after disassembly or code flow analysis, respectively. This is helpful when working on a brand new engine, so you can take one step at a time without having to remember to use the right command-line switch. If you do not override these methods, the decompiler will go through all steps.
+\item \code{detectMoreFuncs} allows you to tell the control flow analysis to automatically detect functions based on reachability. See Section~\vref{sec:autofunc} for details. By default, this is turned off; engines must opt in to this feature.
+\item \code{postCFG}, which is a post-processing step called after control flow analysis. If you override \code{detectMoreFuncs} to return true, you must also override this function to process any newly found functions. A default implementation which does nothing is already provided in case you do not need to do any post-processing.
 \end{itemize}
 
 It is important to realize that you do not necessarily need to implement a completely new code generator and disassembler for every engine; for variations on the same engine, you can reuse the existing classes and simply send in any extra information required. In particular, code generators are likely to be reusable without change for different versions of the same engine - e.g., the Kyra2 code generator will likely work for all Kyra games.
@@ -29,7 +29,7 @@
 \subsection{Game information}
 For some engines, it may not be enough to know the engine; some instructions may differ in behavior between different games or variants of a game, for example between talkie or non-talkie versions, or between versions for different platforms.
 
-The \verb+Engine+ class contains a field \verb+_isTalkie+ which is set to true if the user passed in the \verb+-t+ switch on the command line. You can check this flag in your engine-specific code if necessary.
+The \code{Engine} class contains a field \code{\_isTalkie} which is set to true if the user passed in the \code{-t} switch on the command line. You can check this flag in your engine-specific code if necessary.
 
 In the interest of user friendliness, if the necessary data exists directly in the script file itself, you should use that instead of requiring additional switches to be passed.
 
@@ -38,7 +38,7 @@
 \subsection{Functions}
 Some engines allow multiple functions in a single script file. Each function must be analyzed separately, but in order to do that, it is of course necessary to know where the functions start and end, and when it is time to actually generate some code, you will want to know a bit about the function as well.
 
-This information is stored in the engine, as a \verb+std::map+ of \verb+Function+s, in the field \verb+ _functions+.
+This information is stored in the engine, as a \code{std::map} of \code{Function}s, in the field \code{\_functions}.
 
 \begin{C++}
 \begin{lstlisting}
@@ -57,13 +57,13 @@
 
 Each member of this struct has a specific purpose:
 \begin{itemize}
-\item \verb+_startIt+ is a const iterator pointing to the first instruction in the function, i.e. the entry point.
-\item \verb+_endIt+ is a const iterator pointing to the instruction immediately after the last instruction, similar to \verb+end()+ on STL collections.
-\item \verb+_name+ is the name of the functions.
-\item \verb+_v+ is the GraphVertex containing the entry point. This will be automatically assigned in the control flow analysis.
-\item \verb+_args+ is the number of arguments for the function. This is present as a convenience; usually, you will not know the names of the arguments in the function, so you can store the number of them here and use this during code generation to generate a method signature containing some default parameter names.
-\item \verb+_retVal+ should be true if your method returns a value, and false if it does not. Again, this is for your own convenience, to make it easier to handle calls.
-\item \verb+_metadata+ contains metadata about the function, so you know how to handle the arguments when the function is being called. It is up to you how you want to use this.
+\item \code{\_startIt} is a const iterator pointing to the first instruction in the function, i.e. the entry point.
+\item \code{\_endIt} is a const iterator pointing to the instruction immediately after the last instruction, similar to \code{end()} on STL collections.
+\item \code{\_name} is the name of the function.
+\item \code{\_v} is the GraphVertex containing the entry point. This will be automatically assigned in the control flow analysis.
+\item \code{\_args} is the number of arguments for the function. This is present as a convenience; usually, you will not know the names of the arguments in the function, so you can store the number of them here and use this during code generation to generate a method signature containing some default parameter names.
+\item \code{\_retVal} should be true if your method returns a value, and false if it does not. Again, this is for your own convenience, to make it easier to handle calls.
+\item \code{\_metadata} contains metadata about the function, so you know how to handle the arguments when the function is being called. It is up to you how you want to use this.
 \end{itemize}
 
 When you add a function to the map, you must use the address of the first instruction as the key.
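As a rough illustration of the Functions subsection, the sketch below registers a function in a map keyed by the address of its first instruction and builds a signature with default parameter names from the argument count. The Instruction and Function structs here are trimmed-down, hypothetical stand-ins (the real definitions are elided from this diff), and addFunction, signatureOf, and the "param" naming scheme are assumptions made for the example.

```cpp
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, trimmed-down stand-ins for the decompiler's own types.
struct Instruction {
	uint32_t _address;
	std::string _name;
};

typedef std::vector<Instruction>::const_iterator ConstInstIterator;

struct Function {
	ConstInstIterator _startIt; // first instruction (entry point)
	ConstInstIterator _endIt;   // one past the last instruction
	std::string _name;
	uint32_t _args;             // number of arguments
	bool _retVal;               // true if the function returns a value
};

typedef std::map<uint32_t, Function> FuncMap;

// Registers [start, end) as a function, keyed by the address of the first
// instruction. start must point at a valid instruction.
inline void addFunction(FuncMap &functions, ConstInstIterator start, ConstInstIterator end,
                        const std::string &name, uint32_t args, bool retVal) {
	Function f = {start, end, name, args, retVal};
	functions.insert(std::make_pair(start->_address, f));
}

// Builds a signature with default parameter names from the argument count.
inline std::string signatureOf(const Function &f) {
	std::ostringstream s;
	s << (f._retVal ? "value " : "void ") << f._name << "(";
	for (uint32_t i = 0; i < f._args; ++i) {
		if (i != 0)
			s << ", ";
		s << "param" << (i + 1);
	}
	s << ")";
	return s.str();
}
```

Keying the map by start address means a call instruction's destination address can be used to look up the callee directly during code generation.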

Modified: tools/branches/gsoc2010-decompiler/decompiler/doc/overview.tex
===================================================================
--- tools/branches/gsoc2010-decompiler/decompiler/doc/overview.tex	2010-08-09 19:29:15 UTC (rev 51943)
+++ tools/branches/gsoc2010-decompiler/decompiler/doc/overview.tex	2010-08-09 19:41:21 UTC (rev 51944)
@@ -10,7 +10,7 @@
 Of these steps, the code flow analysis is engine-independent, while disassembly and code generation require engine-specific code.
 
 \subsection{Reading guide}
-Names used in code are written in a \verb+monospaced typewriter font+.
+Names used in code are written in a \code{monospaced typewriter font}.
 
 Actual code snippets have basic syntax highlighting on a light gray background, and lines are numbered, like below:
 

Modified: tools/branches/gsoc2010-decompiler/decompiler/doc/preamble.tex
===================================================================
--- tools/branches/gsoc2010-decompiler/decompiler/doc/preamble.tex	2010-08-09 19:29:15 UTC (rev 51943)
+++ tools/branches/gsoc2010-decompiler/decompiler/doc/preamble.tex	2010-08-09 19:41:21 UTC (rev 51944)
@@ -123,3 +123,5 @@
 \newcommand\B{\rule[-1.4ex]{0pt}{0pt}}
 
 %\setlength{\parindent}{0mm}
+
+\newcommand{\code}[1]{\texttt{#1}}

