Recently I have talked about the project and everyone has been able to see progress in the new version of the XL Engine with GitHub. I have talked about work already done and even given some indications as to the amount of work remaining (i.e. all the stuff I should have done before the hiatus – but that’s another story). However I have mentioned working on “all games at the same time” and given a few details of the process – but I have left out many of the concrete details. To make things worse, progress on the game front is not visible since the game code is not in GitHub yet and honestly wouldn’t make much sense even if it was. As a result a lot of work gets done that no one sees yet. While I have my reasons for this decision, this and future posts will help fill in some of the missing details about the development process.
For this post I am going to talk about a specific part of the process – the process of going from disassembled code to C code that will be used by the XL Engine to replace the original executables. I make the following assumptions, which are true but will not go into detail until a future post: the games are fully disassembled and this code is stored in giant text files, the entry points for the 32 bit code are known for each and we limit ourselves to 32 bit code (16 bit code can be ignored for the most part). The engine functionality – such as file access, input, sound, graphics and so forth is assumed.
Originally the process was mostly manual. While this was valuable, allowed me to directly see the execution path, made it easy to see the current contents of memory and was very useful to see various patterns and understand the code. However there was a major problem – it just takes too long. While things moved faster over time since there are a finite set of patterns, even at the limit its still too slow. There are several options – reverse engineering the formats and building the functionality from observation and testing (like the original DarkXL/DaggerXL did and Interkarma is doing now) and/or decoding narrow areas of the code for tricky areas or building a tool set in order to help in the process of decompiling the entire executables and then replacing parts with Engine services (and later unified engine renderers). Since I decided to make these ports “source port” accurate, that really only leaves one good choice.
Why not use existing tools? Actually, for some things, I do. However automatic decompiling is a very difficult task and few tools are useful – I have tried many many of them. There are a few that are somewhat useful, but these tend to be very expensive. Fortunately, for my purposes, the scope is limited (DOS, specific types of games, 32 bit) – so I am better served by my own tools.
Currently the process consists of manual and tool assisted elements. I have talked about some of the manual processes in assumptions – finding the entry points and separating 32 bit and 16 bit code. I will talk about the rest of the manual processes later, now I’m going to talk about the tool assisted part. All tools discussed in this post are my own.
* Function Discovery Pass
* Function Grouping
* Function Name Generation
* Function Input Discovery
* Function Output Discovery
Local Function Passes
* Build Code Blocks: splits the function into code blocks. Roughly a code block represents a concept expressed as one or more instructions.
* Merge Code Blocks: apply merge passes until no more blocks can be merged. The idea is merge dependent blocks into larger blocks when possible.
* Process Code Blocks: update the local variable list, update the global variable list, use “hints” to help determine variable sizes and types, generate block code.
* Generate function code: function header, local variables, body code.
Global Code Generation
* Determine Global Variables used by each function group.
* Generate a header file for each group that define the following:
…. Global variables assigned to this group that are used by other groups.
…. Function headers for functions in this group used by other groups.
* Generate a source file for each group:
…. Write a reference to the group header file.
…. Write includes for external globals and functions from other groups.
…. Write global variables used only by this group (as “static”), include default values.
…. Write forward declarations for internal functions.
…. Write the code for each function.
The first task is to process the disassembled code, knowing the information discussed above. In addition there are various tables that are filled in manually over time – function names, variable names and similar data. The processing occurs over many automatic passes.
*The first pass is designed to discover functions, where the entry point address is, how long the function is – and must handle multiple return values. Starting from the entry point function, the tool traverses all possible execution paths in order to eliminate dead code. This will miss functions that are only referenced as function pointers, so this initial set of functions is written to a file that will be modified after additional processing (to add the missing functions).
*In the next pass, the tool groups functions based on locality – with the assumption that functions near each other in the executable are likely near each other in the original source (i.e. in the same object files and thus same source files). These groups may be refined later in the case that functions have debug code that indicate the source file name (something Daggerfall does in many functions). These groups will eventually become files, more on this later.
*Then a human readable name for the function is generated. If the function name is in the name table, meaning I have already assigned a name or know what it is, then that name is assigned. Otherwise a unique name is generated from the group and function index.
*Next the tool determines the function inputs (function arguments). It can determine if a register is used but not set or determine variables passed on the stack. Initially we don’t know the type of variable each input is, so we set it to “unknown” – which defaults to a 32 bit int (s32). Now when functions are called, we will be able to determine which arguments are required and which registers are used.
*Finally the function output / return value is determined. Usually return values, in these games anyway, are returned by register – so the tool detects registers that are set before return and used by the calling code. Returning values on the stack is also possible. Either way the output, if any, for each function is determined and recorded. Again the type is not yet known and defaults to 32 bit int.
Now we are ready to start the local function processing, which is its own series of passes per function.
Local Function Passes
Simply put, this process consists of two components – finding patterns and then mapping those patterns to code. The tool is responsible for finding the patterns and writing them out. I then take those patterns, figure out how to map them to C code and then the tool takes that data and does the grunt work of applying the mapping. This is an iterative process where each iteration improves both the results and the tool. There are a finite number of patterns, so work on any specific game helps them all overall.
Building Code Blocks
Blocks are generated by processing the “block end points”, where a block end point is one of the following:
* Value modification or assignment to a global or local variable.
* Function call
* Jump (jmp, ja, jz, jnz, etc).
* System call
* Label – if the preceeding instruction is NOT a block end point.
It should be obvious the code blocks cannot span jumps which account for, essentially, 2/3 of the list.
Once a block end point is reached – a new code block is generated but we need to figure out the dependencies. For functions, for example, this will be the function inputs. Since blocks are processed in order, we don’t have to worry about local or global variables. So for most blocks, dependencies are register assignments. To resolve a dependency, we must walk the code backwards until the register is assigned – all lines that contribute to resolving that dependency becomes a new dependent code block. That code block will oftentimes have dependencies of its own, so it too will have a dependent code block generated. This continues until all dependencies are resolved. Other blocks, unless they affect the dependencies, are skipped over during this process. If a dependency is modified by another block, then that block becomes a dependency and processing can be halted.
Blocks track the number of different blocks that depend on them – this will determine, later, if blocks are inlined or assigned to local variables. It also determines whether blocks can be merged later.
If a register is modified between two block end points and it is NOT a dependency, then it will be assigned as a local variable. In this case all local blocks are discarded and the Block processing is restarted for this function with the new information. In this way local variables that are assigned to registers – like loop counters – are handled properly.
Function outputs are also assigned to local variables so that dependencies in the future will depend on that variable rather than the function itself (we never call a function unless the original code explicitly did so).
Whenever a local or global variable is found and the instruction contains clues about it’s size – i.e. movsx; mov ax, var; etc. – modify the variable type attribute.
Merging Code Blocks
Code blocks can be merged if one is dependent ONLY on one other block. Neighboring code blocks are merged with the new block holding the dependency data for the parent block. Each pass tests the neighbors and merge passes continue until no additional merging occurred during the previous pass (this will usually result in 2 passes).
Processing Code Blocks
Once the blocks have been merged the functionality that they express must be discovered. First any local variables referenced are added to the local variable list for the function (if they do not already exist). Second any global variables are added to the global variable list. Any size and type hints are processed for the variables.
Finally the block extracts inputs – local variables, global variables, memory addresses and literals and replaces them with generic input markers (i0 through iN-1, where N = the number of inputs). Then a block key is generated from the resulting text. Finally the key is used to look up the block in a database of code blocks, adding it if it doesn’t exist. This database will be written to disk once processing is complete. If the entry exists it will contain the resulting C code with replacements for inputs. Memory addresses will be replaced by global variables. If the block does not have an entry then the C code is written out as “//Unknown entry #” where # is the block entry index but added to the database so that missing blocks can be added as they are hit.
For example, look at the following code block:
lea eax, [edx * 4]
lea ebx, [edx + eax]
shl ebx, 3
edx is a local variable in this case (call it local0 for example) and ebx is the output, internal registers are labled from t0 to tN-1 in order found, so this can be re-written as:
lea t0, [i0 * 4]
lea o0, [i0 + t0]
shl o0, 3
which has the following entry for C code:
i0 * 40
If only one block depends on this block, then whenever the block is required local0 * 40 will be substituted. If, only the other hand, more than one block depends on it then it will be assigned to a local variable which will be referenced instead:
s32 local1 = local0 * 40;
If the type of local0 is known then local1 will be given the same type – s32 is the default type.
I have not covered every step in detail but this should give you a much better idea of how the process is evolving. Clearly this isn’t the end of the story – a lot of the work is glossed over or not yet mentioned. But this post is long enough. I will continue to talk about the process in future posts.
If you want to help identify bugs that the XL Engine will need to fix, post them in the DOS Bugs section of the forums, each game has its own sub-forum. Don’t forget to read the sticky so you know what to post.