---
title: "Using an External Decoder"
layout: default
permalink: /page_external_decoder.html
---
The main benefit of switching away from an internal decoder is reduced maintenance: as new ISA extensions are added, we lack the developer resources to add decode/encode support for them to our own code.
Another desired benefit is ease of adding new architecture support: if we pick an external decoder that supports many architectures, we can get decode/encode support for free when porting.
Further benefits could include:
We would almost certainly keep DR’s IR which is intertwined in its client API. We would have to convert to and from an external IR which would add overhead. We would have to figure out:
- `OP_*` enumeration, especially when it may differ from the external decoder's opcodes: e.g., DR has `OP_jnb_short`, `OP_mov_st`, etc.
- `INSTR_CREATE_*` macros: we would have to write a script or even do it manually.

We would need these features from the external decoder:
XED could be considered on Intel, but if it takes significant effort, that effort may be better spent adding the features we need to a cross-architecture decoder.
While XED does seem to provide good support for x86 and all extensions, in particular AVX-512 including future extensions, it does not provide any other architectural support.
The LLVM decoder would give us the platform support we want, along with a way to implement assembly support, but it was not designed to be as lightweight as we’d like nor to be used separately from the compiler.
The LLVM decoder/encoder is tied to LLVM’s backend and MCInst. Every backend has its own implementation, with no generic abstraction: i.e., each backend has its own IR. There is thus no support to take advantage of multi-architecture support through IR abstractions. It might require a large project with community engagement to add this kind of support to LLVM.
One concern is missing opcodes: e.g., a decoder from Intel may not include every AMD opcode. Hidden non-public opcodes (such as int1, salc, and ffreep) may not be included. These may be deliberate omissions and adding them to the external decoder may not be accepted by the owners, forcing us to maintain our own extension. (Update: XED does include AMD opcodes and all the hidden opcodes we're aware of.)
Another concern is that the overhead, complexity, and especially time taken to connect an external decoder will never be amortized by enough future ISA extensions. Ignoring the addition of a new supported architecture and just considering x86, which is relevant to picking XED: sticking with DR’s existing decoder, we may only need to augment it once every 4 years or so. This augmentation typically takes just a few weeks. Hooking up one of these external decoders would be a large project likely taking months, while only saving those few weeks every N years. At that rate, it might take a decade or more to recoup the investment. (Update: Adding AVX-512 was a much larger project than prior ISA extensions such as AVX-2 and was more on the order of months than weeks, which might change the calculus here.)
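
The amortization argument above can be made concrete with a little arithmetic. The following is only an illustrative sketch: the helper names and the sample figures (a 12-week integration, 3 weeks saved per extension, one extension every 4 years) are assumptions chosen to match the rough magnitudes in the text, not measured data.

```c
#include <stdio.h>

/* Hypothetical helper: years until a one-time integration cost is repaid
 * by the per-extension effort it saves.
 *   integration_weeks:  one-time cost of hooking up the external decoder
 *   saved_weeks:        effort saved each time a new ISA extension arrives
 *   years_between_exts: average gap between ISA extensions
 */
double
breakeven_years(double integration_weeks, double saved_weeks,
                double years_between_exts)
{
    /* Number of extensions needed to repay the cost, times the gap. */
    return (integration_weeks / saved_weeks) * years_between_exts;
}

/* Prints the break-even point for two sets of assumed inputs. */
void
print_breakeven_examples(void)
{
    /* Assumed: 12 weeks to integrate, 3 weeks saved per extension,
     * one extension roughly every 4 years. */
    printf("small extensions: about %.0f years\n",
           breakeven_years(12.0, 3.0, 4.0));
    /* Assumed: an AVX-512-sized extension costing about 12 weeks by hand. */
    printf("large extensions: about %.0f years\n",
           breakeven_years(12.0, 12.0, 4.0));
}
```

Under these assumed inputs the first case breaks even only after well over a decade, while the second breaks even after a single extension cycle, which is why the size of future extensions matters so much to the decision.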