Difference between revisions of "JIT services"

From Nintendo Switch Brew
Line 54: Line 54:
 
This does a bunch of validation. Then eventually CodeMemory/TransferMemory is mapped, the above symbol funcptr is called, runs more validation, and unmaps CodeMemory/TransferMemory. On success, this runs cache operations. Then this returns.
 
  
The funcptr is called with the following params: x0 = s32* out, x1 = {ptr to output [[#CodeRange]] initialized with the input [[#CodeRange]] and with the second u64 cleared}, x2 = {ptr to output [[#CodeRange]] initialized with the input [[#CodeRange]] and with the second u64 cleared}, x3 = {ptr to struct on stack which is the same as the one used for the "nnjitpluginOnPrepared" symbol, except +0x30/+0x38 is set to data from state}, x4 = cmd input u64, x5 = InBuffer addr, x6 = InBuffer size, x7 = {ptr to input [[#CodeRange]]}, sp0 = {ptr to input [[#CodeRange]]}, sp8 = {ptr to input [[#Struct32]]}, sp16 = cmd input u32, sp24 = OutBuffer addr, sp32 = OutBuffer size.
+
The funcptr is called with the following params: x0 = s32* out, x1 = {ptr to output [[#CodeRange]] initialized with the input [[#CodeRange]] and with the second u64 cleared}, x2 = {ptr to output [[#CodeRange]] initialized with the input [[#CodeRange]] and with the second u64 cleared}, x3 = {ptr to struct on stack which is the same as the one used for the "nnjitpluginOnPrepared" symbol, except +0x30/+0x38 are set to the sysmodule map-addr for each CodeMemory (addr for the second CodeMemory is set properly)}, x4 = cmd input u64, x5 = InBuffer addr, x6 = InBuffer size, x7 = {ptr to input [[#CodeRange]]}, sp0 = {ptr to input [[#CodeRange]]}, sp8 = {ptr to input [[#Struct32]]}, sp16 = cmd input u32, sp24 = OutBuffer addr, sp32 = OutBuffer size.
  
 
The input/output [[#CodeRange]] structs are validated as follows, where stateval is the first/second CodeMemory [[#CreateJitEnvironment|size]] for the first/second [[#CodeRange]]:
 

Revision as of 17:37, 1 October 2020

JIT is a sysmodule for run-time code generation (allowing for overlapping R-X and RW- views of memory). This was added to retail with [10.0.0+]. This was also supported in sdknso for a number of versions prior.

nnMain just initializes ro:1, then starts hosting the service from the main-thread with max_sessions=1 (threads are not created for service-hosting).

This is intended to only be used by Applications. The service-init in sdknso just uses PrepareForJit at the start, then gets the service.

sdknso CreateJitEnvironment implements the remaining initialization. After some validation, this uses svcCreateCodeMemory (can be called twice). Then #CreateJitEnvironment is used. TransferMemory with a user-specified buffer is created with permissions=None, which is then used with #LoadPlugin. When successful, this lastly uses #GetCodeAddress.

This loads the user-specified NRO into sysmodule-context ("DllPlugin"), and calls various symbols from that NRO. The code writing (in cmd GenerateCode) is done via symbol-calling, allowing the NRO to handle input_buffer->code translation+writing.

jit:u

This is "nn::jitsrv::IJitService".

Cmd Name
0 #CreateJitEnvironment

CreateJitEnvironment

Takes two input u64s for the CodeMemory sizes, 3 input handles, returns an #IJitEnvironment.

The first handle is a Process handle, the rest are CodeMemory handles.

This essentially does state/object init and maps the CodeMemory regions in the user-process. The permission for the first CodeMemory is R-X; for the second CodeMemory it is R--. Both CodeMemory regions are mapped in the sysmodule with permissions RW-.

Both CodeMemory regions appear intended to be optional, however in practice both are required.

This copies the user-process CodeMemory addr/size for each CodeMemory to elsewhere in state, however it uses the first CodeMemory for the second CodeMemory state init as well. This is the same state used by #GenerateCode and #GetCodeAddress.

IJitEnvironment

This is "nn::jitsrv::IJitEnvironment".

Cmd Name
0 #GenerateCode
1 #Control
1000 #LoadPlugin
1001 #GetCodeAddress

GenerateCode

Takes a u32, a u64, a #CodeRange, a #CodeRange, a #Struct32, a type-0x5 input buffer, a type-0x6 output buffer, and returns an output s32, a #CodeRange, a #CodeRange.

An error is thrown if the funcptr in state for the "nnjitpluginGenerateCode" symbol is not set.

This does a bunch of validation. Then eventually CodeMemory/TransferMemory is mapped, the above symbol funcptr is called, runs more validation, and unmaps CodeMemory/TransferMemory. On success, this runs cache operations. Then this returns.

The funcptr is called with the following params: x0 = s32* out, x1 = {ptr to output #CodeRange initialized with the input #CodeRange and with the second u64 cleared}, x2 = {ptr to output #CodeRange initialized with the input #CodeRange and with the second u64 cleared}, x3 = {ptr to struct on stack which is the same as the one used for the "nnjitpluginOnPrepared" symbol, except +0x30/+0x38 are set to the sysmodule map-addr for each CodeMemory (addr for the second CodeMemory is set properly)}, x4 = cmd input u64, x5 = InBuffer addr, x6 = InBuffer size, x7 = {ptr to input #CodeRange}, sp0 = {ptr to input #CodeRange}, sp8 = {ptr to input #Struct32}, sp16 = cmd input u32, sp24 = OutBuffer addr, sp32 = OutBuffer size.
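The register/stack assignment above matches what the standard AArch64 procedure call convention gives for 13 integer/pointer arguments: the first eight go in x0-x7, the rest take consecutive 8-byte stack slots. A minimal sketch of that mapping follows. The argument names are hypothetical labels made up for illustration, and which of the two identically described #CodeRange pointers is the first or second is an assumption.

```python
def assign_argument_slots(arg_names):
    """Map integer/pointer arguments to x0-x7, then to 8-byte stack slots."""
    slots = {}
    for index, name in enumerate(arg_names):
        if index < 8:
            slots[name] = f"x{index}"
        else:
            # Stack slots are written as on this page: sp0, sp8, sp16, ...
            slots[name] = f"sp{(index - 8) * 8}"
    return slots


# Hypothetical names; the ordering follows the parameter list above.
GENERATE_CODE_ARGS = [
    "out_s32", "out_range0", "out_range1", "plugin_info",
    "in_u64", "in_buf_addr", "in_buf_size", "in_range0",
    "in_range1", "in_struct32", "in_u32", "out_buf_addr", "out_buf_size",
]

GENERATE_CODE_SLOTS = assign_argument_slots(GENERATE_CODE_ARGS)
```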

The input/output #CodeRange structs are validated as follows, where stateval is the first/second CodeMemory size for the first/second #CodeRange:

  • CodeRange.offset must be 0x4-byte aligned.
  • CodeRange.offset must be <= stateval-CodeRange.size.
  • stateval must be >= CodeRange.size.
  • CodeRange.size must be <= ~CodeRange.offset.
  • CodeRange.size must be 0x4-byte aligned.
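The five checks above can be sketched as a single predicate. This is an illustration only: the function name is hypothetical, u64 wrap-around is emulated with a mask, and the order in which the sysmodule actually evaluates the checks is not assumed to match.

```python
U64_MASK = (1 << 64) - 1


def code_range_is_valid(offset, size, stateval):
    """Return True when (offset, size) passes all five checks listed above."""
    offset &= U64_MASK
    size &= U64_MASK
    stateval &= U64_MASK
    if offset % 4 != 0:                        # offset must be 0x4-byte aligned
        return False
    if offset > ((stateval - size) & U64_MASK):  # offset <= stateval - size (u64 math)
        return False
    if stateval < size:                        # stateval >= size
        return False
    if size > (~offset & U64_MASK):            # size <= ~offset (unsigned 64-bit NOT)
        return False
    if size % 4 != 0:                          # size must be 0x4-byte aligned
        return False
    return True
```

For example, with a CodeMemory size of 0x1000, the range (offset=0xFF0, size=0x10) is accepted, while (offset=0x4, size=0x1000) is rejected.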

The output #CodeRange structs are validated the same way as the corresponding input #CodeRange structs, however in addition the output structs are validated against the input structs:

  • out_CodeRange.offset must be >= in_CodeRange.offset.
  • in_CodeRange.size must be >= out_CodeRange.size.
  • (out_CodeRange.offset-in_CodeRange.offset) must be <= (in_CodeRange.size-out_CodeRange.size).
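Taken together, the three checks above imply that the output range lies inside the corresponding input range (out.offset + out.size <= in.offset + in.size). A minimal sketch, with a hypothetical function name and plain integers standing in for the struct fields:

```python
def output_range_within_input(in_offset, in_size, out_offset, out_size):
    """Apply the three output-vs-input checks listed above."""
    if out_offset < in_offset:      # out.offset must be >= in.offset
        return False
    if in_size < out_size:          # in.size must be >= out.size
        return False
    # The shift from the input start must fit in the size that was given up.
    return (out_offset - in_offset) <= (in_size - out_size)
```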

Control

Takes an input u64, a type-0x5 input buffer, a type-0x6 output buffer, and returns an output s32.

An error is thrown if the funcptr in state for the "nnjitpluginControl" symbol is not set.

The TransferMemory is mapped, then the symbol funcptr is called: x0 = s32* out, x1 = {ptr to struct on stack which is the same as the one used for the "nnjitpluginOnPrepared" symbol}, x2 = {input u64 from the cmd}, x3/x4 = {cmd inbuffer addr/size}, x5/x6 = {cmd outbuffer addr/size}. Non-zero ret indicates error. On success the s32 from here is written to the cmd output s32. Afterwards, the TransferMemory is unmapped, then this returns.
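The call sequence described above can be sketched as follows. Everything here is a stand-in: map_tmem, unmap_tmem and plugin_control are hypothetical callables, the plugin is modelled as returning a (ret, out_s32) pair instead of writing through a pointer, and unmapping on the error path is an assumption.

```python
def run_control(plugin_control, map_tmem, unmap_tmem,
                plugin_info, in_u64, in_buf, out_buf):
    """Map TransferMemory, call the plugin, unmap, then report the result."""
    map_tmem()
    # Models x0 = s32* out by returning the s32 alongside the return code.
    ret, out_s32 = plugin_control(plugin_info, in_u64, in_buf, out_buf)
    unmap_tmem()
    if ret != 0:
        # Non-zero ret indicates error; the cmd output s32 is not produced.
        raise RuntimeError(f"nnjitpluginControl failed with ret={ret}")
    return out_s32
```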

LoadPlugin

Takes an input u64 tmem_size, a TransferMemory handle, two type-0x5 input buffers, no output.

The first buffer contains the NRR, the second buffer contains the NRO.

The tmem is temporarily mapped and cleared; when any error occurs, this is done again. The tmem is only ever mapped temporarily. This is referred to as "WorkMemory".

The input NRR is used with RegisterModuleInfo2, then the NRO is used with LoadModule (these are copied into another buffer with the required alignment). Afterwards, various symbol lookups are done with the loaded module:

  • "nnjitpluginGetVersion", error is handled on failure. This is called with no args, if the u32 output is >1 an error is thrown.
  • "nnjitpluginResolveBasicSymbols", this is optional. When successful and the funcptr is valid, this is called with x0 = {funcptr which can be called by the plugin for symbol-lookup. funcptr x0 = symbol_str*, ret = symbol_funcptr - this internally calls "nn::ro::LookupSymbol"}.
  • "nnjitpluginSetupDiagnostics", this is optional. When successful and the funcptr is valid, this is called with w0=1 and x1 = {ptr to a funcptr on stack, the func for this is a duplicate of the one referenced above}.
  • "nnjitpluginConfigure", error is handled on failure. When GetDebugModeFlag returns true, the symbol funcptr is called with x0 = {ptr where 2 output u32s are located}, and then the two output u32s are loaded (that data on stack is cleared prior to calling the funcptr). Otherwise when false, it's called with x0=0 and the fields which would contain the output u32s are cleared to 0. These fields are "nn::jit::MemorySecurityMode".
  • The symbol for "nnjitpluginControl" is loaded, with the funcptr copied into state. On success, the same is done with "nnjitpluginGenerateCode". If either of these fails, error handling will run.
  • TransferMemory init is done here. An ASLR'd address for the TransferMemory mapped-address is determined, which will then be reused for all later mappings.
  • CodeMemory init func-calling is done for both regions, where w1={first output from "nnjitpluginConfigure" above}. Likewise with the TransferMemory, with w1={second output from "nnjitpluginConfigure" above}.
  • "nnjitpluginOnPrepared", error is handled on failure. Before/after calling this symbol funcptr, the TransferMemory is mapped/unmapped. The symbol funcptr is called with x0 = {ptr to struct on stack}. The struct has the following structure: +0 = 0x20-bytes of data from state {user-process map-addr/size for each CodeMemory, used by #GetCodeAddress}, +0x20 = TransferMemory map-addr, +0x28 = TransferMemory size, and +0x30 size 0x10-bytes is cleared.
  • Then this does cleanup and returns.
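The struct passed to "nnjitpluginOnPrepared" can be sketched with Python's struct module. The function and field names are hypothetical, and the ordering inside the first 0x20 bytes (addr, size, addr, size) is an assumption; the page only states that this region holds the user-process map-addr/size for each CodeMemory. The second helper models the #GenerateCode variant, where +0x30/+0x38 hold the sysmodule map-addr for each CodeMemory.

```python
import struct

# 4 u64 (CodeMemory info), 2 u64 (TransferMemory addr/size), 0x10 cleared bytes.
PLUGIN_INFO = struct.Struct("<4Q2Q16x")


def build_on_prepared_struct(code0_addr, code0_size, code1_addr, code1_size,
                             tmem_addr, tmem_size):
    """Build the 0x40-byte struct; the 'x' pad bytes are written as zeros."""
    return PLUGIN_INFO.pack(code0_addr, code0_size, code1_addr, code1_size,
                            tmem_addr, tmem_size)


def with_sysmodule_addrs(blob, code0_map_addr, code1_map_addr):
    """Return a copy with +0x30/+0x38 filled in, as described for GenerateCode."""
    patched = bytearray(blob)
    struct.pack_into("<2Q", patched, 0x30, code0_map_addr, code1_map_addr)
    return bytes(patched)
```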

GetCodeAddress

No input, returns two output u64s which are loaded from state.

These u64s are the user-process map-addrs for each CodeMemory from state.

CodeRange

This is "nn::jit::CodeRange". This is a 0x10-byte struct. This is 8-byte aligned.

Offset Size Description
0x0 0x8 Offset
0x8 0x8 Size
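The layout above can be expressed as two little-endian u64 fields. A minimal sketch for packing and unpacking it (the helper names are hypothetical):

```python
import struct

# Offset at +0x0, Size at +0x8, both u64, little-endian.
CODE_RANGE = struct.Struct("<QQ")


def pack_code_range(offset, size):
    """Serialize a CodeRange into its 0x10-byte form."""
    return CODE_RANGE.pack(offset, size)


def unpack_code_range(data):
    """Return (offset, size) from the first 0x10 bytes of data."""
    return CODE_RANGE.unpack_from(data, 0)
```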

Struct32

This is "nn::jitsrv::Struct32". This is a 0x20-byte struct. This is 8-byte aligned.