Switch System Flaws

Revision as of 01:03, 16 April 2019 by SciresM (talk | contribs) (fix copy-paste error on disclosure date)

Exploits are used to execute unofficial code (homebrew) on the Nintendo Switch. This page is a list of publicly known Switch system flaws.

For userland applications/applets flaws see here.

System flaws

Hardware

Flaws in this category pertain to the underlying hardware that powers the Switch.

This includes components shared across Tegra based devices such as the TSEC, the Security Engine, the GPU and so on.

Summary Description Fixed with hardware model/revision Newest hardware model/revision this flaw was checked for Timeframe this was discovered Public disclosure timeframe Discovered by
CVE-2018-6242 (leveraged by the ShofEL2 and Fusée Gelée exploits) The USB software stack provided inside the boot instruction rom (IROM/bootROM) contains a copy operation whose length can be controlled by an attacker. By carefully constructing a USB control request, an attacker can leverage this vulnerability to copy the contents of an attacker-controlled buffer over the active execution stack, gaining control of the Boot and Power Management processor (BPMP) before any lock-outs or privilege reductions occur. This execution can then be used to exfiltrate secrets and to load arbitrary code onto the main CPU Complex (CCPLEX) "application processors" at the highest possible level of privilege (typically as the TrustZone Secure Monitor at PL3/EL3). Unknown (Tegra186 and Tegra214) HAC-001 (Tegra210) January 2018 April 23, 2018 shuffle2 and fail0verflow (originally),
ktemkin and ReSwitched Team (independently),
naehrwert (independently),
hexkyz (independently),
st4rk with Shiny Quagsire and Dazzozo (independently),
and many others (independently).
GMMU DMA attack The Switch's GPU includes a separate MMU (GMMU) that is allowed to bypass the system's IOMMU (SMMU). By accessing the GPU's MMIO region and manipulating the page table entries in the GMMU, an attacker can read/write any portion of the DRAM (except memory carveouts).

[5.0.0+] Works around this hardware flaw by using memory pool partitioning. You can no longer escalate into sysmodules with GPU DMA because all their memory is allocated using heap that's carved out.

None HAC-001 (Tegra210) Summer 2017 December 28, 2017 hexkyz, SciresM and qlutoo
Weak Security Engine context validation The Tegra X1 supports a "deep sleep" feature, where everything but DRAM and the PMC registers lose their content (and the SoC loses power). Upon awaking, the bootrom re-executes, restoring system state. Among these stored states is the Security Engine's saved state, which uses AES-128-CBC with a random key and all-zeroes IV. However, the bootrom doesn't perform a MAC on this data, and only validates the last block. This allows one to control most of security engine's state upon wakeup, if one has a way to modify the encrypted state buffer.

With a way to modify the encrypted state buffer, one can thus dump keys from "write-only" keyslots, etc.

This also bypasses the SBK protection of the bootROM: indeed, at warmboot, bootROM will always clear keyslot 0xE to prevent malicious code from saving the SBK. Moving the SBK to another keyslot in the saved context renders this protection moot.

None HAC-001 (Tegra210) December 2017 January 20, 2018 SciresM and motezazer
Security Engine keyslots vulnerable to partial overwrite attack

The Tegra X1 security engine supports writing keyslot data to the engine with syntax as follows:

SECURITY_ENGINE->AES_KEYTABLE_ADDR = (keyslot << 4) | (dword_index_in_keyslot);

SECURITY_ENGINE->AES_KEYTABLE_DATA = readle32(key, dword_index_in_keyslot * 4);

However, the Security Engine flushes writes to the internal key tables immediately when AES_KEYTABLE_DATA is written -- this allows one to overwrite a single dword of a key at a time, and thus brute force the contents of keyslots in time (2^32 * 8) = 2^35 instead of 2^256.

None HAC-001 (Tegra210) Theorized Summer 2017 due to suggestive syntax, confirmed April 9, 2018 April 9, 2018 SciresM, almost surely others (independently).
Poor validation of bootrom SDRAM configuration parameters leads to arbitrary writes in bootrom

The Tegra X1 bootrom supports saving SDRAM parameters to scratch registers, and using the saved configuration to enable DRAM during warmboot.

The code that parses these parameters does if (params->EmcBctSpareN) *params->EmcBctSpareN = params->EmcBctSpareNPlusOne for most N, without validating either the address or value written to it. There are other arbitrary writes in this code, as well (e.g. BootromPatch parameters intended for patching MISC registers do not check a relative offset to 0x7000000, etc).

This allows a user with access to the PMC registers (via pre-sleep bpmp execution, or otherwise) to gain arbitrary bootrom code execution.

None HAC-001 (Tegra210) 2017 December 16, 2018 Everyone (independently).

Software

Bootloader

Flaws in this category pertain to any bootloader component such as the package1ldr, the NX bootloader or the warmboot binary.

Summary Description Successful exploitation result Fixed in system version Last system version this flaw was checked for Timeframe this was discovered Public disclosure timeframe Discovered by
Null-dereference in panic() The Switch's stage 1 bootloader, on panic(), clears the stack and then attempts to clear the Security Engine. However, it does so by dereferencing a pointer to the SE in .bss (initially NULL), and this pointer doesn't get initialized until partway into the bootloader's main() after several functions that might panic() are called. Thus, a panic() caused prior to SE initialization would result in the SE pointer still being NULL when dereferenced.

The BPMP doesn't have an active MPU and the bus won't data abort on an invalid address, so no exception will be entered: it'll end up overwriting some exception vectors with NULL before halting.

In 3.0.0, this was fixed by moving the security engine initialization earlier in main(), before the first function that could potentially panic().

Some exception vectors overwritten with NULL, before SBK/other keyslots are cleared. Probably useless for anything more interesting. 3.0.0 3.0.0 Early July, 2017 July 30, 2017 Everyone who diff'd 2.3.0 and 3.0.0 Package1
FUSE_DIS_PGM not written by package1 The switch's hardware fuse driver contains a write-once bit in a register called "FUSE_DIS_PGM", which disables burning fuses until the next reboot. While Nintendo's bootloader code for waking up from sleep writes this on all firmware, the actual package1 initial bootloader forgets to write to it on cold reboot.

This isn't too big of a problem because another fuse is burnt on retail devices (production mode), which prevents burning *all* fuses other than ODM_RESERVED ones in hardware.

This was fixed in 3.0.0 by writing to the register on cold boot (although the write happens in TZ instead of package1 where it should take place, possibly to obfuscate the fact that they made this mistake).

Burning arbitrary ODM reserved fuses with TZ code execution, which should never be possible for non-bootloader code.

Warning: one could irreparably brick one's console by playing with this.

3.0.0 3.0.0 Late summer/early fall 2017 December 31, 2017 SciresM, motezazer
TSEC firmware compromises itself Package1ldr loads a firmware blob into TSEC early on boot. This piece of code runs on the TSEC in Authenticated Mode and has the sole purpose of generating the per-console TSEC key (see Cryptosystem).

As a way to mitigate attacks, the TSEC firmware blob is split into 3 stages: Stage 0 which is unencrypted and unsigned, Stage 1 which is unencrypted but signed and Stage 2 which is encrypted and signed. Stage 0 loads a static pre-generated signature into the Falcon's CPU crypto registers, loads Stage 1 into the Falcon's CODE region and jumps to it. Execution will proceed into Stage 1 in Authenticated Mode if, and only if, the loaded signature matches the one Falcon calculates internally for Stage 1.

Among various things, Stage 1 will attempt to do a "backwards" security check by calculating a CMAC over Stage 0 and comparing it with a known hash stored in the TSEC firmware's key data (a small buffer stored after Stage 0's code). If the hashes don't match, execution aborts.

Stage 1 stores the calculated Stage 0's CMAC in the stack, but forgets to clear it. Since the stack is located in Falcon's DATA region, loading the TSEC firmware blob and dumping the DATA region afterwards (via MMIO) will reveal the calculated hash. This allows using Stage 1 as an oracle to generate a valid CMAC for arbitrary Stage 0 code. Replacing the CMAC in the TSEC firmware's key data region results in Stage 1 accepting any Stage 0 code, thus rendering this security measure useless.

Additionally, since signed Falcon code can't be revoked without an hardware revision, an attacker can always reuse the flawed Stage 1 code even if a fix is issued.

Running TSEC firmware's Stage 1 in a user controlled environment. Mostly useless, but may aid in side-channel attacks. None 5.0.2 January 2018 April 29, 2018 hexkyz, probably others (independently).
pk1ldrhax Package1ldr decrypts and verifies the keyblob inside of the current BCT in order to get the package1 key, and then uses the package1 key to decrypt package1. It then validates package1 before jumping to it by checking the PK11 magic number, and that the section sizes sum to the expected size (and are individually less than the expected size).

However, package1ldr does not actually validate the package1 key against a fixed vector (much like kernel9loader forgot to do so on the 3ds). This would normally not matter, as keyblobs are validated -- however, with bootrom code execution one can dump SBK and forge keyblobs, and thus control the package1 key.

Thus (in theory, but not in practice due to the size of the brute force required) one can replace the package1 key with garbage, causing package1 to decrypt into garbage, and hope that this garbage passes validation checks and that package1ldr jumping into the garbage will do something useful.

This was fixed incidentally in 6.2.0, as pk1ldr does not use keyblob data to decrypt package1 any more.

With a large enough brute force: arbitrary package1 code execution from coldboot.

However, a usable brute force is on the order of >= ~2^80, so this is almost certainly not actually usable in any meaningful context.

6.2.0 6.2.0 Early 2017 (as soon as plaintext package1ldr was first dumped) November 20, 2018 Everyone

TrustZone

Flaws in this category pertain exclusively to the Secure Monitor.

Summary Description Successful exploitation result Fixed in system version Last system version this flaw was checked for Timeframe this was discovered Public disclosure timeframe Discovered by
Non-atomic mutexes When an SMC is called, TrustZone sets a global variable to mark that an SMC is in progress, so that two SMCs using shared resources (like the security engine) do not trample on one another. On 1.0.0, this global variable was written using non-atomic writes, and thus a race condition is possible.

However, the SMC handler enforces that all SMCs must be called from core #3, unless the top-level handler ID is 1 (SMCs internal to the kernel). Thus, the only SMCs that can be run side-by-side are [any userland smc] and smcGetRandomBytesForKernel, and this turns out to not really be abusable.

Mostly useless. Maybe some oob-write into unused (and thus useless) memory if running smcGetRandomBytesForKernel and smcGetRandomBytesForUser at the same time. 2.0.0 2.0.0 December 2017 (Probably earlier by others) January 18, 2018 SciresM, probably others (independently).
jamais vu (non-secure world access to PMC MMIO and pre-deep sleep firmware) On 1.0.0, one could map in the PMC registers in userland. In addition, am ran a little-kernel based firmware on the BPMP at runtime. With code execution under am, one could modify the BPMP's little-kernel firmware to hook deep sleep entry, and modify TrustZone/Security engine state.

This was fixed in 2.0.0 by making the PMC secure-world only, blacklisting the BPMP's exception vectors from being mapped, and thoroughly checking for malicious behavior on deep sleep entry.

Arbitrary TrustZone code execution. 2.0.0 2.0.0 December, 2017 January 20, 2018 SciresM and motezazer
Missed BPMP Exception Vector Writes Starting in 2.0.0, the BPMP is asleep at runtime, and is turned on by TrustZone during smcCpuSuspend in order to initiate the deep sleep process. When it does so, it is held in RESET, and TrustZone attempts to write to the BPMP exception vectors at 0x6000F200 to register EVP_RESET = lp0_entry_fw_crt0, and all other EVPs to a function that simply reboots. However, while they successfully write EVP_RESET, they miss all the other vectors, accidentally writing to the 0x6000F004-0x6000F020 region instead of the 0x6000F204-0x6000F220 region they want to write to. This results in all the exception vectors for the BPMP other than RESET being "undefined" (attacker controlled).

With some way of causing an exception vector to be taken at the right time, this would give pre-sleep code execution (and thus arbitrary TrustZone code execution, via the security engine flaw). However, none of the abort vectors are really triggerable, and interrupts are disabled for the BPMP when it is taken out of reset. Thus, this is useless in practice.

This was fixed in 4.0.0 by writing to the correct registers.

Theoretically: Arbitrary TrustZone code execution. In practice: Useless. 4.0.0 4.0.0 January, 2018 February 23, 2018 SciresM and motezazer, naehrwert, hexkyz, probably others (independently).
TSEC has access to the secure kernel carveout TrustZone is responsible for managing security carveouts to prevent DMA controllers from accessing the carveout which contains the kernel, sysmodules, and other critical operating system data.

Until 8.0.0, the list of devices that could access the carveout included the TSEC. However, the TSEC can bypass the SMMU when in authenticated mode by writing to a certain register. Thus, pwning nvservices would allow one to take over the TSEC, and use it to write to normally protected mmio/memory.

In 8.0.0, this was fixed by removing TSEC access, and adding TSECB access (TSECB cannot bypass the SMMU).

With access to the TSEC mmio (nvservices ROP) and code execution in TSEC Heavy Secure mode, kernel code execution, probably. 8.0.0 8.0.0 2017 (when TrustZone code plaintext was first obtained). April 15, 2018 Everyone
deja vu (insufficient system state validation on suspend leads to pre-sleep BPMP code execution) Jamais Vu was fixed in 2.0.0 by making the PMC secure-world only, blacklisting the BPMP's exception vectors from being mapped, and thoroughly checking for malicious behavior on deep sleep entry, since gaining pre-sleep code execution on the BPMP compromises the system.

However, the state validation performed by Nintendo's Secure Monitor was insufficient to prevent pre-sleep execution from being obtained.

Prior to 6.0.0, one could use a DMA controller that had access to IRAM and was not held in reset (there were multiple) to race TrustZone's writes to the BPMP firmware in IRAM, and thus overwrite Nintendo's firmware with an attacker's to gain pre-sleep code execution.

6.0.0 addressed this by performing TrustZone state MAC writes and locking PMC scratch *before* turning on the BPMP, fixing the original Jamais Vu exploit entirely. In addition, the BPMP firmware in TrustZone's .rodata is now memcmp'd to the actual data after it is written to IRAM. This mitigates race attacks that modify the firmware.

However, Nintendo both forgot to validate the BPMP exception vectors after writing them, and forgot to hold in reset a DMA controller that can write to the BPMP's exception vectors.

AHB-DMA is not blacklisted by kernel mapping whitelist (Nintendo probably forgot it, because the TX1 TRM does not really document that it's present, although the MMIO works as documented in older (Tegra 3 and before) TRMs).

Thus, with kernel code execution (or some other way of accessing AHB-DMA, e.g. nspwn on <= 4.1.0, TSEC hax, or other arbitrary mmio access flaws), one can DMA to the BPMP's exception vectors as they are written, causing TrustZone to start the BPMP executing an attacker's firmware at a different location than TrustZone intends/validates.

This was fixed in 8.0.0 by blocking AHB-DMA arbitration, and thus there are no more devices that can write to the relevant MMIO at the right time.

Arbitrary TrustZone/BootROM code execution, by using either the original Jamais Vu flaw (prior to 6.0.0 or a warmboot bootrom exploit (any firmware where pre-sleep execution can be gained). 8.0.0 8.0.0 December 2017 April 15, 2018 SciresM, motezazer and ktemkin, naehrwert (independently), almost certainly others (independently)

Kernel

Flaws in this category pertain exclusively to the HorizonOS Kernel.

Summary Description Successful exploitation result Fixed in system version Last system version this flaw was checked for Timeframe this was discovered Public disclosure timeframe Discovered by
Syscall Infoleaks Many syscalls leaked kernel pointers on sad paths (for example svcSetHeapSize and svcQueryMemory), until they landed a bunch of fixes in 2.0.0. Nothing really. 2.0.0 2.0.0 ?
svcWaitSynchronization/svcReplyAndReceive bad cleanup on error If there is a page fault when fetching handles from the userspace array, it cleans up by dereferencing all objects despite having only loaded first N. Allows the attacker to make arbitrary decrefs on any kernel synchronization object, and thus can be used to get UAF. Haven't actually been tried on real HW though, but should work (tm). Kernel code execution 2.0.0 2.0.0 24 April qlutoo
Bad irq_id check in CreateInterruptEvent CreateInterruptEvent syscall is designed to work only for irq_id >= 32. All irq_ids < 32 are "per-core" and reserved for kernel use (watchdog/scheduling/core communications).

On 1.0.0 you could supply irq_id < 32 and it would write outside the SharedIrqs table.

You can register irq's in the Core3Irqs table, and thus register per-core irqs for core3, that are normally reserved for kernel. Useless. 2.0.0 2.0.0 ~October 17 October qlutoo
Kernel .text mapped executable in usermode Prior to 3.0.2 the kernel .text was mapped in usermode as executable. This can be used for usermode ROP for bypassing ASLR, but SVCs/IPC are not usable by running kernel .text in usermode. Executing kernel .text in usermode 3.0.2 3.0.2 34c3 (December 28, 2017) qlutoo
Memory Controller not properly secured The Switch OS originally had the memory controller not set to be accessible only by the secure-world, which was problematic because insecure access can compromise the kernel.

This was fixed partially in 2.0.0 by blacklisting the memory controller from being mapped by user-processes, and was fixed entirely in 4.0.0 by making the memory controller TZ-only and making all kernel accesses go through smcReadWriteRegister.

With some way to access the memory controller MMIO, arbitrary kernel code execution. 4.0.0 4.0.0 January 2018 January 2018 SciresM, yellows8
Potential svcWaitForAddress thread use-after-free Between 4.0.0, where svcWaitForAddress was introduced, and 7.0.0, there was a second intrusive rbtree node in KThread for the WaitForAddress tree (the key being (address, priority), sorted lexicographically). Unlike the WaitProcessWideKeyAtomic tree, the kernel forgot to reinsert the WaitForAddress node when the thread's priority changed (priority inheritance and/or SetPriority), breaking the rbtree invariants; and since the kernel walks through the entire tree to remove intrusive nodes, you could cause threads to stay in the tree even after their deletion.

7.0.0 fixed the issue by using the same intrusive node for both trees. The thread/node knows which tree it is in, and the latter is correctly updated when thread priority changes.

It unluckily didn't look exploitable 7.0.0 7.0.0 July 2018 February 2019 TuxSH

FIRM-package System Modules

Flaws in this category pertain to any of the built-in system modules.

Summary Description Successful exploitation result Fixed in system version Last system version this flaw was checked for Timeframe this was discovered Public disclosure timeframe Discovered by
Service access control bypass (sm:h, smhax, probably other names) Prior to 3.0.1, the service manager (sm) built-in system module treats a user as though it has full permissions if the user creates a new "sm:" port session but bypasses initialization. This is due to the other sm commands skipping the service ACL check for Pids <= 7 (i.e. all kernel bundled modules) and that skipping the initialization command leaves the Pid field uninitialized.

In 3.0.1, sm returns error code 0x415 if Initialize has not been called yet.

Acquiring, registering, and unregistering arbitrary services 3.0.1 3.0.1 May 2017 August 17, 2017 Everyone
Overly permissive SPL service The concept behind the switch's Secure Monitor is that all cryptographic keydata is located in userspace, but stored as "access keys" encrypted with "keks" that never leave TrustZone. The spl ("security processor liaison"?) service serves as an interface between the rest of the system and the secure monitor. Prior to 4.0.0, spl exposed only a single service "spl:", which provided all TrustZone wrapper functions to all sysmodules with access to it. Thus anyone with access to the spl: service (via smhax or by pwning a sysmodule with access) could do crypto with any access keys they knew.

This was fixed in 4.0.0 by splitting spl: into spl:, spl:mig, spl:ssl, spl:es, and spl:fs.

Arbitrary spl: crypto with any access keys one knows. For example, one could use the SSL module's access keys to decrypt their console's SSL certificate private key without having to pwn the SSL sysmodule. 4.0.0 4.0.0 Summer 2017 (after smhax was discovered). December 23, 2017 Everyone
Single session services not really single session Several "critical" services (like fsp-ldr, fsp-pr, sm:m, etc) are meant to only ever hold a single session with a specific sysmodule. However, when a sysmodule dies, all its service session handles are released -- and thus killing the holder of a single session handle would allow one (via sm:hax etc) to get access to that service.

This was fixed in 4.0.0 by adding a semaphore to these critical single-session services, so that even if one gets access to them an error code will be returned when attempting to use any of their commands.

With some way to access these services and kill their session holders (like expLDR): dumping sysmodule code, arbitrary service access, elevated filesystem permissions, etc. 4.0.0 4.0.0 May/June 2017 (basically immediately after smhax was discovered) December 30, 2017 Everyone
nspwn fsp-ldr command 0 "MountCode" takes in a Content Path (retrieved from NCM by Loader), and returns an IFileSystem for the resulting ExeFS. These content paths, are normally NCAs, but MountCode also supports a number of other formats, including ".nsp" -- which is just a PFS0.

When a path ending in ".nsp" is parsed by MountCode, the PFS0 is treated as a raw ExeFS. Because there is no NCA header, the ACID signatures are not validated -- and because there are no other signatures in a PFS0, this results in no signature checking happening at all.

The actual .nsp handling is eventually done by {content mounting function} called by MountCode and other FS commands.

Thus, by placing an ExeFS (NSOs + "main.npdm") and setting one's desired title ID to "@Sdcard:/some_title.nsp" or "@User:/some_title.nsp" etc one can launch arbitrary unsigned code, with arbitrary unsigned NPDMs.

This appears to have been fixed by only allowing .nsp when the input fstype==7 for the internal content-mounting function, returning 0x2EE202 otherwise.

With access to "lr": Arbitrary code execution with full system privileges. 5.0.0 5.0.0 Late 2017 April 23, 2018 Everyone
Single null-byte stack overflow in Loader ContentPath parsing Previously, loader content path parsing looked like this, where path_from_lr was up to 0x300 bytes and not necessarily null-terminated:
 char nca_path[0x300] = {0};
 strcat(nca_path, path_from_lr);
 for (int i = 0; nca_path[i]; i++) {
     if (nca_path[i] == '\\') { nca_path[i] = '/'); }
 }

Thus, a content path of the maximum length (0x300 bytes) would result in strcat writing a NULL terminator past the end of the nca_path buffer.

This was fixed in 6.0.0, the new code looks like this:

 char nca_path[0x300];
 strncpy(nca_path, path_from_lr, sizeof(nca_path));
 for (int i = 0; i  < sizeof(nca_path) && nca_path[i]; i++) {
     if (nca_path[i] == '\\') { nca_path[i] = '/'); }
 }


With access to "lr": single null-byte stack overflow in Loader. Maybe (but probably not) loader code execution. 6.0.0 6.0.0 September 2, 2018 September 19, 2018 SciresM

System Modules

Flaws in this category pertain to any non-built-in system module.

Summary Description Successful exploitation result Fixed in system version Last system version this flaw was checked for Timeframe this was discovered Public disclosure timeframe Discovered by
Out-of-bounds array read for BCAT_Content_Container secret-data index The BCAT_Content_Container secret-data index is not validated at all. This is handled before the RSA-signature(?) is ever used. Since the field is an u8, a total of 0x800-bytes relative to the array start can be accessed.

This is not useful since the string loaded from this array is only involved with key-generation.

Unknown 2.0.0 August 4, 2017 August 6, 2017 Shiny Quagsire, yellows8 (independently)
OOB Read in NS system module (pl:utoohax, pl:utonium, maybe other names) Prior to 3.0.0, pl:u (Shared Font services implemented in the NS sysmodule) service commands 1,2,3 took in a signed 32-bit index and returned that index of an array but did not check that index at all. This allowed for an arbitrary read within a 34-bit range (33-bit signed) from NS .bss. In 3.0.0, sending out of range indexes causes error code 0x60A to be returned. Dumping full NS .text, .rodata and .data, infoleak, etc 3.0.0 3.0.0 April 2017 On exploit's fix in 3.0.0 qlutoo, ReSwitched Team (independently)
Unchecked domain ID in common IPC code Prior to 2.0.0, object IDs in domain messages are not bounds checked. This out-of-bounds read could be exploited to brute-force ASLR and get PC control in some services that support domain messages. 2.0.0 2.0.0 ~July 2017 20 July 2017‎ hthh
expLDR (sysmodule handle table exhaustion) Most sysmodules share common template code to handle IPC control messages. The command DuplicateSession (type 5 command 2)'s template code will abort() if it fails to duplicate a session's handle for the requester. Because many sysmodules have limited handle table size (smaller than the browser/other entrypoints), repeatedly requesting to duplicate one's session will cause the sysmodule to run out of handle table space and abort, causing the service to release all its handles cleanly. Sysmodule crashes. Most usefully, crashing ldr allows access to fsp-ldr and crashing pm allows access to fsp-pr. Useless after 4.0.0, which mitigated a number of single-session service access issues. Unfixed 4.1.0 24 June 2017 8 March 2018 daeken
Transfer Memory leak in nvservices system module The nvservices sysmodule does not clear most of its transfer memory prior to release. The calling process can read key bits of memory, including breaking ASLR (by revealing the image base) and exposing the address of other transfer memory to set up attacks. More details here: transfermeme (nvservices info leak) by daeken 6.0.0 6.0.0 June 2017 16 October 2018 qlutoo and hexkyz,

daeken (independently)

OOB write in audio system module Prior to 2.0.0, the AppendAudioOutBuffer and AppendAudioInBuffer IPC commands would blindly increment the appended buffers' count while using said count value as an index to where the user data should be copied into. This resulted in an 0x28 bytes, user controlled, out-of-bounds memory write into the audio sysmodule's memory space.

Combined with the GetReleasedAudioOutBuffer or GetReleasedAudioInBuffer commands, this could also be used as an 8 byte infoleak.

In 2.0.0, the commands now return error code 0x1099 if the number of unreleased buffers exceeds 0x1F.

Code execution under audio sysmodule 2.0.0 2.0.0 November 2, 2018 hexkyz, probably others (independently).
nvhax (memory corruption in nvservices system module) Prior to 6.2.0, the nvservices ioctl NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE would take a single "pwarpstate" argument which would be interpreted by nvservices as a memory pointer for writing 2 "warpstate" structs (one for each Streaming Multiprocessor).

This resulted in nvservices attempting to blindly memcpy into this user supplied address and trigger a crash. However, if paired with an infoleak, this could be used to arbitrarily write 0x30 bytes anywhere in nvservices' memory space. Additionally, the "warpstate" struct itself was never initialized, which means nvservices would leak the 0x30 bytes from the stack. By invoking other ioctls it was also possible to partially control the stack contents and achieve a usable arbitrary memory write primitive.

In 6.2.0, NVGPU_GPU_IOCTL_WAIT_FOR_PAUSE now takes 2 inline "warpstate" structs instead of a "pwarpstate" pointer, thus effectively avoiding the bad memcpy.

Code execution under nvservices sysmodule 6.2.0 6.2.0 April 5, 2017 November 24, 2018 hexkyz
Infoleak in nvservices system module The nvservices ioctl NVMAP_IOC_ALLOC takes an optional argument "addr" which allows the calling process to pass a pointer to user allocated memory for backing a nvmap object. If "addr" is left as 0, nvservices uses the transfer memory region (donated by the user during initialization) instead, when allocating memory for the nvmap object.

By design, freeing the nvmap object by calling the ioctl NVMAP_IOC_FREE returns, in its "refcount" argument, the user address previously supplied if the reference count reaches 0. However, prior to 6.2.0, the case where the transfer memory region is used to allocate the nvmap object was not taken into account, thus resulting in NVMAP_IOC_FREE leaking back an address from within the transfer memory region mapped in nvservices' memory space.

In 6.2.0, NVMAP_IOC_FREE no longer returns the address when the transfer memory region is used instead of user supplied memory.

Combined with other vulnerabilities: Defeating ASLR in nvservices sysmodule. 6.2.0 6.2.0 April 2017 November 24, 2018 Everyone