====Kernel====
* Kernel crt0 was heavily refactored.
** Core 0 init and Core 1/2/3 init are now separate functions.
** The initial arguments are now stored inside the Core Local regions before those regions are initialized.
*** This saves a little memory by allowing that space to be reused.
** The initial arguments now store an entrypoint invocation function pointer in addition to the entrypoint.
** Core 1/2/3 now panic if cpuactlr/cpuectlr hold a value different from the one in the init arguments. Previously, they did if (real value != expected value) { real value = expected value; }. See the sketch after this list.
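A minimal sketch of the new Core 1/2/3 check follows. All names here (the register accessors, Panic, and the InitArguments layout) are assumptions for illustration, not the kernel's actual code.

<syntaxhighlight lang="cpp">
#include <cstdint>

// Hypothetical accessors; the real kernel reads these implementation-defined
// system registers (cpuactlr_el1/cpuectlr_el1) via inline assembly.
uint64_t ReadCpuActlrEl1();
uint64_t ReadCpuEctlrEl1();
[[noreturn]] void Panic();

// Illustrative layout only; field names and ordering are assumptions.
struct InitArguments {
    uint64_t cpuactlr;            // Expected cpuactlr value for this core.
    uint64_t cpuectlr;            // Expected cpuectlr value for this core.
    uintptr_t entrypoint;         // Entrypoint for this core.
    uintptr_t invoke_entrypoint;  // New in this version: invocation function pointer.
};

void InitializeCore123(const InitArguments *args) {
    // New behavior: a mismatch is treated as fatal.
    if (ReadCpuActlrEl1() != args->cpuactlr || ReadCpuEctlrEl1() != args->cpuectlr) {
        Panic();
    }
    // Old behavior, for contrast, silently corrected the register instead:
    //     if (ReadCpuActlrEl1() != args->cpuactlr) { /* write expected value */ }
}
</syntaxhighlight>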
* Physical ASLR for certain backing regions (Kernel .text/.rodata/.rwdata/.bss + the Slab Heap region) was implemented.
** Physical randomization of the kernel image is done by KernelLdr (see the sketch after this list).
** Randomization of the slab heap region is done by the kernel during init.
** To accommodate this, the virtual/physical memory trees no longer track pair blocks for the kernel/slab heap regions (as they no longer correlate directly).
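As an illustration of the physical randomization, the sketch below picks a random, aligned physical base within a backing range. GenerateRandomRange and the overall shape are assumptions; the real KernelLdr must additionally avoid reserved/overlapping ranges, which this ignores.

<syntaxhighlight lang="cpp">
#include <cstddef>
#include <cstdint>

// Assumed helper: uniformly random integer in [0, bound).
uint64_t GenerateRandomRange(uint64_t bound);

// Pick a random aligned physical base for a region of `size` bytes within the
// backing range [pool_start, pool_end). Precondition: the range fits the region.
uintptr_t SelectRandomPhysicalBase(uintptr_t pool_start, uintptr_t pool_end,
                                   size_t size, size_t align) {
    const uint64_t first_slot = (pool_start + align - 1) / align;  // First aligned slot.
    const uint64_t last_slot  = (pool_end - size) / align;         // Last slot that fits.
    return static_cast<uintptr_t>(
        (first_slot + GenerateRandomRange(last_slot - first_slot + 1)) * align);
}
</syntaxhighlight>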
* The global rng is now std::mt19937_64 instead of std::mt19937.
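For reference, the practical difference is output width per call:

<syntaxhighlight lang="cpp">
#include <cstdint>
#include <random>

// std::mt19937_64 yields a full 64-bit result per invocation, whereas
// std::mt19937 yields only 32 bits (so a 64-bit value needed two calls).
uint64_t GenerateRandomU64(std::mt19937_64 &rng) {
    return rng();
}
</syntaxhighlight>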
* KPageHeap bitmaps now store a small TinyMT rng.
** This is used to allocate random pages from the bitmap instead of the first available ones. Thus, KPageHeap allocation order is now random/non-deterministic (see the sketch after this list).
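The sketch below shows random (rather than first-available) selection from a single 64-bit bitmap word. It is purely illustrative: the real KPageHeap operates on a multi-level bitmap and draws its random values from the embedded TinyMT.

<syntaxhighlight lang="cpp">
#include <bit>
#include <cstdint>

// Select a random set bit from one bitmap word. Precondition: bitmap != 0.
int SelectRandomSetBit(uint64_t bitmap, uint64_t random_value) {
    // Choose which of the set bits to take, uniformly at random.
    int target = static_cast<int>(random_value % std::popcount(bitmap));
    for (int bit = 0; bit < 64; ++bit) {
        if ((bitmap >> bit) & 1) {
            if (target-- == 0) {
                return bit;  // The target-th set bit.
            }
        }
    }
    return -1;  // Unreachable given the precondition.
}
</syntaxhighlight>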
* KSpinLock was changed. Previously it used two u16s, each aligned to a cache line; now it packs the two u16s into a single u32 that is not cache-line aligned.
** The new spin lock is identical to the implementation in the ARM Reference Manual.
** KScheduler's spin lock still uses the old cache-line aligned u16s.
** Speculatively, we can consider the following motivation for the change (see the sketch after this list):
*** The old spin lock cannot atomically update both tickets with a single write. Thus, it requires two loops: one to atomically take the next ticket, and one to wait until the obtained ticket becomes the active one, at which point the lock is taken.
*** The new spin lock can atomically update both tickets with a single write. Thus, in the case where the lock is not held, it only has to do one atomic loop.
*** From this we can observe that the new spin lock is likely more performant under low contention (where it is expected that the lock is not held); its downside is potential false sharing (due to not owning its cache line). It is also probably better when placed at the start of a cache line with the locked data contained entirely within that cache line.
*** Most kernel locks are expected to be relatively uncontended (and there aren't really cases where two locks share a cache line, so false sharing isn't much of a problem), so the switch to the new ARM-Reference-Manual-style lock should lead to an overall performance improvement.
*** However, the scheduler lock is heavily contended (all cores lock and unlock it almost constantly). Thus, it makes more sense for it to continue using the old two-cache-line-style lock, which performs better under high contention.
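The following is a hedged sketch of a packed ticket lock in the style described above; the field layout (owner in the low half, next ticket in the high half) and all names are assumptions, not the kernel's actual code. It illustrates why an uncontended acquire needs only one atomic operation.

<syntaxhighlight lang="cpp">
#include <atomic>
#include <cstdint>

class PackedTicketLock {
  private:
    std::atomic<uint32_t> m_packed{0};  // [15:0] owner ticket, [31:16] next ticket.

  public:
    void Lock() {
        // A single atomic add both takes a ticket and returns the owner field,
        // so when the lock is free no further atomic loop is required.
        const uint32_t old = m_packed.fetch_add(1u << 16, std::memory_order_acquire);
        const uint16_t my_ticket = static_cast<uint16_t>(old >> 16);
        if (static_cast<uint16_t>(old) == my_ticket) {
            return;  // Uncontended fast path: we already own the lock.
        }
        // Contended path: wait until the owner field reaches our ticket.
        while (static_cast<uint16_t>(m_packed.load(std::memory_order_acquire)) != my_ticket) {
            // Spin (a real kernel lock would wait more politely, e.g. with wfe).
        }
    }

    void Unlock() {
        // Only the holder writes the owner half, so a halfword release store
        // suffices. This cast mirrors an AArch64 stlrh to the low half and
        // assumes little-endian layout; it is not strictly portable C++.
        auto *owner = reinterpret_cast<std::atomic<uint16_t> *>(&m_packed);
        owner->store(static_cast<uint16_t>(owner->load(std::memory_order_relaxed) + 1),
                     std::memory_order_release);
    }
};
</syntaxhighlight>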
* KProcess now has an additional data member storing the kernel virtual address of the process local region.
** This is now used instead of the process virtual address for the TLS region when writing context during exception handling (see the sketch after this list).
** This probably fixes a bug where an exception is being handled for a non-active process and the relevant codepath is taken(?)
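A sketch of why the kernel virtual address matters here; all structure and member names are hypothetical.

<syntaxhighlight lang="cpp">
#include <cstdint>

struct ExceptionContext { /* saved register state, elided */ };
struct ProcessLocalRegion { ExceptionContext exception_context; /* ... */ };
struct KProcess {
    uintptr_t plr_user_address;    // Process virtual address of the region.
    uintptr_t plr_kernel_address;  // Hypothetical stand-in for the new member.
};

void WriteExceptionContext(KProcess *process, const ExceptionContext &ctx) {
    // The kernel mapping is valid in every address space, so this write lands
    // in the right process even when handling an exception for a non-active
    // process. Writing through plr_user_address would only be correct while
    // that process's page tables are active.
    auto *plr = reinterpret_cast<ProcessLocalRegion *>(process->plr_kernel_address);
    plr->exception_context = ctx;
}
</syntaxhighlight>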
<check back for more diffs later>