this makes it so VK and OGL backends map the NVIDIA's ETC2 into VK_FORMAT_ETC-whatever and GL_ETC-whatever remaps, instead of using the default fallback for AR8G8B8. in short, just make the ETC2 textures be submitted as ETC2 instead of being submit as A8R8G8B8.
Signed-off-by: lizzie lizzie@eden-emu.dev
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3237
Reviewed-by: Ghost <>
Reviewed-by: crueter <crueter@eden-emu.dev>
the nominal std::unordered_map<> isn't enough to warrant it's continued usage in xbyak internal structures, thus using ankerl should greatly remove a lot of indirection/stdc++ specific overhead from the usually poorly performant std::unordered_map
Both dynarmic and macroHLE should benefit greatly from a less-stupid unordered_dense
This should speedup both CPU and shader compilation latency (NOT BY A GREAT MARGIN) just enough to make loading zones in ToTK less horrific
Signed-off-by: lizzie <lizzie@eden-emu.dev>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3716
Reviewed-by: crueter <crueter@eden-emu.dev>
so basically each construction of HLEContext and whatever would result in a heap allocation (atleast 1)
so what if instead of that we did a memset() at ctor time and we avoided heap allocations altogether?
reminder that std::vector<> CAN do small object optimisation but it's not guaranteed
Signed-off-by: lizzie <lizzie@eden-emu.dev>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3605
Reviewed-by: crueter <crueter@eden-emu.dev>
also devirtualizes manually since compiler doesn't do it with LTO
Signed-off-by: lizzie <lizzie@eden-emu.dev>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3864
Reviewed-by: crueter <crueter@eden-emu.dev>
inputs shouldnt be that critical to require a full mutex of them
this relies on CPU guaranteeing u32/u16/u8 atomic load/stores for EmulatedController fields, which works on x86_64 but may not have the same behaviour on other architectures - thats why i wrap them in `std::atomic<>`
Signed-off-by: lizzie <lizzie@eden-emu.dev>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3866
Reviewed-by: crueter <crueter@eden-emu.dev>
sounds like word salad but let me say:
- std::map<> created a static ctor for EVERY SINGLE ZONEINFO
- fuck that, instead lets just use a raw array and construct things statically
- works the same except with less baggage carried around (+ less heap allocations!!!)
this should help reduce codesize due to the aforementioned global ctor/dtor
Signed-off-by: lizzie <lizzie@eden-emu.dev>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3919
Reviewed-by: crueter <crueter@eden-emu.dev>
see https://github.com/herumi/xbyak/issues/255
> Proof: https://godbolt.org/z/9vseq4Ynj
> Xbyak currently implements it as:
> ```c++
> void umonitor(const Reg& r) {
> int idx = r.getIdx();
> if (idx > 7) XBYAK_THROW(ERR_BAD_PARAMETER) //umonitor DOES accept r8,r9,r10,etc this is NOT correct
> int bit = r.getBit();
> if (BIT != bit) {
> if ((BIT == 32 && bit == 16) || (BIT == 64 && bit == 32)) {
> db(0x67);
> } else {
> XBYAK_THROW(ERR_BAD_SIZE_OF_REGISTER)
> }
> }
> db(0xF3); db(0x0F); db(0xAE); setModRM(3, 6, idx);
> }
> ```
> My program was throwing Xbyak::Exception and I tracked it down to this particular umonitor
Signed-off-by: lizzie <lizzie@eden-emu.dev>
Reviewed-on: https://git.eden-emu.dev/eden-emu/eden/pulls/3954
Reviewed-by: crueter <crueter@eden-emu.dev>
Reviewed-by: MaranBr <maranbr@eden-emu.dev>