* Use the jump table for tail continues.
This path is always reached when a function larger than our current length limit (currently 5000) is compiled.
* Use Call Flag rather than 1L
* libhac: use ApplicationControlProperty instead of Nacp
Nacp was marked as deprecated in 0.10.0, this PR remove all usage of it
from Ryujinx and use the new struct ApplicationControlProperty.
* Address Moose's comment
* Delete old HLE.Input
* Add new HLE Input.
git shows Hid.cs as modified because of the same name. It is new.
* Change HID Service
* Change Ryujinx UI to reflect new Input
* Add basic ControllerApplet
* Add DebugPad
Should fix Kirby Star Allies
* Address Ac_K's comments
* Moved all of HLE.Input to Services.Hid
* Separated all structs and enums each to a file
* Removed vars
* Made some naming changes to align with switchbrew
* Added official joycon colors
As an aside, fixed a mistake in touchscreen headers and added checks to
important SharedMem structs at init time.
* Further address Ac_K's comments
* Addressed gdkchan's and some more Ac_K's comments
* Address AcK's review comments
* Address AcK's second review comments
* Replace missed Marshal.SizeOf and address gdkchan's comments
* Add warning and option to disable when loading game with debug logs
* Add warning and option to disable when shader dumping is enabled
* Edit text and add title
* Reduce requirements for running homebrews
This commit change the following behaviours:
- TimeZoneBinary system archive isn't required until guest code call LoadTimeZoneRule.
- Fonts system archives aren't requred until a "pl:u" IPC call is made.
- Custom font support was dropped.
- TimeZoneBinary missing message is now an error and not a warning.
* Address comments
* Index constant buffer vec4s using ternary expressions.
* Remove indexed path.
We determined that it had negligible impact.
* Revert "Remove indexed path."
This reverts commit 25ec4eddfa441e802bd957dfaabc83b23c6bae38.
* Revert "Revert "Remove indexed path.""
This reverts commit 7cd52fecb529dcb9e1a574533bd38531319f1268.
* Add a dialog to make sure user wants to stop emulation when esc is pressed
* Remove unneccesary space
* Fix formatting
* Remove unnessecary spaces
* Fix formatting for member of GtkDialog
* Fix unhandled UnauthorizedAccessException causing crash while listing directories
* Actually handle not having privileges for a directory
* Fix log message when not having privileges for a directory
* Remove unneccesary empty lines
* Remove unneccecssary space
* prepo: Implement RequestImmediateTransmission and GetTransmissionStatus
This implement RequestImmediateTransmission and GetTransmissionStatus of the prepo service accurately to RE.
Since we don't use reports, I've explained what the calls do in the real service.
Close#958
* fix comment
It seems MsgPack.Cli incorrectly converts some unicode control characters directly to JSON literal Unicode strings (eg. '\1'). As these are invalid, pretty printing the JSON report fails.
This removes pretty printing, pending a proper implementation, and tidies up MsgPack deserialization.
This implement `GetRegionCode` accordingly to RE. I've added a setting in the GUI and a field in the Configuration file with a way to update the Configuration file if needed.
* Start of JIT garbage collection improvements
- thread static pool for Operand, MemoryOperand, Operation
- Operands and Operations are always to be constructed via their static
helper classes, so they can be pooled.
- removing LinkedList from Node for sources/destinations (replaced with
List<>s for now, but probably could do arrays since size is bounded)
- removing params constructors from Node
- LinkedList<> to List<> with Clear() for Operand assignments/uses
- ThreadStaticPool is very simple and basically just exists for the
purpose of our specific translation allocation problem. Right now it
will stay at the worst case allocation count for that thread (so far) -
the pool can never shrink.
- Still some cases of Operand[] that haven't been removed yet. Will need
to evaluate them (eg. is there a reasonable max number of params for
Calls?)
* ConcurrentStack instead of ConcurrentQueue for Rejit
* Optimize some parts of LSRA
- BitMap now operates on 64-bit int rather than 32-bit
- BitMap is now pooled in a ThreadStatic pool (within lrsa)
- BitMap now is now its own iterator. Marginally speeds up iterating
through the bits.
- A few cases where enumerators were generated have been converted to
forms that generate less garbage.
- New data structure for sorting _usePositions in LiveIntervals. Much
faster split, NextUseAfter, initial insertion. Random insertion is
slightly slower.
- That last one is WIP since you need to insert the values backwards. It
would be ideal if it just flipped it for you, uncomplicating things on
the caller side.
* Use a static pool of thread static pools. (yes.)
Prevents each execution thread creating its own lowCq pool and making me cry.
* Move constant value to top, change naming convention.
* Fix iteration of memory operands.
* Increase max thread count.
* Address Feedback
REV8 only added changes on performance buffer and wavebuffer dsp command
generation.
As we don't support any of those, we can just increment the revision
number for now.
* Implement Jump Table for Native Calls
NOTE: this slows down rejit considerably! Not recommended to be used
without codegen optimisation or AOT.
- Does not work on Linux
- A32 needs an additional commit.
* A32 Support
(WIP)
* Actually write Direct Call pointers to the table
That would help.
* Direct Calls: Rather than returning to the translator, attempt to keep within the native stack frame.
A return to the translator can still happen, but only by exceptionally
bubbling up to it.
Also:
- Always translate lowCq as a function. Faster interop with the direct
jumps, and this will be useful in future if we want to do speculative
translation.
- Tail Call Detection: after the decoding stage, detect if we do a tail
call, and avoid translating into it. Detected if a jump is made to an
address outwith the contiguous sequence of blocks surrounding the entry
point. The goal is to reduce code touched by jit and rejit.
* A32 Support
* Use smaller max function size for lowCq, fix exceptional returns
When a return has an unexpected value and there is no code block
following this one, we now return the value rather than continuing.
* CompareAndSwap (buggy)
* Ensure CompareAndSwap does not get optimized away.
* Use CompareAndSwap to make the dynamic table thread safe.
* Tail call for linux, throw on too many arguments.
* Combine CompareAndSwap 128 and 32/64.
They emit different IR instructions since their PreAllocator behaviour
is different, but now they just have one function on EmitterContext.
* Fix issues separating from optimisations.
* Use a stub to find and execute missing functions.
This allows us to skip doing many runtime comparisons and branches, and reduces the amount of code we need to emit significantly.
For the indirect call table, this stub also does the work of moving in the highCq address to the table when one is found.
* Make Jump Tables and Jit Cache dynmically resize
Reserve virtual memory, commit as needed.
* Move TailCallRemover to its own class.
* Multithreaded Translation (based on heuristic)
A poor one, at that. Need to get core count for a better one, which
means a lot of OS specific garbage.
* Better priority management for background threads.
* Bound core limit a bit more
Past a certain point the load is not paralellizable and starts stealing from the main thread. Likely due to GC, memory, heap allocation thread contention. Reduce by one core til optimisations come to improve the situation.
* Fix memory management on linux.
* Temporary solution to some sync problems.
This will make sure threads exit correctly, most of the time. There is a potential race where setting the sync counter to 0 does nothing (counter stays at what it was before, thread could take too long to exit), but we need to find a better way to do this anyways. Synchronization frequency has been tightened as we never enter blockwise segments of code. Essentially this means, check every x functions or loop iterations, before lowcq blocks existed and were worth just as much. Ideally it should be done in a better way, since functions can be anywhere from 1 to 5000 instructions. (maybe based on host timer, or an interrupt flag from a scheduler thread)
* Address feedback minus CompareAndSwap change.
* Use default ReservedRegion granularity.
* Merge CompareAndSwap with its V128 variant.
* We already got the source, no need to do it again.
* Make sure all background translation threads exit.
* Fix CompareAndSwap128
Detection criteria was a bit scuffed.
* Address Comments.