
[TKW] Kernel Cacher #329

Merged
merged 8 commits into iree-org:main
Dec 12, 2024
Conversation

raikonenfnu (Contributor)

For eager mode to be viable, we need to implement a kernel cacher so that we do not have to recompile kernels on every invocation. Here are the main changes:

  1. Refactor wave.py's compile_and_invoke into two separate functions, compile_to_vmfb and invoke_vmfb, so that we can intercept the compiled vmfb cleanly and store it in the caches.
  2. Implement a kernel cache dataclass, a struct holding everything needed to reconstruct a kernel so that it is invokable in its original state.
  3. Implement a function to invoke a cached kernel.
  4. Implement a kernel cache manager that can hash kernels and load/store them to RAM and to files.
  5. Add tests and helper functions for the cache manager.
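A minimal sketch of the split described in point 1. The signatures, bodies, and helper names here are illustrative stand-ins, not the PR's exact API; the point is that the vmfb becomes an interceptable artifact between the two calls.

```python
# Sketch of splitting compile_and_invoke; all names/signatures are
# illustrative, not the PR's exact API.

def compile_to_vmfb(mlir_module: str, options: dict) -> bytes:
    """Compile the kernel to a vmfb blob (the artifact we can cache)."""
    # ... invoke the compiler here; return the serialized vmfb bytes ...
    return b"vmfb-bytes"

def invoke_vmfb(vmfb: bytes, args: list):
    """Load an already-compiled vmfb into the runtime and launch it."""
    # ... runtime load + dispatch would go here ...

def compile_and_invoke(mlir_module: str, options: dict, args: list):
    # The original entry point becomes a thin wrapper; a cache manager
    # can now intercept (store or substitute) the vmfb between the calls.
    vmfb = compile_to_vmfb(mlir_module, options)
    invoke_vmfb(vmfb, args)
```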

A little more detail on the newly developed cache manager. The Wave cache manager has two main components/caches:

  1. Session/online cache - the main cache that our compiler and runtime load from and store to. It is essentially a dict keyed by the kernel hash, with WaveCache objects as values. We added LRU eviction with a limit on the number of kernels cached here, because this cache lives in RAM and we do not want to run out of memory.

  2. File/offline cache - this cache persists saved/compiled kernels between sessions/runs. We store the vital kernel information (vmfb, kernel_sig, and mlir) in a CACHE_BASE_DIR/kernel_hash directory. If a kernel is queried during a new run and does not yet exist in the session/online cache, we load the files from its kernel_hash directory and reconstruct the WaveCache from them.
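The session/online cache described in point 1 can be sketched with an `OrderedDict`. This is a simplified illustration, not the PR's implementation: the capacity constant, class name, and entry type are all assumptions.

```python
# Sketch of the in-RAM session cache with LRU eviction described above.
# Names and the capacity limit are illustrative assumptions.
from collections import OrderedDict

MAX_SESSION_KERNELS = 64  # assumed limit on kernels kept in RAM

class SessionCache:
    """kernel_hash -> cache-entry map with least-recently-used eviction."""

    def __init__(self, capacity=MAX_SESSION_KERNELS):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, kernel_hash):
        if kernel_hash not in self.entries:
            return None
        self.entries.move_to_end(kernel_hash)  # mark as recently used
        return self.entries[kernel_hash]

    def put(self, kernel_hash, entry):
        self.entries[kernel_hash] = entry
        self.entries.move_to_end(kernel_hash)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
```

On a miss here, the manager would fall back to the file/offline cache (point 2) and repopulate this dict.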

```python
    Get a unique identifier for a given kernel.
    """
    key = [
        inspect.getsource(kernel_body),
```
Contributor:

Function source code is not always possible to retrieve, and inspect.getsource will raise an error in that case. We can live with this limitation for now, but we need a way to bypass caching when it happens.

Contributor Author:

Do you know in which cases inspect.getsource will fail? I can add a check there.

Contributor Author:

For now I have wrapped it in try/except. Not the nicest aesthetically, but it should do the job!
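The try/except bypass mentioned above might look something like this sketch. The function name and hashing choice are illustrative, not the PR's actual code; the key idea is returning a sentinel so the caller can skip caching when no source is available.

```python
# Sketch of the try/except caching bypass; names are illustrative.
import hashlib
import inspect

def get_kernel_hash_or_none(kernel_body):
    """Return a cache key for the kernel, or None when caching must be bypassed."""
    try:
        source = inspect.getsource(kernel_body)
    except OSError:
        # No retrievable source (e.g. eval/exec-created functions, or
        # bytecode-only distributions): tell the caller to skip caching.
        return None
    return hashlib.sha256(source.encode()).hexdigest()
```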

Contributor:

The main case is when a function is constructed dynamically via eval/exec. It is also possible to distribute Python programs as bytecode (.pyc) without source code.
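A minimal demonstration of the eval/exec case described above: `inspect.getsource` raises `OSError` for a function created via `exec`, because there is no source file to read from.

```python
# Demonstration: inspect.getsource fails for exec-defined functions.
import inspect

namespace = {}
exec("def dynamic_kernel(): return 42", namespace)

try:
    inspect.getsource(namespace["dynamic_kernel"])
    source_available = True
except OSError:
    source_available = False

print(source_available)  # False: no retrievable source for exec-defined code
```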


Signed-off-by: Stanley Winata <[email protected]>
@harsh-nod (Contributor) left a comment:

LGTM! Thanks for putting this together; it will be a big boost to users. Just had some stylistic comments, but it looks good overall!

iree/turbine/kernel/wave/cache.py (resolved)
```python
cur_cache_dir = f"{CACHE_BASE_DIR}/{kernel_hash}"
if not os.path.exists(cur_cache_dir):
    os.makedirs(cur_cache_dir)
cur_vmfb_path = f"{cur_cache_dir}/{kernel_hash}.vmfb"
```
Contributor:

Minor: can you switch to using pathlib instead of manually concatenating the strings? It gives you lots of additional benefits.

Contributor Author:

done!
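The pathlib version of the quoted snippet might look like this sketch. `CACHE_BASE_DIR` and `kernel_hash` stand in for the PR's real values (a temp directory and a dummy hash are used here so the snippet is self-contained).

```python
# pathlib rewrite of the string-concatenation snippet above.
# CACHE_BASE_DIR and kernel_hash are stand-in values for illustration.
import tempfile
from pathlib import Path

CACHE_BASE_DIR = Path(tempfile.gettempdir()) / "wave_cache_demo"
kernel_hash = "deadbeef"

cur_cache_dir = CACHE_BASE_DIR / kernel_hash
cur_cache_dir.mkdir(parents=True, exist_ok=True)  # replaces the exists() + makedirs() dance
cur_vmfb_path = cur_cache_dir / f"{kernel_hash}.vmfb"
```

Besides cleaner joins, `Path` objects give you `.exists()`, `.read_bytes()`/`.write_bytes()`, and portable separators for free.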

"""
Save given kernel(vmfb, kernel_sig, and MLIR) into session_cache and file/offline cache.
"""
if not WAVE_CACHE_ON or not kernel_hash:
Contributor:

If WAVE_ALWAYS_COMPILE=1, should we store the kernel to the cache?

Contributor Author:

I think we can still store it to the file cache, so that we can use it to debug (Triton has the exact same flag, which I have found useful). But I can make sure it doesn't store to the online/session cache.
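The behavior described above can be sketched as follows. All names here are illustrative (the real file cache writes to `CACHE_BASE_DIR`; a dict stands in for it): with `WAVE_ALWAYS_COMPILE=1`, kernels still go to the file cache for debugging but stay out of the in-RAM session cache.

```python
# Sketch: WAVE_ALWAYS_COMPILE keeps kernels out of the RAM cache only.
# Names are illustrative; a dict stands in for the on-disk store.
import os

session_cache = {}  # in-RAM cache (kernel_hash -> entry)
file_cache = {}     # stand-in for the on-disk CACHE_BASE_DIR store

def store_kernel(kernel_hash, entry):
    always_compile = os.environ.get("WAVE_ALWAYS_COMPILE") == "1"
    file_cache[kernel_hash] = entry           # always persist, handy for debugging
    if not always_compile:
        session_cache[kernel_hash] = entry    # skip RAM cache when recompiling is forced
```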

iree/turbine/kernel/wave/wave.py (resolved)
@raikonenfnu merged commit d9be797 into iree-org:main on Dec 12, 2024
10 checks passed