tilegym-cutile-autotuning
by NVIDIA
Use when adding, modifying, optimizing, or debugging CuTile autotuning code. Trigger signals: `exhaustive_search` / `replace_hints` / `hints_fn` / `cuda.tile.tune` in code, `autotune` in filenames, or correctness/performance issues in autotuned CuTile kernels. Covers: tune-once/cache/launch pattern, per-architecture configs (sm80 to sm120), parameter space design (tile sizes, occupancy, num_ctas), and 7 common pitfalls with solutions.
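The tune-once/cache/launch pattern named above can be illustrated with a small library-agnostic sketch. It deliberately uses no CuTile calls: `tune_once`, `launch`, the `measure` callback, and the config dictionaries are all hypothetical names chosen for illustration, and the real skill may structure this differently. The idea is only that the search over candidate configs runs once per (architecture, problem shape) key, and later launches reuse the cached winner.

```python
from typing import Callable, Dict, Hashable, Iterable

# Hypothetical in-process cache: (architecture, problem shape) -> best config.
_best_config_cache: Dict[Hashable, dict] = {}


def tune_once(
    key: Hashable,
    configs: Iterable[dict],
    measure: Callable[[dict], float],
) -> dict:
    """Return the fastest config for `key`, searching only on the first call.

    `measure` is an assumed callback that runs the kernel with one config
    and returns its cost (for example, elapsed seconds).
    """
    if key in _best_config_cache:
        return _best_config_cache[key]  # cache hit: skip the search

    best_cfg, best_cost = None, float("inf")
    for cfg in configs:
        cost = measure(cfg)
        if cost < best_cost:
            best_cfg, best_cost = cfg, cost

    if best_cfg is None:
        raise ValueError("no candidate configs were provided")

    _best_config_cache[key] = best_cfg
    return best_cfg


def launch(
    key: Hashable,
    configs: Iterable[dict],
    measure: Callable[[dict], float],
    kernel: Callable[[dict], object],
):
    """Tune if needed, then launch `kernel` with the cached best config."""
    return kernel(tune_once(key, configs, measure))
```

Including the architecture in the cache key (for example `("sm90", (1024, 1024))`) keeps a config tuned on one GPU generation from being reused on another, which is the motivation for per-architecture configs.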