Aaryaman Vasishta jammm
@jammm
jammm / collect.md
Last active December 16, 2025 22:50
Collecting data on MIOpen and hipBLASLt shapes

Are you seeing slow performance when running your models in ComfyUI, SD WebUI, or any other PyTorch program on a Radeon 9070XT, AI Pro R9700, or Strix Halo (Radeon 8060S)? Then we need your help! Please send us performance logs captured while running your models. They will help us tune our libraries for better performance on your models.

Please set the following environment variables depending on your OS:

Windows (PowerShell)

$env:MIOPEN_ENABLE_LOGGING=1
$env:MIOPEN_ENABLE_LOGGING_CMD=1
$env:HIPBLASLT_LOG_MASK=32
$env:TORCH_BLAS_PREFER_HIPBLASLT=1
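
If you start your program from a Python wrapper instead of a shell, the same four variables can be passed to a child process. The sketch below is only an illustration: the variable names and values are the ones listed above, while the helper names `logging_env` and `run_with_logging` and the example command are hypothetical and not part of MIOpen, hipBLASLt, or PyTorch.

```python
import os
import subprocess

# Variable names and values copied from the list above.
LOGGING_VARS = {
    "MIOPEN_ENABLE_LOGGING": "1",
    "MIOPEN_ENABLE_LOGGING_CMD": "1",
    "HIPBLASLT_LOG_MASK": "32",
    "TORCH_BLAS_PREFER_HIPBLASLT": "1",
}


def logging_env(base=None):
    """Return a copy of `base` (default: the current environment) with the logging variables set."""
    env = dict(os.environ if base is None else base)
    env.update(LOGGING_VARS)
    return env


def run_with_logging(cmd, log_path):
    """Run `cmd` (hypothetical helper) with logging enabled, writing stdout and stderr to `log_path`."""
    with open(log_path, "w", encoding="utf-8") as log_file:
        return subprocess.run(
            cmd, env=logging_env(), stdout=log_file, stderr=subprocess.STDOUT
        )


# Example (placeholder command, adjust to your own program):
# run_with_logging(["python", "main.py"], "perf_log.txt")
```

Passing the variables through the child's environment means they are already in place when the child process starts, so the result does not depend on when the libraries read them.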
jammm / validate_pytorch_vroom.py
Last active October 30, 2025 22:18
PyTorch script that does a quick benchmark of GEMM, Conv and SDPA (with FA backend, to check aotriton usage)
import torch
from torch.nn.functional import scaled_dot_product_attention
from torch.nn.attention import SDPBackend
###############################################################################
# Check for GPU
###############################################################################
if not torch.cuda.is_available():
    raise SystemExit("CUDA GPU is not available. Please run on a CUDA-enabled device.")
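
The preview above stops before the benchmark itself. As a rough illustration of how a quick GEMM timing can be structured, here is a minimal sketch. It is not the gist's actual code: the helper names `gemm_flops`, `benchmark`, and `tflops` are hypothetical, and the helpers use only the standard library so they can be tried without a GPU.

```python
import time


def gemm_flops(m, n, k):
    """Floating-point operations for an (m x k) @ (k x n) matmul: one multiply and one add per term."""
    return 2 * m * n * k


def benchmark(fn, sync=None, warmup=3, iters=10):
    """Return the average wall-clock seconds per call of `fn`.

    `sync` is an optional callable that waits for queued device work to finish;
    without it, asynchronous GPU launches would be timed instead of the kernels.
    """
    for _ in range(warmup):
        fn()
    if sync is not None:
        sync()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    if sync is not None:
        sync()
    return (time.perf_counter() - start) / iters


def tflops(flops, seconds):
    """Convert an operation count and a duration into TFLOP/s."""
    return flops / seconds / 1e12


# Possible use with PyTorch (assumes a GPU build; the shapes are arbitrary):
# import torch
# a = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
# b = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
# secs = benchmark(lambda: a @ b, sync=torch.cuda.synchronize)
# print(f"{tflops(gemm_flops(4096, 4096, 4096), secs):.1f} TFLOP/s")
```

The warmup iterations keep one-time costs such as kernel selection out of the measurement, and synchronizing both before starting the timer and after the last call keeps the timed region aligned with the work actually done on the device.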
[557/563] Linking CXX shared library bin\torch_hip.dll
FAILED: [code=4294967295] bin/torch_hip.dll lib/torch_hip.lib
C:\WINDOWS\system32\cmd.exe /C "cd . && D:\jam\venv\Lib\site-packages\cmake\data\bin\cmake.exe -E vs_link_dll --msvc-ver=1944 --intdir=caffe2\CMakeFiles\torch_hip.dir --rc=C:\PROGRA~2\WI3CF2~1\10\bin\100261~1.0\x64\rc.exe --mt=C:\PROGRA~1\MICROS~1\2022\COMMUN~1\VC\Tools\Llvm\x64\bin\llvm-mt.exe --manifests -- D:\jam\venv\Lib\site-packages\_rocm_sdk_devel\lib\llvm\bin\lld-link.exe /nologo @CMakeFiles\torch_hip.rsp /out:bin\torch_hip.dll /implib:lib\torch_hip.lib /pdb:bin\torch_hip.pdb /dll /version:0.0 /machine:x64 /ignore:4049 /ignore:4217 /ignore:4099 /INCREMENTAL:NO && cd ."
LINK: command "D:\jam\venv\Lib\site-packages\_rocm_sdk_devel\lib\llvm\bin\lld-link.exe /nologo @CMakeFiles\torch_hip.rsp /out:bin\torch_hip.dll /implib:lib\torch_hip.lib /pdb:bin\torch_hip.pdb /dll /version:0.0 /machine:x64 /ignore:4049 /ignore:4217 /ignore:4099 /INCREMENTAL:NO /MANIFEST:EMBED,ID=2" failed (exit code 1)
jammm / gist:543a12eb04c925f07d8dfac5a28e900e
Last active July 26, 2025 16:08
hipblaslt gfx1200 failures
hipBLASLt version: 100000
hipBLASLt git version: faee7ce8fe
Query device success: there are 1 devices. (Target device ID is 0)
Device ID 0 : AMD Radeon Graphics gfx1200
with 17.1 GB memory, max. SCLK 2740 MHz, max. MCLK 1258 MHz, compute capability 12.0
maxGridDimX 2147483647, sharedMemPerBlock 65.5 KB, maxThreadsPerBlock 1024, warpSize 32
info: parsing of test data may take a couple minutes before any test output appears...
...
jammm / hipblaslt_logs_while_running_stable_fast_3d
Created July 23, 2025 12:43
hipBLASLt logs while trying to run stable-fast-3d on gfx1200+Linux
This file has been truncated, but you can view the full file.
(venv) nod@Shark49:~/jam/stable-fast-3d$ HIPBLASLT_LOG_LEVEL=5 HIPBLASLT_LOG_MASK=32 python gradio_app.py
Deleting /tmp/gradio
/home/nod/jam/stable-fast-3d/venv/lib/python3.12/site-packages/timm/models/layers/__init__.py:48: FutureWarning: Importing from timm.models.layers is deprecated, please import via timm.layers
warnings.warn(f"Importing from {__name__} is deprecated, please import via timm.layers", FutureWarning)
/home/nod/jam/stable-fast-3d/venv/lib/python3.12/site-packages/gradio_client/utils.py:1097: UserWarning: file() is deprecated and will be removed in a future version. Use handle_file() instead.
warnings.warn(
Running on local URL: http://127.0.0.1:7860
To create a public link, set `share=True` in `launch()`.
/home/nod/jam/stable-fast-3d/venv/lib/python3.12/site-packages/gradio/analytics.py:106: UserWarning: IMPORTANT: You are using gradio version 4.41.0, however version 4.44.1 is available, please upgrade.
diff --git a/CMakeLists.txt b/CMakeLists.txt
index 75aa3a572..a9ea115c1 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -585,6 +585,9 @@ endif()
 set(KERNELS_SOURCE_DIR ${PROJECT_SOURCE_DIR}/src/kernels)
+list(PREPEND CMAKE_PROGRAM_PATH
+ "C:/Users/jam/Downloads/bzip2-1.0.5-bin/bin")
### Keybase proof
I hereby claim:
* I am jammm on github.
* I am jammm (https://keybase.io/jammm) on keybase.
* I have a public key ASAWhnIMXNOj7bc3GEIp_IJZ1e3iblcttfTWeghKjV_3mAo
To claim this, I am signing this object: