Skip to content

Instantly share code, notes, and snippets.

View navyxliu's full-sized avatar
🍉
watermelon

xin liu navyxliu

🍉
watermelon
View GitHub Profile
CSE of MLIR is pretty much hash-based global value numbering. For each region, it Depth-First-Search(DFS) the dominator tree using a very efficient implementation.
https://github.com/llvm/llvm-project/blob/main/mlir/lib/Transforms/CSE.cpp#L333-L359
I learned is that we don't need to push all children to the stack. we can push just one child and increase the iterator!
currentNode->childIterator++.
There are 2 benefits:
1) worst space complexity is the height of the tree.
2) this code accesses memory sequentially.
@navyxliu
navyxliu / pacman.md
Created January 5, 2026 19:55
In SIMD programming, we express loops like pac-man.

In each loop iteration, the induction variable x increases BITE.

for (int x = 0; x < xsize; x+=BITE) {
   size_t x_size = min(xsize-x, BITE); 
   ...
}

It's just SIMD style programming. We want to use SIMD engine, but there are always remainder that is not aligned with VLEN.

there's another bug in remove-dead-values.
the problem is liveness analysis depends on mlir::dataflow::DeadCodeAnalysis.
without any reference of @main2, it doesn't participate progation of liveness. As a result, it has no idea arg0 and %arg1 of @immutable_fn_return_void_with_unused_argument.
RDV needs to delete @main2 entirely.
➜ build git:(backward_remove_dead_values) cat in3.mlir
// /bin/mlir-opt --remove-dead-values --debug-only=remove-dead-values --debug-only=liveness-analysis --debug-only=dataflow ./in3.mlir
func.func public @immutable_fn_return_void_with_unused_argument(%arg0: i32, %arg1: memref<4xf32>) -> () {
return
@navyxliu
navyxliu / input.mlir
Last active September 22, 2025 01:32
The remove-dead-value bug.
input.mlir
```
module @return_void_with_unused_argument {
// the function is immutable because it is public.
func.func public @immutable_fn_return_void_with_unused_argument(%arg0: i32, %unused: i32) -> () {
%sum = arith.addi %arg0, %arg0 : i32
%c0 = arith.constant 0 : index
%buf = memref.alloc() : memref<1xi32>
memref.store %sum, %buf[%c0] : memref<1xi32>
return
@navyxliu
navyxliu / vectorize_scf_for.mlir
Created July 19, 2025 17:05
vectorize_scf_for.mlir
module {
func.func @local_sparse_attention_kernel_03_bf(%arg0: i32, %arg1: i32, %arg2: i32, %arg3: i32, %arg4: i32, %arg5: i32, %arg6: i32, %arg7: !tt.ptr<f16>, %arg8: !tt.ptr<f16>, %arg9: !tt.ptr<f16>, %arg10: !tt.ptr<f16>) {
%c16_i32 = arith.constant 16 : i32
%cst = arith.constant 0.000000e+00 : f32
%cst_1 = arith.constant 0xFF800000 : f32
%c0_i32 = arith.constant 0 : i32
%c1_i32 = arith.constant 1 : i32
%alloc_5 = memref.alloc() {alignment = 64 : i64} : memref<1x128xf16>
%alloc_7 = memref.alloc() {alignment = 64 : i64} : memref<1x128xf32>
@navyxliu
navyxliu / EmptyString.md
Created March 28, 2024 05:10
EmptyString Example
// ./build/linux-x86_64-server-fastdebug/jdk/bin/java -XX:+PrintCompilation -XX:CompileCommand=compileonly,EmptyString::main  -XX:CompileCommand=quiet -Xbatch -XX:+PrintOptoAssembly -XX:CompileCommand=IGVPrintLevel,EmptyString::main,3 EmptyString
class EmptyString {
  public static void main(String[] args) {
        String msg = "";
        int x = 0;

        for (int i = 0; i < 300_000; ++i)  {
          if (msg.length() > 0) {
            x++;
@navyxliu
navyxliu / async_log_dropped.rb
Created March 19, 2024 04:21
A scripp to count how many messages are dropped.
#!/usr/bin/env ruby
lines = ARGF.readlines
total = 0
lines.each do |line|
if line =~ /(\d+)\smessages dropped due to async logging/
num = $1.to_i
total = total + num
#print line
end
end
@navyxliu
navyxliu / UnderProfiledSubprocedure.java
Created February 10, 2024 06:11
C2 inliner rejects a callee because of under-profiled subprocedure.
// java -XX:CompileOnly='UnderProfiledSubprocedure.foo' -XX:+PrintInlining -XX:+PrintCompilation -XX:CompileCommand=quiet -Xbatch UnderProfiledSubprocedure
import java.util.ArrayList;
class UnderProfiledSubprocedure {
private static int ODD = 100;
public void foo(boolean cond) {
var x = new ArrayList<Integer>();
if (cond) { // the branch is only taken by ODD
x.add(0); // ArrayList::add(E) is the subprocedure. it will call ArrayList::add(E, Object[], int)
@navyxliu
navyxliu / .zshrc
Created February 6, 2024 22:07
handy scripts on macOS
; my handy scripts on MacOS
function realpath() {
[[ $1 = /* ]] && echo "$1" || echo "$PWD/${1#./}"
}
function jdk_select() {
if [ -e $1/bin/java ]; then
P=$(realpath $1)
export JAVA_HOME=$P
export PATH=$JAVA_HOME/bin:$PATH
@navyxliu
navyxliu / jsc.md
Last active October 31, 2023 20:51
Building your own jsc

troubleshooting

➜  WebKit git:(main) ./WebKitBuild/Debug/jsc
dyld[8692]: Symbol not found: __ZN3JSC27retrieveTypeImportAttributeEPNS_14JSGlobalObjectERKN3WTF7HashMapINS2_6RefPtrINS2_17UniquedStringImplENS2_12RawPtrTraitsIS5_EENS2_21DefaultRefDerefTraitsIS5_EEEENS2_6StringENS2_11DefaultHashISA_EENS2_10HashTraitsISA_EENSE_ISB_EENS2_15HashTableTraitsEEE
  Referenced from: <BD3D10DA-F74B-3755-8B54-29D1F10163E5> /Users/xxinliu/Devel/WebKit/WebKitBuild/Debug/jsc (built for macOS 14.0 which is newer than running OS)
  Expected in:     <957522FA-9B44-3C8F-9BD4-A209C728B133> /System/Library/Frameworks/JavaScriptCore.framework/Versions/A/JavaScriptCore
[1]    8692 abort      ./WebKitBuild/Debug/jsc

The problem here is JSC depends on WTF(who would name its library that? it seems the acronym of webkit's template framework). Dynamic loader has difficulty to resolve this symbol in WTF. we need to teach dynamic laoder to get from what you built