| // A64FX SVE 1.0 High-Performance FP16 GEMM with FP32 Accumulation | |
| // Target: ~160% of FP32 peak (~80% of FP16 peak) | |
| // | |
| // Strategy: | |
| // - Use FP16 FMLA directly for maximum throughput (32 ops/instruction) | |
| // - Accumulate in FP16 for K_INTERVAL iterations | |
| // - Convert to FP32 and add to FP32 accumulators periodically | |
| // - Software pipeline: overlap loads, converts, and FMAs | |
| // | |
| // A64FX FP16 peak: ~264 GFLOPS/core (2x FP32) |
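The accumulation pattern described in the strategy comments (accumulate in a narrow type for `K_INTERVAL` iterations, then fold into a wider accumulator) can be sketched in portable C++. This is only an illustration under stated assumptions, not the SVE kernel itself: `float` stands in for the FP16 accumulator, `double` stands in for the FP32 accumulator, and the names `BlockedDot` and `k_interval` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Two-level accumulation sketch for a dot product.
// Assumption: `float` plays the role of the narrow (FP16) accumulator and
// `double` plays the role of the wide (FP32) accumulator.
double BlockedDot(const std::vector<float>& a, const std::vector<float>& b,
                  std::size_t k_interval) {
    assert(a.size() == b.size());
    if (k_interval == 0) k_interval = 1;  // avoid modulo by zero

    double wide = 0.0;     // long-lived, higher-precision accumulator
    float narrow = 0.0f;   // short-lived, lower-precision accumulator

    for (std::size_t k = 0; k < a.size(); ++k) {
        narrow += a[k] * b[k];
        // Every k_interval products, fold the narrow sum into the wide one
        // so rounding error in the narrow type cannot grow unbounded.
        if ((k + 1) % k_interval == 0) {
            wide += static_cast<double>(narrow);
            narrow = 0.0f;
        }
    }
    wide += static_cast<double>(narrow);  // flush the remaining tail
    return wide;
}
```

A smaller `k_interval` limits error growth in the narrow accumulator at the cost of more conversions; the right trade-off for the real kernel depends on details not shown in this preview.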
| #include <iostream> | |
| #include <string> | |
| static int _ReadUTF8(char const *&cp, std::string *errMsg) | |
| { | |
| // Return a byte with the high `n` bits set, rest clear. | |
| auto highBits = [](int n) { | |
| return static_cast<unsigned char>(((1 << n) - 1) << (8 - n)); | |
| }; |
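To make the `highBits` helper concrete, the sketch below uses the same formula to classify a UTF-8 lead byte. The rest of `_ReadUTF8` is not shown in this preview, so `SequenceLength` is a hypothetical helper and not necessarily how the original function proceeds.

```cpp
#include <cassert>

// Return a byte with the high `n` bits set, rest clear
// (same formula as the lambda in the snippet). Valid for 0 <= n <= 8.
inline unsigned char HighBits(int n) {
    assert(n >= 0 && n <= 8);
    return static_cast<unsigned char>(((1 << n) - 1) << (8 - n));
}

// Hypothetical helper: number of bytes in the UTF-8 sequence introduced by
// `lead`, or 0 if `lead` is a continuation byte or not a valid lead pattern.
// It only inspects the bit pattern; it does not reject overlong encodings
// or code points beyond U+10FFFF.
inline int SequenceLength(unsigned char lead) {
    if ((lead & HighBits(1)) == 0) return 1;            // 0xxxxxxx
    if ((lead & HighBits(3)) == HighBits(2)) return 2;  // 110xxxxx
    if ((lead & HighBits(4)) == HighBits(3)) return 3;  // 1110xxxx
    if ((lead & HighBits(5)) == HighBits(4)) return 4;  // 11110xxx
    return 0;                                           // 10xxxxxx or invalid
}
```

Masking with `HighBits(n + 1)` and comparing against `HighBits(n)` checks that the top `n` bits are set and the next bit is clear, which is exactly the UTF-8 lead-byte pattern for an `n`-byte sequence.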
| #!/bin/bash | |
| VERSION="1.0.2" | |
| # Interface connected to our LAN | |
| INTERFACE="eth0" | |
| # Virtual interface for incoming traffic | |
| VIRTUAL="ifb0" | |
| # Set the direction (1 = outgoing only, 2 = incoming only, 3 = both) | |
| DIRECTION=3 | |
| # Speed |
| // Issue | |
| // * Allocate and provide an insufficient data size (e.g. vertex data) for GAS input. | |
| // * Call `optixAccelBuild` with the "correct" buffer size (e.g. `buildInput.triangleArray.numVertices`) | |
| // * The GAS is built against invalid data, but no error is reported at this point | |
| // * Some memory objects cause an 'illegal memory access' error when freed (this also happens with the CUDA driver API) | |
| // How to reproduce | |
| // | |
| // OptiX 7.1 optixHair SDK sample | |
| // Uncomment the following in Util.h |
| #include <arm_neon.h> | |
| #include <cstdio> | |
| #include <limits> | |
| #include <cassert> | |
| #include <cmath> | |
| #include <cstdint> | |
| bool check_snan(float f) | |
| { | |
| bool is_nan = std::isnan(f); |
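The body of `check_snan` is cut off in this preview. As a rough sketch of one way to detect a signaling NaN, the code below inspects the IEEE 754 binary32 bit pattern directly. It assumes the common convention (used on Arm and x86) that the most significant mantissa bit is the quiet bit; `IsSignalingNaNBits` and `IsSignalingNaN` are hypothetical names, not the original implementation.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559,
              "sketch assumes IEEE 754 binary32 floats");
static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// True if `bits` encodes a signaling NaN: exponent all ones, non-zero
// mantissa, and the quiet bit (top mantissa bit) clear.
// Assumption: quiet bit set means quiet NaN, as on Arm and x86.
inline bool IsSignalingNaNBits(std::uint32_t bits) {
    const std::uint32_t exponent = bits & 0x7F800000u;
    const std::uint32_t mantissa = bits & 0x007FFFFFu;
    const std::uint32_t quiet_bit = bits & 0x00400000u;
    return exponent == 0x7F800000u && mantissa != 0u && quiet_bit == 0u;
}

// Same check for a float value; memcpy avoids aliasing problems and does
// not perform any floating-point operation on the value.
inline bool IsSignalingNaN(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return IsSignalingNaNBits(bits);
}
```

`std::isnan` alone cannot distinguish quiet from signaling NaNs, which is why the bit pattern is examined here. Note that some operations or calling conventions may quiet a signaling NaN before the check sees it.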
| // | |
| // C++ implementation of "A simple method to construct isotropic quasirandom blue | |
| // noise point sequences" | |
| // | |
| // http://extremelearning.com.au/a-simple-method-to-construct-isotropic-quasirandom-blue-noise-point-sequences/ | |
| // | |
| // Assume 0 <= x | |
| static double myfmod(double x) { return x - std::floor(x); } |
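The helper above returns the fractional part of a non-negative value, which is the wrap-around step of an additive-recurrence sequence. The sketch below shows how it could be used to generate points of the 2D R2 sequence. This assumes the implementation builds on R2, and it omits the jitter that produces the blue-noise property; `R2Point` and the 0.5 seed are illustrative choices, since the rest of the file is not shown.

```cpp
#include <cassert>
#include <cmath>
#include <utility>

// Fractional part of x. Assumes 0 <= x, as in the original helper.
static double myfmod(double x) {
    assert(x >= 0.0);
    return x - std::floor(x);
}

// Hypothetical helper: n-th point of the 2D R2 additive-recurrence sequence
// (no jitter applied). Both coordinates lie in [0, 1).
static std::pair<double, double> R2Point(unsigned n) {
    const double g = 1.32471795724474602596;  // plastic constant: g^3 = g + 1
    const double a1 = 1.0 / g;
    const double a2 = 1.0 / (g * g);
    return {myfmod(0.5 + a1 * n), myfmod(0.5 + a2 * n)};
}
```

For example, `R2Point(0)` is `(0.5, 0.5)`, and each later point advances by `(a1, a2)` and wraps back into the unit square.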
| // Is writing (&x)[i] in operator[] safe/correct C++ code or not? | |
| // The following is a reduced code fragment (so it will not work by just copy & paste) for which the Intel C++ compiler (ver 13 and 15) miscompiles, in Release builds only, the access to a real3 object through operator[] inside an OpenMP loop. | |
| // clang and gcc compile and run it correctly | |
| typedef float real; | |
| struct real3 { | |
| real3() {} | |
| real3(real xx, real yy, real zz) { | |
| x = xx; |
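On the question in the header comment: indexing past `x` with `(&x)[i]` for `i > 0` is pointer arithmetic beyond a single non-array object, which the C++ standard treats as undefined behaviour even when `x`, `y`, `z` happen to be laid out contiguously. A sketch of an `operator[]` that avoids the issue by selecting the member explicitly is shown below. It is an illustrative alternative rather than the original code, and unlike the original default constructor it zero-initialises the members.

```cpp
#include <cassert>
#include <cstddef>

typedef float real;

struct real3 {
    real x, y, z;

    real3() : x(0), y(0), z(0) {}
    real3(real xx, real yy, real zz) : x(xx), y(yy), z(zz) {}

    // Select the member explicitly instead of computing (&x)[i].
    real& operator[](std::size_t i) {
        assert(i < 3);
        return i == 0 ? x : (i == 1 ? y : z);
    }
    const real& operator[](std::size_t i) const {
        assert(i < 3);
        return i == 0 ? x : (i == 1 ? y : z);
    }
};
```

Another option is to store the components in a `real v[3]` member and expose `x`, `y`, `z` through accessors, which keeps indexing a plain array access.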
| #!/bin/bash | |
| rm -rf CMake* | |
| export NDK=/home/syoyo/local/android-ndk-r10e | |
| export SYSROOT=$NDK/platforms/android-21/arch-arm64 | |
| export CC="$NDK/toolchains/aarch64-linux-android-4.9/prebuilt/linux-x86_64/bin/aarch64-linux-android-gcc" | |
| export CXX="$NDK/toolchains/aarch64-linux-android-4.9/prebuilt/linux-x86_64/bin/aarch64-linux-android-g++" |
| #!/usr/bin/env python | |
| import ninja_syntax | |
| import glob | |
| import os | |
| cxx_files = [ | |
| "src/OptionParser.cpp" | |
| , "src/easywsclient.cpp" | |
| , "src/json_to_eson.cc" |