In this article, we will see how to use CRTP, std::variant and std::visit to improve our code's performance.
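
As a rough illustration of how these three pieces can fit together, here is a minimal sketch, assuming C++17. The names `Shape`, `Square`, `Rectangle`, `AnyShape` and `total_area` are hypothetical and chosen only for this example; they are not taken from the article. The idea is that CRTP resolves `area()` at compile time, while `std::variant` and `std::visit` let differently typed objects share one container without virtual functions.

```cpp
#include <variant>
#include <vector>

// CRTP base: the derived type is a template parameter, so area() is
// resolved at compile time without a virtual call.
template <typename Derived>
struct Shape {
    double area() const {
        return static_cast<const Derived&>(*this).area_impl();
    }
};

// Hypothetical example types, not from the original article.
struct Square : Shape<Square> {
    double side;
    explicit Square(double s) : side(s) {}
    double area_impl() const { return side * side; }
};

struct Rectangle : Shape<Rectangle> {
    double width, height;
    Rectangle(double w, double h) : width(w), height(h) {}
    double area_impl() const { return width * height; }
};

// A closed set of alternatives stored by value, with no heap
// allocation per element.
using AnyShape = std::variant<Square, Rectangle>;

// std::visit calls the generic lambda with the alternative that each
// variant currently holds.
double total_area(const std::vector<AnyShape>& shapes) {
    double sum = 0.0;
    for (const auto& shape : shapes) {
        sum += std::visit([](const auto& s) { return s.area(); }, shape);
    }
    return sum;
}
```

Calling `total_area` on a vector holding `Square{2.0}` and `Rectangle{3.0, 4.0}` sums the two areas. Whether this is actually faster than a virtual-function hierarchy depends on the compiler and the workload, so it is worth measuring before committing to the design.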
mlir-opt matmult.mlir -convert-linalg-to-loops -lower-affine -convert-scf-to-cf -convert-linalg-to-llvm -convert-memref-to-llvm -convert-func-to-llvm -reconcile-unrealized-casts > out.mlir
mlir-cpu-runner out.mlir -O3 -e main -entry-point-result=void --shared-libs=libmlir_runner_utils.dylib
.root {
  display: block;
  position: relative;
}
.lqip {
  image-rendering: pixelated;
  width: 100%;
  opacity: 1;
  transition: opacity 50ms 100ms ease-out;
import React, { Component } from 'react';
import styled from 'styled-components';
const Figure = styled.figure`
  height: 0;
  margin: 0;
  background-color: #efefef;
  position: relative;
  padding-bottom: ${props => props.ratio}%;
`;
SDK = xcrun -sdk macosx
all: compute.metallib compute
compute.metallib: Compute.metal
	# Metal intermediate representation (.air)
	$(SDK) metal -c -Wall -Wextra -std=osx-metal2.0 -o /tmp/Compute.air $^
	# Metal library (.metallib)
	$(SDK) metallib -o $@ /tmp/Compute.air
This is a short post that explains how to write a high-performance matrix multiplication program on modern processors. In this tutorial I will use a single core of the Skylake-client CPU with AVX2, but the principles in this post also apply to other processors with different instruction sets (such as AVX512).
Matrix multiplication is a mathematical operation that defines the product of