Cutlass
CUDA Templates for Linear Algebra Subroutines and Solvers
Defines iterator traits for efficiently loading fragments from and storing fragments to shared memory, specialized for WMMA GEMM.
#include "cutlass/wmma_matrix.h"