Cutlass
CUDA Templates for Linear Algebra Subroutines and Solvers
#include <hgemm_swizzle.h>
Public Types
typedef GlobalIterator_ | GlobalIterator |
The global iterator.
typedef GlobalIterator::Fragment | Fragment |
The source fragment.
typedef GlobalIterator::FragmentShape | FragmentShape |
The shape of the source fragment.
typedef Fragment | InputFragment |
The input fragment.
typedef Fragment | OutputFragment |
The output fragment.
Public Member Functions
CUTLASS_DEVICE | HgemmSwizzle () |
The src/dst must be half fragments.
CUTLASS_DEVICE void | transform (Fragment const &src, Fragment &dst) |
Transform a fragment.
typedef GlobalIterator::Fragment cutlass::gemm::HgemmSwizzle< GlobalIterator_ >::Fragment |
typedef GlobalIterator::FragmentShape cutlass::gemm::HgemmSwizzle< GlobalIterator_ >::FragmentShape |
typedef GlobalIterator_ cutlass::gemm::HgemmSwizzle< GlobalIterator_ >::GlobalIterator |
typedef Fragment cutlass::gemm::HgemmSwizzle< GlobalIterator_ >::InputFragment |
typedef Fragment cutlass::gemm::HgemmSwizzle< GlobalIterator_ >::OutputFragment |
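CUTLASS_DEVICE cutlass::gemm::HgemmSwizzle< GlobalIterator_ >::HgemmSwizzle () | inline |
Constructor. The number of elements must be a multiple of 2.
CUTLASS_DEVICE void cutlass::gemm::HgemmSwizzle< GlobalIterator_ >::transform (Fragment const &src, Fragment &dst) | inline |
Transform a fragment.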
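The two constraints stated on this page (half-typed fragments, an even number of elements) and the identical InputFragment/OutputFragment types can be illustrated with a small host-side C++ sketch. Everything in it is hypothetical and is not part of CUTLASS: ToyHalf, ToyFragment, ToySwizzle, the kElements constant, and swizzle_demo() are stand-ins invented for illustration. The real transform() is a device function whose permutation is not described here, so the pairwise swap below is only a placeholder assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical 16-bit stand-in for CUDA's half type (raw bits only).
struct ToyHalf {
  std::uint16_t bits;
};

// Hypothetical fragment: a fixed-size array of elements owned by one thread.
template <typename Element_, std::size_t kElements_>
struct ToyFragment {
  typedef Element_ Element;
  static const std::size_t kElements = kElements_;
  Element storage[kElements_];
  Element& operator[](std::size_t i) { return storage[i]; }
  Element const& operator[](std::size_t i) const { return storage[i]; }
};

// Hypothetical swizzle mirroring the typedefs and constraints listed above.
template <typename Fragment_>
struct ToySwizzle {
  typedef Fragment_ Fragment;
  typedef Fragment InputFragment;   // input and output share one type
  typedef Fragment OutputFragment;

  // The src/dst must be half fragments.
  static_assert(std::is_same<typename Fragment::Element, ToyHalf>::value,
                "src/dst must be half fragments");
  // The number of elements must be a multiple of 2.
  static_assert(Fragment::kElements % 2 == 0,
                "number of elements must be a multiple of 2");

  // Placeholder permutation (an assumption): swap the elements of each pair.
  void transform(Fragment const& src, Fragment& dst) const {
    for (std::size_t i = 0; i < Fragment::kElements; i += 2) {
      dst[i] = src[i + 1];
      dst[i + 1] = src[i];
    }
  }
};

typedef ToyFragment<ToyHalf, 4> Frag4;

// Fills a fragment with bit patterns 0..3 and returns the swizzled copy.
inline Frag4 swizzle_demo() {
  Frag4 src;
  Frag4 dst;
  for (std::size_t i = 0; i < Frag4::kElements; ++i) {
    src[i].bits = static_cast<std::uint16_t>(i);
    dst[i].bits = 0;
  }
  ToySwizzle<Frag4> swizzle;
  swizzle.transform(src, dst);
  assert(dst[0].bits == src[1].bits);  // first pair was swapped
  return dst;
}
```

In this sketch, instantiating ToySwizzle with an odd-sized fragment such as ToyFragment<ToyHalf, 3>, or with a non-half element type, fails at compile time through the corresponding static_assert, which is one plausible way to enforce the requirements quoted above.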