Cutlass
CUDA Templates for Linear Algebra Subroutines and Solvers
cutlass::TileTraitsWarpRake< Tile_, Threads > Struct Template Reference
Tiling in which warps rake across the contiguous dimension.
#include <tile_traits_standard.h>
Classes
struct ThreadOffset
Computes the thread offset in (H, W) based on thread ID.
Public Types
typedef Tile_ Tile
Shape of tile.
typedef Shape< 1, kWarpsStrided, kWarpsContiguous * kWarpSize > ThreadShape
Arrangement of threads.
typedef Shape< 1, kWarpsStrided, kWarpSize > Delta
The same warp rakes along the contiguous dimension.
typedef Shape< 1, Tile::kH / Delta::kH, Tile::kW / ThreadShape::kW > Iterations
Number of iterations.
Static Public Attributes
static int const kThreads = Threads
Number of participating threads.
static int const kWarpSize = 32
Hard-coded warp size.
static int const kWarpCount = kThreads / kWarpSize
Number of participating warps.
static int const kWarpsStrided = __NV_STD_MIN(kWarpCount, Tile::kH)
Warps strip-mined across the strided dimension.
static int const kWarpsContiguous = kWarpCount / kWarpsStrided
Warps strip-mined across the contiguous dimension.
Member Typedef Documentation
typedef Shape<1, kWarpsStrided, kWarpSize> cutlass::TileTraitsWarpRake< Tile_, Threads >::Delta
typedef Shape<1, Tile::kH / Delta::kH, Tile::kW / ThreadShape::kW> cutlass::TileTraitsWarpRake< Tile_, Threads >::Iterations
typedef Shape<1, kWarpsStrided, kWarpsContiguous * kWarpSize> cutlass::TileTraitsWarpRake< Tile_, Threads >::ThreadShape
typedef Tile_ cutlass::TileTraitsWarpRake< Tile_, Threads >::Tile
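The compile-time arithmetic behind these typedefs can be followed with a small standalone sketch. It does not include any CUTLASS header: `Shape` below is a hypothetical stand-in for the library's shape type, `WarpRakeSketch`, `WideTile`, `ShortTile` and `print_traits` are illustrative names, and `__NV_STD_MIN` is replaced by a plain conditional expression. Only the formulas are taken from the member list on this page, and the sketch assumes the thread count is a multiple of 32 and that the tile divides evenly.

```cpp
#include <cstdio>

// Hypothetical stand-in for a (D, H, W) shape type; NOT the CUTLASS definition.
template <int D_, int H_, int W_>
struct Shape {
  static int const kD = D_;
  static int const kH = H_;  // strided dimension
  static int const kW = W_;  // contiguous dimension
};

// Illustrative re-statement of the formulas listed for TileTraitsWarpRake.
template <typename Tile_, int Threads>
struct WarpRakeSketch {
  typedef Tile_ Tile;
  static int const kThreads = Threads;
  static int const kWarpSize = 32;
  static int const kWarpCount = kThreads / kWarpSize;
  // Stand-in for __NV_STD_MIN(kWarpCount, Tile::kH).
  static int const kWarpsStrided = (kWarpCount < Tile::kH) ? kWarpCount : Tile::kH;
  static int const kWarpsContiguous = kWarpCount / kWarpsStrided;

  typedef Shape<1, kWarpsStrided, kWarpsContiguous * kWarpSize> ThreadShape;
  typedef Shape<1, kWarpsStrided, kWarpSize> Delta;
  typedef Shape<1, Tile::kH / Delta::kH, Tile::kW / ThreadShape::kW> Iterations;
};

// 8 x 128 tile, 128 threads: 4 warps, all stacked along the strided dimension.
typedef WarpRakeSketch<Shape<1, 8, 128>, 128> WideTile;
static_assert(WideTile::kWarpsStrided == 4, "4 warps along H");
static_assert(WideTile::kWarpsContiguous == 1, "1 warp along W");
static_assert(WideTile::Iterations::kH == 2, "8 / 4 steps along H");
static_assert(WideTile::Iterations::kW == 4, "128 / 32 steps along W");

// 2 x 128 tile, 128 threads: only 2 rows, so 2 warps share each row.
typedef WarpRakeSketch<Shape<1, 2, 128>, 128> ShortTile;
static_assert(ShortTile::kWarpsStrided == 2, "2 warps along H");
static_assert(ShortTile::kWarpsContiguous == 2, "2 warps along W");
static_assert(ShortTile::Iterations::kH == 1, "2 / 2 steps along H");
static_assert(ShortTile::Iterations::kW == 2, "128 / 64 steps along W");

// Prints the derived constants for any instantiation of the sketch.
template <typename Traits>
void print_traits(char const *label) {
  std::printf("%s: warps=%d (strided=%d, contiguous=%d), iterations=(%d, %d)\n",
              label, Traits::kWarpCount, Traits::kWarpsStrided,
              Traits::kWarpsContiguous, Traits::Iterations::kH,
              Traits::Iterations::kW);
}
```

Calling `print_traits<WideTile>("8x128")` from a `main` function prints the same constants that the `static_assert` lines check. In both instantiations the product of thread count and iterations equals the number of tile elements (128 × 8 = 1024 and 128 × 2 = 256), so under these assumptions each element is visited exactly once.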