tensormonk.layers¶
Attention, convolution, linear and vision layers.
CondConv2d¶
class CondConv2d(tensor_size: tuple, n_experts: int, filter_size: int, out_channels: int, strides: int = 1, pad: bool = True, groups: int = 1)[source]¶

Conditional Convolution ("CondConv: Conditionally Parameterized Convolutions for Efficient Inference").
- Parameters
tensor_size (tuple, required) – Input tensor shape in BCHW (None/any integer >0, channels, height, width).
n_experts (int, required) – number of expert kernels used for routing.
filter_size (tuple/int, required) – size of kernel, integer or tuple of length 2.
out_channels (int, required) – output tensor.size(1)
strides (int/tuple, optional) – integer or tuple of length 2 (default = 1).
pad (bool, optional) – When True, pads to replicate the input size when strides = 1 (default = True).
groups (int, optional) – Enables grouped convolution (default = 1).
- Return type
torch.Tensor
# TODO: Include normalization and activation similar to Convolution?
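A minimal usage sketch (assuming CondConv2d is importable from tensormonk.layers and is called like a standard nn.Module on a BCHW tensor; the sizes are illustrative):

>>> import torch
>>> from tensormonk.layers import CondConv2d
>>> cond_conv = CondConv2d(tensor_size=(1, 16, 32, 32), n_experts=4,
...                        filter_size=3, out_channels=32)
>>> x = torch.randn(2, 16, 32, 32)
>>> y = cond_conv(x)  # with pad=True and strides=1, spatial size is retained -> (2, 32, 32, 32)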
Attention Layers¶
All attention layers.
LocalAttention¶
class LocalAttention(tensor_size: tuple, filter_size: Union[int, tuple], out_channels: int, strides: int = 1, groups: int = 4, bias: bool = False, replicate_paper: bool = True, normalize_offset: bool = False, **kwargs)[source]¶

LocalAttention ("Stand-Alone Self-Attention in Vision Models").
- Parameters
tensor_size (tuple, required) – Input tensor shape in BCHW (None/any integer >0, channels, height, width).
filter_size (int/tuple, required) – size of kernel, integer or list/tuple of length 2.
out_channels (int, required) – output tensor.size(1)
strides (int/tuple, optional) – convolution stride (default = 1).
groups (int, optional) – enables grouped convolution (default = 4).
bias (bool, optional) – When True, key, query & value 1x1 convolutions have bias (default = False).
replicate_paper (bool, optional) – When False, the relative attention logic differs from that of the paper (default = True).
normalize_offset (bool, optional) – When True (and replicate_paper = False), normalizes the row and column offsets (default = False).
- Return type
torch.Tensor
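A minimal usage sketch (assuming the module is called like any nn.Module on a BCHW tensor; the chosen sizes are illustrative and keep out_channels divisible by groups):

>>> import torch
>>> from tensormonk.layers import LocalAttention
>>> attn = LocalAttention(tensor_size=(1, 32, 56, 56), filter_size=7,
...                       out_channels=64, strides=1, groups=4)
>>> x = torch.randn(2, 32, 56, 56)
>>> y = attn(x)  # BCHW output with out_channels = 64 channels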
SelfAttention¶
class SelfAttention(tensor_size: tuple, shrink: int = 8, scale_factor: float = 1.0, return_attention: bool = False, **kwargs)[source]¶

Self-Attention ("Self-Attention Generative Adversarial Networks").
- Parameters
tensor_size (tuple, required) – Input tensor shape in BCHW (None/any integer >0, channels, height, width).
shrink (int, optional) – Used to compute the output channels of key and query, i.e., int(tensor_size[1] / shrink) (default = 8).
scale_factor (float, optional) – Scale at which attention is computed (use scale_factor < 1 for speed). When scale_factor != 1, the input is rescaled using nearest-neighbor interpolation (default = 1).
return_attention (bool, optional) – When True, returns a tuple (output, attention) (default = False).
- Return type
torch.Tensor
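A minimal usage sketch (assuming the output keeps the input's BCHW shape, as in the SAGAN formulation; the sizes are illustrative):

>>> import torch
>>> from tensormonk.layers import SelfAttention
>>> attn = SelfAttention(tensor_size=(1, 64, 32, 32), shrink=8, scale_factor=0.5)
>>> x = torch.randn(2, 64, 32, 32)
>>> y = attn(x)  # key/query use int(64 / 8) = 8 channels; attention computed at half resolution
>>> attn = SelfAttention(tensor_size=(1, 64, 32, 32), return_attention=True)
>>> y, attention = attn(x)  # returns (output, attention) when return_attention=True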
LucasKanade¶
class LucasKanade(n_steps: int = 64, width: int = 15, sigma: Optional[int] = None)[source]¶

Lucas-Kanade tracking (based on "Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors").

A cleaner version based on the original repo, with some corrections (yx must be xy) and speed improvements.
- Parameters
- Return type
torch.Tensor
forward(frame_t0: torch.Tensor, frame_t1: torch.Tensor, points_xy: torch.Tensor)[source]¶

Tracks points_xy on frame_t0 to frame_t1.
- Parameters
frame_t0 (torch.Tensor) – 4D tensor of shape BCHW.
frame_t1 (torch.Tensor) – 4D tensor of shape BCHW.
points_xy (torch.Tensor) – 3D tensor of shape B x n_points x 2.
- Return type
torch.Tensor
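A minimal usage sketch of forward (the single-channel frames and the point coordinates are illustrative assumptions; only the tensor ranks follow from the documentation above):

>>> import torch
>>> from tensormonk.layers import LucasKanade
>>> tracker = LucasKanade(n_steps=64, width=15)
>>> frame_t0 = torch.rand(1, 1, 240, 320)                   # BCHW
>>> frame_t1 = torch.rand(1, 1, 240, 320)                   # BCHW
>>> points_xy = torch.tensor([[[160., 120.], [80., 60.]]])  # B x n_points x 2, in xy order
>>> tracked_xy = tracker(frame_t0, frame_t1, points_xy)     # tracked locations on frame_t1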