pytorchvideo.models.csn

pytorchvideo.models.csn.create_csn(*, input_channel=3, model_depth=50, model_num_class=400, dropout_rate=0, norm=<class 'torch.nn.modules.batchnorm.BatchNorm3d'>, activation=<class 'torch.nn.modules.activation.ReLU'>, stem_dim_out=64, stem_conv_kernel_size=(3, 7, 7), stem_conv_stride=(1, 2, 2), stem_pool=None, stem_pool_kernel_size=(1, 3, 3), stem_pool_stride=(1, 2, 2), stage_conv_a_kernel_size=(1, 1, 1), stage_conv_b_kernel_size=(3, 3, 3), stage_conv_b_width_per_group=1, stage_spatial_stride=(1, 2, 2, 2), stage_temporal_stride=(1, 2, 2, 2), bottleneck=<function create_bottleneck_block>, bottleneck_ratio=4, head_pool=<class 'torch.nn.modules.pooling.AvgPool3d'>, head_pool_kernel_size=(1, 7, 7), head_output_size=(1, 1, 1), head_activation=None, head_output_with_global_average=True)

Build a Channel-Separated Convolutional Network (CSN), as described in "Video classification with channel-separated convolutional networks", Du Tran, Heng Wang, Lorenzo Torresani, Matt Feiszli. ICCV 2019.

CSN follows a ResNet-style architecture with three parts: Stem, Stages, and Head. The three parts are assembled in the following order:

Input
  ↓
Stem
  ↓
Stage 1
  ↓
  .
  .
  .
  ↓
Stage N
  ↓
Head
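
As a rough illustration of how a clip shrinks on its way from Stem to Head, the sketch below applies the default strides from the signature above to an example clip of 4 frames at 112x112. It assumes that each strided layer maps a length n to ceil(n / stride) and that no stem pooling is used (stem_pool=None). feature_map_size is a hypothetical helper written for this example, not part of pytorchvideo.

```python
import math

# Defaults copied from the create_csn signature above.
STEM_CONV_STRIDE = (1, 2, 2)          # (T, H, W)
STAGE_SPATIAL_STRIDE = (1, 2, 2, 2)   # one entry per stage
STAGE_TEMPORAL_STRIDE = (1, 2, 2, 2)  # one entry per stage


def feature_map_size(clip_size):
    """Estimate the (T, H, W) size of the feature map that reaches the head.

    Assumptions: every strided layer maps a length n to ceil(n / stride),
    and the stem has no pooling layer (stem_pool=None).
    """
    t, h, w = clip_size

    # Stem convolution.
    st, sh, sw = STEM_CONV_STRIDE
    t, h, w = math.ceil(t / st), math.ceil(h / sh), math.ceil(w / sw)

    # Stages 1..N, each with its own temporal and spatial stride.
    for temporal, spatial in zip(STAGE_TEMPORAL_STRIDE, STAGE_SPATIAL_STRIDE):
        t = math.ceil(t / temporal)
        h = math.ceil(h / spatial)
        w = math.ceil(w / spatial)

    return (t, h, w)


print(feature_map_size((4, 112, 112)))
```

Under these assumptions the example clip reaches the head as a 1 x 7 x 7 feature map, which matches the default head_pool_kernel_size of (1, 7, 7).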

CSN uses depthwise convolution. To further reduce the computational cost, it uses low-resolution input (112x112), short clips (4 frames), and different striding and kernel sizes.
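
To give a sense of why depthwise convolution is cheaper, the sketch below counts the weights of a full 3D convolution and of a grouped one with the same kernel. The channel width of 64 is a hypothetical value chosen for illustration, and the relation groups = channels // width_per_group is an assumption about how stage_conv_b_width_per_group is turned into a group count. conv3d_weight_count is a helper defined here, not a pytorchvideo function.

```python
import math


def conv3d_weight_count(channels_in, channels_out, kernel_size, groups=1):
    """Number of weights in a 3D convolution, bias excluded."""
    return (channels_in // groups) * channels_out * math.prod(kernel_size)


channels = 64         # hypothetical inner width of a bottleneck block
kernel = (3, 3, 3)    # default stage_conv_b_kernel_size
width_per_group = 1   # default stage_conv_b_width_per_group

# Assumption: the number of groups is the channel count divided by the
# width of each group, so a width of 1 gives one group per channel.
groups = channels // width_per_group

full = conv3d_weight_count(channels, channels, kernel)
depthwise = conv3d_weight_count(channels, channels, kernel, groups=groups)
print(full, depthwise, full // depthwise)
```

With one channel per group, the weight count of the 3x3x3 convolution drops by a factor equal to the channel width.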

Parameters
  • input_channel (int) – number of channels for the input video clip.

  • model_depth (int) – the depth of the resnet. Options include: 50, 101, 152.

  • model_num_class (int) – the number of classes for the video dataset.

  • dropout_rate (float) – dropout rate.

  • norm (callable) – a callable that constructs normalization layer.

  • activation (callable) – a callable that constructs activation layer.

  • stem_dim_out (int) – output channel size of the stem.

  • stem_conv_kernel_size (tuple) – convolutional kernel size(s) of stem.

  • stem_conv_stride (tuple) – convolutional stride size(s) of stem.

  • stem_pool (callable) – a callable that constructs the stem pooling layer. The default is None.

  • stem_pool_kernel_size (tuple) – pooling kernel size(s).

  • stem_pool_stride (tuple) – pooling stride size(s).

  • stage_conv_a_kernel_size (tuple) – convolutional kernel size(s) for conv_a.

  • stage_conv_b_kernel_size (tuple) – convolutional kernel size(s) for conv_b.

  • stage_conv_b_width_per_group (int) – the width of each group for conv_b. Set it to 1 for depthwise convolution.

  • stage_spatial_stride (tuple) – the spatial stride for each stage.

  • stage_temporal_stride (tuple) – the temporal stride for each stage.

  • bottleneck (callable) – a callable that constructs bottleneck block layer. Examples include: create_bottleneck_block.

  • bottleneck_ratio (int) – the ratio between inner and outer dimensions for the bottleneck block.

  • head_pool (callable) – a callable that constructs resnet head pooling layer.

  • head_pool_kernel_size (tuple) – the pooling kernel size.

  • head_output_size (tuple) – the size of output tensor for head.

  • head_activation (callable) – a callable that constructs activation layer.

  • head_output_with_global_average (bool) – if True, perform global averaging on the head output.

Returns

(nn.Module) – the csn model.

Return type

torch.nn.modules.module.Module
