pytorchvideo.models.net

class pytorchvideo.models.net.Net(*, blocks)

Builds a general network for video recognition from a list of blocks, which are applied to the input in order.

Input
  ↓
Block 1
  ↓
  .
  .
  .
  ↓
Block N
  ↓

The ResNet builder can be found in create_resnet.

__init__(*, blocks)

Parameters: blocks (torch.nn.ModuleList) – the list of block modules.
Return type: None
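
The block-by-block flow in the diagram above can be sketched with plain torch.nn. This is a minimal illustration of the pattern, not pytorchvideo's own code: the class name SequentialBlocks, the layer sizes, and the four-class output are all hypothetical.

```python
import torch
from torch import nn


class SequentialBlocks(nn.Module):
    """Hypothetical stand-in: applies each block to the output of the previous one."""

    def __init__(self, *, blocks: nn.ModuleList) -> None:
        super().__init__()
        self.blocks = blocks

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            x = block(x)
        return x


# Assumed toy blocks; a real video model would use a stem, stages, and a head.
blocks = nn.ModuleList(
    [
        nn.Conv3d(3, 8, kernel_size=3, padding=1),  # stem-like block
        nn.ReLU(),
        nn.AdaptiveAvgPool3d(1),  # pool over time, height, and width
        nn.Flatten(),
        nn.Linear(8, 4),  # 4 hypothetical classes
    ]
)
net = SequentialBlocks(blocks=blocks)

clip = torch.randn(2, 3, 4, 16, 16)  # (batch, channels, time, height, width)
logits = net(clip)
```

Passing the blocks as a ModuleList rather than a plain Python list registers their parameters with the parent module, so they show up in net.parameters() and move with net.to(device).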

class pytorchvideo.models.net.DetectionBBoxNetwork(model, detection_head)

A general-purpose model that takes bounding boxes as part of its input.

__init__(model, detection_head)

Parameters

model (nn.Module) – a model that precedes the head. Ex: stem + stages.
detection_head (nn.Module) – a network head that takes the input bounding boxes and the outputs from the model.

forward(x, bboxes)

Parameters

x (torch.Tensor) – input tensor.
bboxes (torch.Tensor) – associated bounding boxes. The format is N*5 (Index, X_1, Y_1, X_2, Y_2) if using RoIAlign and N*6 (Index, x_ctr, y_ctr, width, height, angle_degrees) if using RoIAlignRotated.
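
As an illustration of the N*5 layout, the sketch below stacks per-clip boxes into a single tensor. It assumes that Index is the position of the clip within the batch, as in torchvision's roi_align; the helper name to_roi_format and the box values are hypothetical.

```python
import torch


def to_roi_format(boxes_per_clip):
    """Stack per-clip (K_i, 4) boxes into one (N, 5) tensor.

    Column 0 holds the clip's index in the batch (assumed meaning of "Index");
    columns 1-4 hold X_1, Y_1, X_2, Y_2.
    """
    rows = []
    for clip_idx, boxes in enumerate(boxes_per_clip):
        index = torch.full((boxes.shape[0], 1), float(clip_idx), dtype=boxes.dtype)
        rows.append(torch.cat([index, boxes], dim=1))
    return torch.cat(rows, dim=0)


# Made-up boxes: two for the first clip, one for the second.
boxes_per_clip = [
    torch.tensor([[10.0, 20.0, 50.0, 80.0], [0.0, 0.0, 30.0, 40.0]]),
    torch.tensor([[5.0, 5.0, 25.0, 45.0]]),
]
bboxes = to_roi_format(boxes_per_clip)
```

Here bboxes has one row per box across the whole batch, so N is the total number of boxes rather than the batch size.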

class pytorchvideo.models.net.MultiPathWayWithFuse(*, multipathway_blocks, multipathway_fusion, inplace=True)

Builds a multi-pathway block with fusion for video recognition. Each pathway contains its own blocks, and fusion layers combine features across the pathways.

Pathway 1  ... Pathway N
    ↓              ↓
 Block 1        Block N
    ↓⭠ --Fusion----↓

__init__(*, multipathway_blocks, multipathway_fusion, inplace=True)

Parameters

multipathway_blocks (nn.ModuleList) – list of models, one per pathway.
multipathway_fusion (nn.Module) – fusion model.
inplace (bool) – if True, update the input list directly without making a copy.

Return type

None
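
The per-pathway blocks, the fusion step, and the effect of inplace can be sketched as follows. This is a rough illustration under stated assumptions rather than the library's implementation: MultiPathSketch and ConcatIntoFirst are hypothetical names, and the fusion used here (concatenating the second pathway onto the first along the channel dimension) is only one possible choice.

```python
import torch
from torch import nn


class ConcatIntoFirst(nn.Module):
    """Hypothetical fusion: concatenate pathway 1 onto pathway 0 along channels."""

    def forward(self, xs):
        return [torch.cat([xs[0], xs[1]], dim=1), xs[1]]


class MultiPathSketch(nn.Module):
    """Illustrative stand-in, not the pytorchvideo class."""

    def __init__(self, *, multipathway_blocks, multipathway_fusion, inplace=True):
        super().__init__()
        self.multipathway_blocks = multipathway_blocks
        self.multipathway_fusion = multipathway_fusion
        self.inplace = inplace

    def forward(self, xs):
        # Reuse the caller's list when inplace is set, otherwise work on a copy.
        out = xs if self.inplace else list(xs)
        for i, block in enumerate(self.multipathway_blocks):
            out[i] = block(xs[i])
        if self.multipathway_fusion is not None:
            out = self.multipathway_fusion(out)
        return out


def make_inputs():
    # Two pathways, each shaped (batch, channels, time, height, width).
    return [torch.randn(1, 3, 4, 8, 8), torch.randn(1, 3, 4, 8, 8)]


blocks = nn.ModuleList([nn.Conv3d(3, 8, kernel_size=1), nn.Conv3d(3, 2, kernel_size=1)])

# inplace=False: the caller's list still holds the original 3-channel inputs.
copy_inputs = make_inputs()
copy_model = MultiPathSketch(
    multipathway_blocks=blocks, multipathway_fusion=ConcatIntoFirst(), inplace=False
)
copy_out = copy_model(copy_inputs)

# inplace=True: the caller's list entries are overwritten with the block outputs.
inplace_inputs = make_inputs()
inplace_model = MultiPathSketch(
    multipathway_blocks=blocks, multipathway_fusion=ConcatIntoFirst(), inplace=True
)
inplace_out = inplace_model(inplace_inputs)
```

In this sketch, inplace=True saves one list allocation per call but means the caller should not reuse the input list afterwards, since its entries now hold intermediate features.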