"git@developer.sourcefind.cn:OpenDAS/pytorch3d.git" did not exist on "14bd5e28e8a0c93a82ae4e2152e85150dfcde6c7"
Unverified commit 9b432d07 authored by Vasilis Vryniotis, committed by GitHub

Update S3D weights (#6537)

* S3D weight deployment

* Update accuracies.

* Address review comments.
parent 8e0d1b95
@@ -81,6 +81,7 @@ Video resnet models:
 ```
 # number of frames per clip
 --clip_len 16 \
+--frame-rate 15 \
 # allow for temporal jittering
 --clips_per_video 5 \
 --batch-size 24 \
@@ -97,6 +98,21 @@ Video resnet models:
 --val-crop-size 112 112
 ```
+### S3D
+The S3D model was trained similarly to the above but with the following changes on the default configuration:
+```
+--batch-size=12 --lr 0.2 --clip-len 64 --clips-per-video 5 --sync-bn \
+--train-resize-size 256 256 --train-crop-size 224 224 --val-resize-size 256 256 --val-crop-size 224 224
+```
+We used 64 GPUs to train the architecture.
+To estimate the validation statistics of the model, we run the reference script with the following configuration:
+```
+--batch-size=16 --test-only --clip-len 128 --clips-per-video 1
+```
 ### Additional video modelling resources
 - [Video Model Zoo](https://github.com/facebookresearch/VMZ)
...
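For context, a complete launch of the recipe above might look as follows. This is an illustrative sketch only, not part of the commit: the `train.py` script name, the `torchrun` launcher, the `--model` and `--data-path` arguments, and the single-node GPU count are assumptions about the torchvision video classification references; the actual run used 64 GPUs.
```
# Illustrative launch only; train.py, torchrun, --model and --data-path are assumptions,
# and the actual recipe ran on 64 GPUs rather than a single 8-GPU node.
torchrun --nproc_per_node=8 train.py \
    --data-path /path/to/kinetics400 \
    --model s3d \
    --batch-size 12 --lr 0.2 --clip-len 64 --clips-per-video 5 --sync-bn \
    --train-resize-size 256 256 --train-crop-size 224 224 \
    --val-resize-size 256 256 --val-crop-size 224 224
```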
@@ -104,7 +104,7 @@ class S3D(nn.Module):
     def __init__(
         self,
         num_classes: int = 400,
-        dropout: float = 0.0,
+        dropout: float = 0.2,
         norm_layer: Optional[Callable[..., torch.nn.Module]] = None,
     ) -> None:
         super().__init__()
@@ -153,28 +153,26 @@ class S3D(nn.Module):
 class S3D_Weights(WeightsEnum):
     KINETICS400_V1 = Weights(
-        url="https://download.pytorch.org/models/s3d-1bd8ae63.pth",
+        url="https://download.pytorch.org/models/s3d-d76dad2f.pth",
         transforms=partial(
             VideoClassification,
             crop_size=(224, 224),
             resize_size=(256, 256),
-            mean=(0.5, 0.5, 0.5),
-            std=(0.5, 0.5, 0.5),
         ),
         meta={
             "min_size": (224, 224),
             "min_temporal_size": 14,
             "categories": _KINETICS400_CATEGORIES,
-            "recipe": "https://github.com/pytorch/vision/pull/6412#issuecomment-1219687434",
+            "recipe": "https://github.com/pytorch/vision/tree/main/references/video_classification#s3d",
             "_docs": (
-                "The weights are ported from a community repository. The accuracies are estimated on clip-level "
+                "The weights aim to approximate the accuracy of the paper. The accuracies are estimated on clip-level "
                 "with parameters `frame_rate=15`, `clips_per_video=1`, and `clip_len=128`."
             ),
             "num_params": 8320048,
             "_metrics": {
                 "Kinetics-400": {
-                    "acc@1": 67.315,
-                    "acc@5": 87.593,
+                    "acc@1": 68.368,
+                    "acc@5": 88.050,
                 }
             },
         },
...
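After this change, the retrained weights are consumed through the usual multi-weight API. The snippet below is a minimal sketch of loading S3D with the updated `KINETICS400_V1` weights and running it on a dummy clip; the random input and the `s3d` builder call follow the public torchvision API rather than anything shown in this diff.
```
import torch
from torchvision.models.video import s3d, S3D_Weights

# Load S3D with the retrained Kinetics-400 weights introduced by this PR.
weights = S3D_Weights.KINETICS400_V1
model = s3d(weights=weights)
model.eval()

# The weights bundle their own preprocessing (resize to 256x256, center-crop to 224x224).
preprocess = weights.transforms()

# Dummy clip of 16 frames in (T, C, H, W) layout; min_temporal_size for S3D is 14.
clip = torch.randint(0, 256, (16, 3, 256, 256), dtype=torch.uint8)
batch = preprocess(clip).unsqueeze(0)  # -> (1, C, T, H, W)

with torch.inference_mode():
    prediction = model(batch).squeeze(0).softmax(0)
label = prediction.argmax().item()
print(weights.meta["categories"][label], prediction[label].item())
```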