huaerkl / fairseq-data2vec_pytorch · Commits

Commit 72f5785f, authored Aug 15, 2023 by huaerkl

    v1.0

Pipeline #505: canceled (with stages). Changes: 508. Pipelines: 1.

Showing 20 changed files with 758 additions and 0 deletions (+758, -0).
examples/MMPT/projects/mtm/vlm/test_vtt.yaml                             +29 -0
examples/MMPT/projects/mtm/vlm/test_vttqa.yaml                           +29 -0
examples/MMPT/projects/mtm/vlm/test_youcook.yaml                         +31 -0
examples/MMPT/projects/mtm/vlm/test_youcookcap.yaml                      +32 -0
examples/MMPT/projects/mtm/vlm/vtt.yaml                                  +49 -0
examples/MMPT/projects/mtm/vlm/vttqa.yaml                                +47 -0
examples/MMPT/projects/mtm/vlm/youcook.yaml                              +47 -0
examples/MMPT/projects/mtm/vlm/youcookcap.yaml                           +45 -0
examples/MMPT/projects/retri/videoclip.yaml                              +10 -0
examples/MMPT/projects/retri/videoclip/coin_videoclip.yaml               +49 -0
examples/MMPT/projects/retri/videoclip/crosstask_videoclip.yaml          +55 -0
examples/MMPT/projects/retri/videoclip/how2.yaml                         +65 -0
examples/MMPT/projects/retri/videoclip/test_coin_videoclip.yaml          +33 -0
examples/MMPT/projects/retri/videoclip/test_coin_zs.yaml                 +33 -0
examples/MMPT/projects/retri/videoclip/test_crosstask_videoclip.yaml     +40 -0
examples/MMPT/projects/retri/videoclip/test_crosstask_zs_videoclip.yaml  +40 -0
examples/MMPT/projects/retri/videoclip/test_didemo_zs.yaml               +31 -0
examples/MMPT/projects/retri/videoclip/test_vtt_videoclip.yaml           +31 -0
examples/MMPT/projects/retri/videoclip/test_vtt_zs.yaml                  +31 -0
examples/MMPT/projects/retri/videoclip/test_vttqa_videoclip.yaml         +31 -0
examples/MMPT/projects/mtm/vlm/test_vtt.yaml (new file, mode 100644)

```yaml
slurm_config: big
task_type: local_predict
dataset:
  split: test
  video_processor: VideoProcessor
  aligner: DSAligner
  bert_name: bert-base-uncased
  meta_processor: MSRVTTMetaProcessor
  test_path: data/msrvtt/MSRVTT_JSFUSION_test.csv
  vfeat_dir: data/feat/feat_vtt_s3d
  text_processor: MSRVTTTextProcessor
  num_iso_layer: 12
  max_video_len: 32
  max_len: 96
fairseq:
  dataset:
    batch_size: 256
    valid_subset: test
    num_workers: 2
  common_eval:
    path: runs/mtm/vlm/vtt/checkpoint_last.pt
model:
  model_cls: MMFusionJoint
  mm_encoder_cls: MMBertForJoint
  use_seg_emb: true
eval:
  save_path: runs/mtm/vlm/vtt/eval
metric: RetrievalMetric
predictor: RetrievalPredictor
```
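These YAML trees are consumed as nested configs whose leaves are read with dotted access. As an illustration only (not MMPT's actual loader), a nested dict can be wrapped so keys read as attributes, similar in spirit to omegaconf-style configs; the sample values below are copied from test_vtt.yaml above:

```python
from types import SimpleNamespace

def to_namespace(d):
    """Recursively wrap nested dicts so config keys can be read as
    attributes, e.g. config.fairseq.dataset.batch_size."""
    if isinstance(d, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in d.items()})
    return d

# A few keys from test_vtt.yaml, inlined as a plain dict for illustration.
config = to_namespace({
    "task_type": "local_predict",
    "dataset": {"split": "test", "max_video_len": 32, "max_len": 96},
    "fairseq": {"dataset": {"batch_size": 256, "num_workers": 2}},
})

print(config.fairseq.dataset.batch_size)  # 256
```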
examples/MMPT/projects/mtm/vlm/test_vttqa.yaml (new file, mode 100644)

```yaml
slurm_config: big
task_type: local_predict
dataset:
  split: test
  video_processor: VideoProcessor
  aligner: MSRVTTQAAligner
  bert_name: bert-base-uncased
  meta_processor: MSRVTTQAMetaProcessor
  test_path: data/msrvtt-qa/MSR_MC_test.csv
  vfeat_dir: data/feat/feat_vtt_s3d
  text_processor: MSRVTTQATextProcessor
  num_iso_layer: 12
  max_video_len: 32
  max_len: 96
fairseq:
  dataset:
    batch_size: 256
    valid_subset: test
    num_workers: 2
  common_eval:
    path: runs/mtm/vlm/vttqa/checkpoint_last.pt
model:
  model_cls: MMFusionJoint
  mm_encoder_cls: MMBertForJoint
  use_seg_emb: true
eval:
  save_path: runs/mtm/vlm/vttqa/eval
metric: QAMetric
predictor: QAPredictor
```
examples/MMPT/projects/mtm/vlm/test_youcook.yaml (new file, mode 100644)

```yaml
slurm_config: big
task_type: local_predict
dataset:
  split: test
  video_processor: YoucookVideoProcessor
  aligner: DSAligner
  bert_name: bert-base-uncased
  meta_processor: YoucookMetaProcessor
  test_path: data/youcook/youcook_val.pkl
  trainval_annotation: data/youcook/youcookii_annotations_trainval.json
  use_annotation_text: true
  vfeat_dir: data/feat/feat_youcook_s3d
  text_processor: TextProcessor
  num_iso_layer: 12
  max_video_len: 32
  max_len: 96
fairseq:
  dataset:
    batch_size: 256
    valid_subset: test
    num_workers: 2
  common_eval:
    path: runs/mtm/vlm/youcook/checkpoint_last.pt
model:
  model_cls: MMFusionJoint
  mm_encoder_cls: MMBertForJoint
  use_seg_emb: true
eval:
  save_path: runs/mtm/vlm/youcook/eval
metric: RetrievalMetric
predictor: RetrievalPredictor
```
examples/MMPT/projects/mtm/vlm/test_youcookcap.yaml (new file, mode 100644)

```yaml
slurm_config: big
task_type: local_predict
dataset:
  split: test
  video_processor: YoucookVideoProcessor
  aligner: DSNLGAligner
  bert_name: bert-base-uncased
  meta_processor: YoucookNLGMetaProcessor
  test_path: data/youcook/val_list.txt
  trainval_annotation: data/youcook/youcookii_annotations_trainval.json
  vfeat_dir: data/feat/feat_youcook_s3d
  text_processor: NLGTextProcessor
  max_video_len: 32
  max_len: 96
fairseq:
  dataset:
    batch_size: 256
    valid_subset: test
    num_workers: 2
  common_eval:
    path: runs/mtm/vlm/youcookcap/checkpoint_best.pt
model:
  model_cls: MMFusionNLG
  mm_encoder_cls: MMBertForNLG
  max_decode_length: 24
  use_seg_emb: true
eval:
  save_path: runs/mtm/vlm/youcookcap/eval
metric: NLGMetric
predictor: NLGPredictor
gen_param:
  num_beams: 5
```
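Captioning evaluation decodes with gen_param.num_beams: 5. As a toy illustration of beam search, the sketch below keeps the num_beams best partial sequences over per-step log-prob dicts; the real decoder (MMFusionNLG, up to max_decode_length 24) scores continuations with the model instead of a fixed table:

```python
def beam_search(step_scores, num_beams=5):
    """Toy beam search: keep the `num_beams` highest-scoring partial
    sequences at each step. `step_scores` is a list of per-step
    {token: log_prob} dicts (hypothetical input format)."""
    beams = [([], 0.0)]  # (tokens, cumulative log-prob)
    for scores in step_scores:
        candidates = [
            (tokens + [tok], lp + s)
            for tokens, lp in beams
            for tok, s in scores.items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams

# Greedy would pick 'a' then 'c'; the beam recovers the better 'a','d' path.
steps = [{"a": -0.1, "b": -2.0}, {"c": -0.5, "d": -0.4}]
best_tokens, best_score = beam_search(steps, num_beams=5)[0]
print(best_tokens)  # ['a', 'd']
```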
examples/MMPT/projects/mtm/vlm/vtt.yaml (new file, mode 100644)

```yaml
dataset:
  video_processor: VideoProcessor
  bert_name: bert-base-uncased
  meta_processor: MSRVTTMetaProcessor
  train_path: data/msrvtt/MSRVTT_train.csv
  jsfusion_path: data/msrvtt/MSRVTT_JSFUSION_test.csv
  full_test_path: data/msrvtt/MSRVTT_FULL_test.csv
  dup: 20
  val_path: data/msrvtt/MSRVTT_JSFUSION_test.csv
  vfeat_dir: data/feat/feat_vtt_s3d
  text_processor: MSRVTTTextProcessor
  json_path: data/msrvtt/MSRVTT_data.json
  aligner: DSAligner
  num_iso_layer: 12
  max_video_len: 32
  max_len: 96
fairseq:
  common:
    tensorboard_logdir: run
    log_interval: 1000
    fp16: true
  dataset:
    num_workers: 4
    batch_size: 256
  optimization:
    lr:
    - 5.0e-05
    clip_norm: 2.0
    optimizer: adam
    adam_betas: (0.9, 0.98)
    lr_scheduler: polynomial_decay
    total_num_update: 1000000
    warmup_updates: 122
    weight_decay: 0.0
    ddp_backend: no_c10d
    max_epoch: 10
  checkpoint:
    restore_file: runs/mtm/vlm/checkpoint_best.pt
    reset_optimizer: true
    reset_dataloader: true
    reset_meters: true
    save_dir: runs/mtm/vlm/vtt
task_type: sweep_small
model:
  model_cls: MMFusionJoint
  mm_encoder_cls: MMBertForJoint
  use_seg_emb: true
loss:
  loss_cls: T2VContraLoss
```
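The optimization block pairs the polynomial_decay scheduler with 122 warmup updates and a nominal 1,000,000 total updates. A sketch of that schedule, assuming fairseq's defaults of power 1.0 and end learning rate 0.0 (the actual scheduler implementation may differ in edge cases):

```python
def poly_decay_lr(step, lr=5.0e-05, warmup_updates=122,
                  total_num_update=1_000_000, end_lr=0.0, power=1.0):
    """Approximate polynomial-decay schedule: linear warmup to `lr`
    over `warmup_updates`, then polynomial decay toward `end_lr` at
    `total_num_update`. power=1.0 and end_lr=0.0 are assumed defaults."""
    if step < warmup_updates:
        return lr * step / warmup_updates
    frac = (total_num_update - step) / (total_num_update - warmup_updates)
    return (lr - end_lr) * frac ** power + end_lr

print(poly_decay_lr(0))          # 0.0 (start of warmup)
print(poly_decay_lr(122))        # 5e-05 (peak, warmup finished)
print(poly_decay_lr(1_000_000))  # 0.0 (fully decayed)
```

With total_num_update set this high relative to max_epoch, the learning rate stays close to its peak for the entire fine-tuning run.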
examples/MMPT/projects/mtm/vlm/vttqa.yaml (new file, mode 100644)

```yaml
dataset:
  video_processor: VideoProcessor
  bert_name: bert-base-uncased
  meta_processor: MSRVTTMetaProcessor
  train_path: data/msrvtt/MSRVTT_train.csv
  dup: 20
  val_path: data/msrvtt/MSRVTT_JSFUSION_test.csv
  vfeat_dir: data/feat/feat_vtt_s3d
  text_processor: MSRVTTTextProcessor
  json_path: data/msrvtt/MSRVTT_data.json
  aligner: DSAligner
  num_iso_layer: 12
  max_video_len: 32
  max_len: 96
fairseq:
  common:
    tensorboard_logdir: run
    log_interval: 1000
    fp16: true
  dataset:
    num_workers: 4
    batch_size: 128
  optimization:
    lr:
    - 5.0e-05
    clip_norm: 2.0
    optimizer: adam
    adam_betas: (0.9, 0.98)
    lr_scheduler: polynomial_decay
    total_num_update: 1000000
    warmup_updates: 122
    weight_decay: 0.0
    ddp_backend: no_c10d
    max_epoch: 5
  checkpoint:
    restore_file: runs/mtm/vlm/checkpoint_best.pt
    reset_optimizer: true
    reset_dataloader: true
    reset_meters: true
    save_dir: runs/mtm/vlm/vttqa
task_type: sweep_small
model:
  model_cls: MMFusionJoint
  mm_encoder_cls: MMBertForJoint
  use_seg_emb: true
loss:
  loss_cls: V2TContraLoss
```
examples/MMPT/projects/mtm/vlm/youcook.yaml (new file, mode 100644)

```yaml
dataset:
  video_processor: YoucookVideoProcessor
  bert_name: bert-base-uncased
  meta_processor: YoucookMetaProcessor
  train_path: data/youcook/youcook_train.pkl
  val_path: data/youcook/youcook_val.pkl
  trainval_annotation: data/youcook/youcookii_annotations_trainval.json
  use_annotation_text: true
  vfeat_dir: data/feat/feat_youcook_s3d
  text_processor: TextProcessor
  aligner: DSAligner
  num_iso_layer: 12
  max_video_len: 32
  max_len: 96
fairseq:
  common:
    tensorboard_logdir: run
    log_interval: 1000
    fp16: true
  dataset:
    num_workers: 4
    batch_size: 128
  optimization:
    lr:
    - 5.0e-05
    clip_norm: 2.0
    optimizer: adam
    adam_betas: (0.9, 0.98)
    lr_scheduler: polynomial_decay
    total_num_update: 1000000
    warmup_updates: 122
    weight_decay: 0.0
    ddp_backend: no_c10d
    max_epoch: 10
  checkpoint:
    restore_file: runs/mtm/vlm/checkpoint_best.pt
    reset_optimizer: true
    reset_dataloader: true
    reset_meters: true
    save_dir: runs/mtm/vlm/youcook
task_type: sweep_small
model:
  model_cls: MMFusionJoint
  mm_encoder_cls: MMBertForJoint
  use_seg_emb: true
loss:
  loss_cls: T2VContraLoss
```
examples/MMPT/projects/mtm/vlm/youcookcap.yaml (new file, mode 100644)

```yaml
dataset:
  video_processor: YoucookVideoProcessor
  bert_name: bert-base-uncased
  meta_processor: YoucookNLGMetaProcessor
  train_path: data/youcook/train_list.txt
  val_path: data/youcook/val_list.txt
  trainval_annotation: data/youcook/youcookii_annotations_trainval.json
  vfeat_dir: data/feat/feat_youcook_s3d
  text_processor: NLGTextProcessor
  aligner: DSNLGAligner
  max_video_len: 32
  max_len: 96
fairseq:
  common:
    tensorboard_logdir: run
    log_interval: 1000
    fp16: true
  dataset:
    num_workers: 4
    batch_size: 128
  optimization:
    lr:
    - 5.0e-05
    clip_norm: 2.0
    optimizer: adam
    adam_betas: (0.9, 0.98)
    lr_scheduler: polynomial_decay
    total_num_update: 1000000
    warmup_updates: 122
    weight_decay: 0.0
    ddp_backend: no_c10d
    max_epoch: 10
  checkpoint:
    restore_file: runs/mtm/vlm/checkpoint_best.pt
    reset_optimizer: true
    reset_dataloader: true
    reset_meters: true
    save_dir: runs/mtm/vlm/youcookcap
task_type: sweep_small
model:
  model_cls: MMFusionNLG
  mm_encoder_cls: MMBertForNLG
  use_seg_emb: true
loss:
  loss_cls: NLGLoss
```
examples/MMPT/projects/retri/videoclip.yaml (new file, mode 100644)

```yaml
includes: projects/retri/videoretri.yaml
project_dir: retri/videoclip
task_group:
  pretrain:
    model:
      model_cls: MMFusionSeparate
      mm_encoder_cls:
      video_encoder_cls: MMBertForEncoder
      text_encoder_cls: BertModel
      num_hidden_video_layers: 6
```
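videoclip.yaml pulls in a base config via includes and overrides part of its task_group. A sketch of how such an include-then-override merge can behave (the base values below are hypothetical placeholders, and MMPT's real merge logic may differ):

```python
def deep_merge(base, override):
    """Recursively merge `override` into `base`; leaf values in
    `override` win, untouched branches of `base` survive."""
    merged = dict(base)
    for key, value in override.items():
        if key in merged and isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Hypothetical base (standing in for projects/retri/videoretri.yaml)
# and the override from videoclip.yaml above.
base = {"task_group": {"pretrain": {
    "model": {"model_cls": "BASE_MODEL"},
    "loss": {"loss_cls": "MMContraLoss"},
}}}
override = {"task_group": {"pretrain": {"model": {
    "model_cls": "MMFusionSeparate",
    "num_hidden_video_layers": 6,
}}}}
merged = deep_merge(base, override)
print(merged["task_group"]["pretrain"]["model"]["model_cls"])  # MMFusionSeparate
```

Note the override replaces model_cls but leaves the base's loss block intact.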
examples/MMPT/projects/retri/videoclip/coin_videoclip.yaml (new file, mode 100644)

```yaml
dataset:
  video_processor: VideoProcessor
  bert_name: bert-base-uncased
  meta_processor: COINActionSegmentationMetaProcessor
  train_path: data/coin/COIN.json
  val_path: data/coin/COIN.json
  vfeat_dir: data/feat/feat_coin_s3d
  text_processor: COINActionSegmentationTextProcessor
  aligner: COINActionSegmentationAligner
  num_iso_layer: 12
  sliding_window: 8
  sliding_window_size: 32
  max_video_len: 32
  max_len: 96
fairseq:
  common:
    tensorboard_logdir: run
    log_interval: 1000
    fp16: true
  dataset:
    num_workers: 4
    batch_size: 1
  optimization:
    lr:
    - 5.0e-05
    clip_norm: 2.0
    optimizer: adam
    adam_betas: (0.9, 0.98)
    lr_scheduler: polynomial_decay
    total_num_update: 1000000
    warmup_updates: 122
    weight_decay: 0.0
    ddp_backend: no_c10d
    max_epoch: 8
  checkpoint:
    restore_file: runs/retri/videoclip/checkpoint_best.pt
    reset_optimizer: true
    reset_dataloader: true
    reset_meters: true
    save_dir: runs/retri/videoclip/coin
task_type: sweep_big
model:
  model_cls: MMFusionSeparateActionSegmentation
  mm_encoder_cls: null
  video_encoder_cls: MMBertForTokenClassification
  text_encoder_cls: BertModel
  num_hidden_video_layers: 6
loss:
  loss_cls: CrossEntropy
```
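This fine-tuning config slides a window of 32 feature frames with stride 8 (sliding_window / sliding_window_size) across each video for action segmentation. The window arithmetic can be sketched as follows; the actual MMPT processor may handle short videos and tails differently:

```python
def sliding_windows(num_frames, window_size=32, stride=8):
    """Enumerate (start, end) clip windows over a feature sequence of
    `num_frames` frames, stepping by `stride`, and cover the tail if
    the last regular window stops short of the end."""
    starts = range(0, max(num_frames - window_size, 0) + 1, stride)
    windows = [(s, min(s + window_size, num_frames)) for s in starts]
    if windows and windows[-1][1] < num_frames:
        windows.append((num_frames - window_size, num_frames))
    return windows

print(sliding_windows(80))
# [(0, 32), (8, 40), (16, 48), (24, 56), (32, 64), (40, 72), (48, 80)]
```

With stride smaller than the window size, consecutive windows overlap, so every frame is classified in several contexts.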
examples/MMPT/projects/retri/videoclip/crosstask_videoclip.yaml (new file, mode 100644)

```yaml
dataset:
  video_processor: CrossTaskVideoProcessor
  bert_name: bert-base-uncased
  meta_processor: CrossTaskMetaProcessor
  train_path: data/crosstask/crosstask_release/videos.csv
  train_csv_path: data/crosstask/crosstask_release/videos.csv
  val_path: data/crosstask/crosstask_release/videos_val.csv
  val_csv_path: data/crosstask/crosstask_release/videos_val.csv
  primary_path: data/crosstask/crosstask_release/tasks_primary.txt
  related_path: data/crosstask/crosstask_release/tasks_related.txt
  vfeat_dir: data/feat/feat_crosstask_s3d
  annotation_path: data/crosstask/crosstask_release/annotations
  n_train: 30
  text_processor: CrossTaskTextProcessor
  aligner: CrossTaskAligner
  num_iso_layer: 12
  sliding_window: 16
  sliding_window_size: 32
  max_video_len: 32
  max_len: 96
fairseq:
  common:
    tensorboard_logdir: run
    log_interval: 1000
    fp16: true
  dataset:
    num_workers: 4
    batch_size: 1
  optimization:
    lr:
    - 5.0e-05
    clip_norm: 2.0
    optimizer: adam
    adam_betas: (0.9, 0.98)
    lr_scheduler: polynomial_decay
    total_num_update: 1000000
    warmup_updates: 122
    weight_decay: 0.0
    ddp_backend: no_c10d
    max_epoch: 5
  checkpoint:
    restore_file: runs/retri/videoclip/checkpoint_best.pt
    reset_optimizer: true
    reset_dataloader: true
    reset_meters: true
    save_dir: runs/retri/videoclip/crosstask
task_type: sweep_small
model:
  model_cls: MMFusionSeparateActionLocalization
  mm_encoder_cls: null
  video_encoder_cls: MMBertForEncoder
  text_encoder_cls: BertModel
  num_hidden_video_layers: 6
loss:
  loss_cls: BCE
```
examples/MMPT/projects/retri/videoclip/how2.yaml (new file, mode 100644)

```yaml
dataset:
  video_processor: ShardedVideoRetriVideoProcessor
  bert_name: bert-base-uncased
  meta_processor: ShardedHow2VideoRetriMetaProcessor
  train_path: data/how2/how2_s3d_train.lst
  val_path: data/how2/how2_s3d_val.lst
  vfeat_dir: data/feat/feat_how2_s3d_shard_small
  text_processor: ShardedVideoRetriTextProcessor
  tfeat_dir: data/feat/feat_how2_s3d_shard_small/raw_caption_dedup.bert-base-uncased.
  aligner: VideoRetriOverlappedAligner
  subsampling: 1
  sampled_min_len: 8
  sampled_max_len: 64
  max_video_len: 32
  max_len: 96
  lazy_vfeat_mask: true
  mfm_probability: 0.15
  mlm_probability: 0.15
  mm_prob: 0.5
  sampled_video_min_len: 3
  sampled_video_max_len: 32
  num_video_per_batch: 32
  clip_per_video: 16
fairseq:
  common:
    tensorboard_logdir: run
    log_interval: 1000
    fp16: true
  dataset:
    num_workers: 4
    batch_size: 1
  optimization:
    lr:
    - 5.0e-05
    clip_norm: 2.0
    optimizer: adam
    adam_betas: (0.9, 0.98)
    lr_scheduler: polynomial_decay
    total_num_update: 1000000
    warmup_updates: 1000
    weight_decay: 0.0
    ddp_backend: no_c10d
    max_epoch: 25
  checkpoint:
    save_dir: runs/retri/videoclip
    save_interval_updates: 1024
    keep_interval_updates: 2
    keep_last_epochs: 30
task_type: sweep_big
slurm_config: big
eval:
  save_path: runs/retri/videoclip
model:
  model_cls: MMFusionSeparate
  mm_encoder_cls: null
  video_encoder_cls: MMBertForEncoder
  text_encoder_cls: BertModel
  num_hidden_video_layers: 6
loss:
  loss_cls: MMContraLoss
task: VideoRetriTask
retri_epoch: 1
vectorpool_cls: VideoVectorPool
retriever_cls: VectorRetriever
num_cands: 64
```
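Pretraining on HowTo100M uses MMContraLoss, a symmetric video-to-text contrastive objective over the clips in a batch. A minimal pure-Python sketch of such a loss (the real implementation operates on torch tensors over retrieval-augmented batches; this only illustrates the symmetric InfoNCE form, with unscaled dot-product logits as an assumption):

```python
import math

def mm_contra_loss(video_emb, text_emb):
    """Symmetric InfoNCE: for each video the positive is the text at
    the same index, and vice versa; average the two directions."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    scores = [[dot(v, t) for t in text_emb] for v in video_emb]  # logits

    def nce(rows):
        # Cross-entropy with the diagonal entry as the target class.
        loss = 0.0
        for i, row in enumerate(rows):
            denom = sum(math.exp(s) for s in row)
            loss += -math.log(math.exp(row[i]) / denom)
        return loss / len(rows)

    cols = [list(col) for col in zip(*scores)]  # text -> video direction
    return 0.5 * (nce(scores) + nce(cols))

# Two toy pairs: matched embeddings score highest on the diagonal,
# so the loss falls below the chance level of log(2).
video = [[1.0, 0.0], [0.0, 1.0]]
text = [[0.9, 0.1], [0.1, 0.9]]
print(mm_contra_loss(video, text))
```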
examples/MMPT/projects/retri/videoclip/test_coin_videoclip.yaml (new file, mode 100644)

```yaml
slurm_config: big
task_type: local_predict
dataset:
  split: test
  video_processor: VideoProcessor
  aligner: COINActionSegmentationAligner
  bert_name: bert-base-uncased
  test_path: data/coin/COIN.json
  meta_processor: COINActionSegmentationMetaProcessor
  vfeat_dir: data/feat/feat_coin_s3d
  text_processor: COINActionSegmentationTextProcessor
  num_iso_layer: 12
  sliding_window: 16
  sliding_window_size: 32
  max_video_len: 32
  max_len: 96
fairseq:
  dataset:
    batch_size: 1
    valid_subset: test
    num_workers: 2
  common_eval:
    path: runs/retri/videoclip/coin/checkpoint_best.pt
model:
  model_cls: MMFusionSeparateActionSegmentation
  mm_encoder_cls: null
  video_encoder_cls: MMBertForTokenClassification
  text_encoder_cls: BertModel
  num_hidden_video_layers: 6
eval:
  save_path: runs/retri/videoclip/coin/eval
metric: COINActionSegmentationMetric
predictor: COINPredictor
```
examples/MMPT/projects/retri/videoclip/test_coin_zs.yaml (new file, mode 100644)

```yaml
slurm_config: big
task_type: local_predict
dataset:
  split: test
  video_processor: VideoProcessor
  aligner: COINActionSegmentationAligner
  bert_name: bert-base-uncased
  test_path: data/coin/COIN.json
  meta_processor: COINActionSegmentationMetaProcessor
  vfeat_dir: data/feat/feat_coin_s3d
  text_processor: COINActionSegmentationTextProcessor
  num_iso_layer: 12
  sliding_window: 16
  sliding_window_size: 32
  max_video_len: 32
  max_len: 96
fairseq:
  dataset:
    batch_size: 1
    valid_subset: test
    num_workers: 2
  common_eval:
    path: runs/retri/videoclip/checkpoint_best.pt
model:
  model_cls: MMFusionSeparate
  mm_encoder_cls: null
  video_encoder_cls: MMBertForEncoder
  text_encoder_cls: BertModel
  num_hidden_video_layers: 6
eval:
  save_path: runs/retri/videoclip/coin_zs/eval
metric: COINActionSegmentationMetric
predictor: COINZSPredictor
```
examples/MMPT/projects/retri/videoclip/test_crosstask_videoclip.yaml (new file, mode 100644)

```yaml
slurm_config: big
task_type: local_predict
dataset:
  split: test
  video_processor: CrossTaskVideoProcessor
  aligner: CrossTaskAligner
  bert_name: bert-base-uncased
  meta_processor: CrossTaskMetaProcessor
  test_path: data/crosstask/crosstask_release/videos_val.csv
  train_csv_path: data/crosstask/crosstask_release/videos.csv
  val_path: data/crosstask/crosstask_release/videos_val.csv
  val_csv_path: data/crosstask/crosstask_release/videos_val.csv
  primary_path: data/crosstask/crosstask_release/tasks_primary.txt
  related_path: data/crosstask/crosstask_release/tasks_related.txt
  vfeat_dir: data/feat/feat_crosstask_s3d
  annotation_path: data/crosstask/crosstask_release/annotations
  n_train: 30
  text_processor: CrossTaskTextProcessor
  num_iso_layer: 12
  sliding_window: 16
  sliding_window_size: 32
  max_video_len: 32
  max_len: 96
fairseq:
  dataset:
    batch_size: 1
    valid_subset: test
    num_workers: 2
  common_eval:
    path: runs/retri/videoclip/crosstask/checkpoint_best.pt
model:
  model_cls: MMFusionSeparateActionLocalization
  mm_encoder_cls: null
  video_encoder_cls: MMBertForEncoder
  text_encoder_cls: BertModel
  num_hidden_video_layers: 6
eval:
  save_path: runs/retri/videoclip/crosstask/eval
metric: CrossTaskMetric
predictor: CrossTaskPredictor
```
examples/MMPT/projects/retri/videoclip/test_crosstask_zs_videoclip.yaml (new file, mode 100644)

```yaml
slurm_config: big
task_type: local_predict
dataset:
  split: test
  video_processor: CrossTaskVideoProcessor
  aligner: CrossTaskAligner
  bert_name: bert-base-uncased
  meta_processor: CrossTaskMetaProcessor
  test_path: data/crosstask/crosstask_release/videos_val.csv
  train_csv_path: data/crosstask/crosstask_release/videos.csv
  val_path: data/crosstask/crosstask_release/videos_val.csv
  val_csv_path: data/crosstask/crosstask_release/videos_val.csv
  primary_path: data/crosstask/crosstask_release/tasks_primary.txt
  related_path: data/crosstask/crosstask_release/tasks_related.txt
  vfeat_dir: data/feat/feat_crosstask_s3d
  annotation_path: data/crosstask/crosstask_release/annotations
  n_train: 30
  text_processor: CrossTaskTextProcessor
  num_iso_layer: 12
  sliding_window: 16
  sliding_window_size: 32
  max_video_len: 32
  max_len: 96
fairseq:
  dataset:
    batch_size: 1
    valid_subset: test
    num_workers: 2
  common_eval:
    path: runs/retri/videoclip/checkpoint_best.pt
model:
  model_cls: MMFusionSeparateActionLocalization
  mm_encoder_cls: null
  video_encoder_cls: MMBertForEncoder
  text_encoder_cls: BertModel
  num_hidden_video_layers: 6
eval:
  save_path: runs/retri/videoclip/crosstask_zs/eval
metric: CrossTaskMetric
predictor: CrossTaskPredictor
```
examples/MMPT/projects/retri/videoclip/test_didemo_zs.yaml (new file, mode 100644)

```yaml
slurm_config: big
task_type: local_predict
dataset:
  split: test
  video_processor: VideoProcessor
  aligner: DiDeMoAligner
  bert_name: bert-base-uncased
  meta_processor: DiDeMoMetaProcessor
  test_path: data/didemo/test_data.json
  vfeat_dir: data/feat/feat_didemo_s3d
  text_processor: DiDeMoTextProcessor
  num_iso_layer: 12
  max_video_len: 32
  max_len: 96
fairseq:
  dataset:
    batch_size: 256
    valid_subset: test
    num_workers: 2
  common_eval:
    path: runs/retri/videoclip/checkpoint_best.pt
model:
  model_cls: MMFusionSeparate
  mm_encoder_cls: null
  video_encoder_cls: MMBertForEncoder
  text_encoder_cls: BertModel
  num_hidden_video_layers: 6
eval:
  save_path: runs/retri/videoclip/didemo_zs/eval
metric: DiDeMoMetric
predictor: DiDeMoPredictor
```
examples/MMPT/projects/retri/videoclip/test_vtt_videoclip.yaml (new file, mode 100644)

```yaml
slurm_config: big
task_type: local_predict
dataset:
  split: test
  video_processor: VideoProcessor
  aligner: DSAligner
  bert_name: bert-base-uncased
  meta_processor: MSRVTTMetaProcessor
  test_path: data/msrvtt/MSRVTT_JSFUSION_test.csv
  vfeat_dir: data/feat/feat_vtt_s3d
  text_processor: MSRVTTTextProcessor
  num_iso_layer: 12
  max_video_len: 32
  max_len: 96
fairseq:
  dataset:
    batch_size: 256
    valid_subset: test
    num_workers: 2
  common_eval:
    path: runs/retri/videoclip/vtt/checkpoint_last.pt
model:
  model_cls: MMFusionSeparate
  mm_encoder_cls: null
  video_encoder_cls: MMBertForEncoder
  text_encoder_cls: BertModel
  num_hidden_video_layers: 6
eval:
  save_path: runs/retri/videoclip/vtt/eval
metric: RetrievalMetric
predictor: RetrievalPredictor
```
examples/MMPT/projects/retri/videoclip/test_vtt_zs.yaml (new file, mode 100644)

```yaml
slurm_config: big
task_type: local_predict
dataset:
  split: test
  video_processor: VideoProcessor
  aligner: DSAligner
  bert_name: bert-base-uncased
  meta_processor: MSRVTTMetaProcessor
  test_path: data/msrvtt/MSRVTT_JSFUSION_test.csv
  vfeat_dir: data/feat/feat_vtt_s3d
  text_processor: MSRVTTTextProcessor
  num_iso_layer: 12
  max_video_len: 32
  max_len: 96
fairseq:
  dataset:
    batch_size: 256
    valid_subset: test
    num_workers: 2
  common_eval:
    path: runs/retri/videoclip/checkpoint_best.pt
model:
  model_cls: MMFusionSeparate
  mm_encoder_cls: null
  video_encoder_cls: MMBertForEncoder
  text_encoder_cls: BertModel
  num_hidden_video_layers: 6
eval:
  save_path: runs/retri/videoclip/vtt_zs/eval
metric: RetrievalMetric
predictor: RetrievalPredictor
```
examples/MMPT/projects/retri/videoclip/test_vttqa_videoclip.yaml (new file, mode 100644)

```yaml
slurm_config: big
task_type: local_predict
dataset:
  split: test
  video_processor: VideoProcessor
  aligner: MSRVTTQAAligner
  bert_name: bert-base-uncased
  meta_processor: MSRVTTQAMetaProcessor
  test_path: data/msrvtt-qa/MSR_MC_test.csv
  vfeat_dir: data/feat/feat_vtt_s3d
  text_processor: MSRVTTQATextProcessor
  num_iso_layer: 12
  max_video_len: 32
  max_len: 96
fairseq:
  dataset:
    batch_size: 256
    valid_subset: test
    num_workers: 2
  common_eval:
    path: runs/retri/videoclip/vttqa/checkpoint_last.pt
model:
  model_cls: MMFusionSeparate
  mm_encoder_cls: null
  video_encoder_cls: MMBertForEncoder
  text_encoder_cls: BertModel
  num_hidden_video_layers: 6
eval:
  save_path: runs/retri/videoclip/vttqa/eval
metric: QAMetric
predictor: QAPredictor
```