Otherwise, `rp_bucket` will always be on the CPU, and the lookup will fail if `self.relative_attention_bias` is on CUDA.
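
For illustration, here is a minimal sketch of the pattern, not the actual code from this change. It assumes `self.relative_attention_bias` is an `nn.Embedding`; the class name `RelativeBias`, its constructor arguments, and the clamp-based bucketing are hypothetical stand-ins. The point is only that `rp_bucket` is built on the same device as the embedding weights.

```python
import torch
import torch.nn as nn


class RelativeBias(nn.Module):
    """Hypothetical module; only `rp_bucket` and `relative_attention_bias`
    are names taken from the comment above."""

    def __init__(self, num_buckets: int = 32, num_heads: int = 4):
        super().__init__()
        self.num_buckets = num_buckets
        # Assumption: the bias table is an embedding indexed by bucket id.
        self.relative_attention_bias = nn.Embedding(num_buckets, num_heads)

    def forward(self, seq_len: int) -> torch.Tensor:
        # Take the device from the embedding weights so the indices follow
        # the module wherever it has been moved (CPU or CUDA).
        device = self.relative_attention_bias.weight.device
        positions = torch.arange(seq_len, device=device)
        relative = positions[None, :] - positions[:, None]
        # Simplified bucketing (shift and clamp), standing in for the real
        # bucketing function, which is not shown in the comment.
        rp_bucket = (relative + self.num_buckets // 2).clamp(0, self.num_buckets - 1)
        # Index tensor and embedding weights now share a device.
        return self.relative_attention_bias(rp_bucket)
```

If `torch.arange` were called without `device=...`, `rp_bucket` would be created on the CPU regardless of where the module lives, which is the mismatch described above.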