Commit e52d1258, authored by Ethan Perez, committed by GitHub

Fix RoBERTa/XLNet Pad Token in run_multiple_choice.py (#3631)

* Fix RoBERTa/XLNet Pad Token in run_multiple_choice.py

`convert_examples_to_features` sets `pad_token=0` by default, which is correct for BERT but incorrect for RoBERTa (`pad_token=1`) and XLNet (`pad_token=5`). I think the other arguments to `convert_examples_to_features` are correct, but it would be helpful if someone more familiar with this part of the codebase checked them.

* Simplifying change to match recent commits
parent 0ac33ddd
@@ -361,6 +361,7 @@ def load_and_cache_examples(args, task, tokenizer, evaluate=False, test=False):
             args.max_seq_length,
             tokenizer,
             pad_on_left=bool(args.model_type in ["xlnet"]),  # pad on the left for xlnet
+            pad_token=tokenizer.pad_token_id,
             pad_token_segment_id=tokenizer.pad_token_type_id,
         )
         if args.local_rank in [-1, 0]:
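
The effect of passing `pad_token` explicitly can be illustrated with a minimal, self-contained sketch. It does not call the real `convert_examples_to_features`; `pad_input_ids` is a hypothetical helper that only mimics the padding step, and the pad IDs in `PAD_TOKEN_IDS` are the values quoted in the commit message rather than values read from a tokenizer.

```python
from typing import List

# Pad token IDs as quoted in the commit message (assumed here, not read
# from a real tokenizer).
PAD_TOKEN_IDS = {"bert": 0, "roberta": 1, "xlnet": 5}


def pad_input_ids(
    input_ids: List[int],
    max_length: int,
    pad_token: int = 0,
    pad_on_left: bool = False,
) -> List[int]:
    """Hypothetical helper: pad input_ids up to max_length with pad_token."""
    padding = [pad_token] * (max_length - len(input_ids))
    return padding + input_ids if pad_on_left else input_ids + padding


ids = [10, 11, 12]

# Before the fix: the default pad_token=0 is used whatever the model is.
old_roberta = pad_input_ids(ids, 5)

# After the fix: the model's own pad ID is passed explicitly.
new_roberta = pad_input_ids(ids, 5, pad_token=PAD_TOKEN_IDS["roberta"])
new_xlnet = pad_input_ids(
    ids, 5, pad_token=PAD_TOKEN_IDS["xlnet"], pad_on_left=True
)

print(old_roberta)  # [10, 11, 12, 0, 0]
print(new_roberta)  # [10, 11, 12, 1, 1]
print(new_xlnet)    # [5, 5, 10, 11, 12]
```

With the default, the padded positions of a RoBERTa or XLNet input would carry ID 0, which is not the pad ID for those models. In the actual script the value comes from `tokenizer.pad_token_id`, so no per-model table is needed.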