Commit 6b0cce84 authored by Jingfei Du, committed by Facebook Github Bot

fix bug for masking prob (#758)

Summary:
Pull Request resolved: https://github.com/pytorch/fairseq/pull/758

Pull Request resolved: https://github.com/fairinternal/fairseq-py/pull/603

Fixed a typo in _mask_block of the masked LM dataset. Because of the typo, a masked token was never replaced with a random token, which should happen for 10% of the masked tokens.

Reviewed By: akinh

Differential Revision: D15492315

fbshipit-source-id: 1e03dc862e23a6543e51d7401c74608d366ba62d
parent 6b3a516f
@@ -166,7 +166,7 @@ class MaskedLMDataset(FairseqDataset):
 # replace with random token if probability is less than
 # masking_prob + random_token_prob (Eg: 0.9)
-elif rand < (self.masking_ratio + self.random_token_prob):
+elif rand < (self.masking_prob + self.random_token_prob):
 # sample random token from dictionary
 masked_sent[i] = (
 np.random.randint(
......
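To illustrate why the old condition never selected the random-token branch, here is a minimal standalone sketch, not the fairseq code itself. It assumes that the branch preceding the changed line is `rand < masking_prob` (suggested by the comment in the hunk but not shown in it), and it uses illustrative values masking_ratio = 0.15, masking_prob = 0.8, random_token_prob = 0.1. The names `classify` and `branch_counts` are hypothetical.

```python
def classify(rand, threshold_base, masking_prob=0.8, random_token_prob=0.1):
    """Return the branch taken for one draw `rand` in [0, 1).

    `threshold_base` is the term added to random_token_prob in the
    second condition: masking_ratio before the fix, masking_prob after.
    The first branch is an assumption; it is not visible in the diff.
    """
    if rand < masking_prob:
        return "mask"      # replace with the mask token
    elif rand < (threshold_base + random_token_prob):
        return "random"    # replace with a random dictionary token
    return "keep"          # leave the original token unchanged


def branch_counts(threshold_base, steps=1000):
    """Count branches over an evenly spaced grid of draws in [0, 1)."""
    counts = {"mask": 0, "random": 0, "keep": 0}
    for k in range(steps):
        rand = (k + 0.5) / steps  # midpoints avoid landing on a threshold
        counts[classify(rand, threshold_base)] += 1
    return counts


# Assumed illustrative values, not taken from the commit.
buggy = branch_counts(threshold_base=0.15)  # masking_ratio + random_token_prob = 0.25
fixed = branch_counts(threshold_base=0.8)   # masking_prob + random_token_prob = 0.9

print("before fix:", buggy)
print("after fix: ", fixed)
```

Under these assumed values the second threshold before the fix (0.25) is below the first one (0.8), so any draw that fails the first test also fails the second and the random-token branch is unreachable. After the fix, draws between 0.8 and 0.9 reach it.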