Patrick von Platen authored
Add preprocessing step for transfo-xl tokenization to avoid tokenizing words followed by punctuation to <unk> (#2987)

* add preprocessing to add space before punctuation for transfo_xl
* improve warning messages
* make style
* compile regex at instantiation of tokenizer object
65d74c49
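
The idea in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the class name `PunctuationSpacer`, its `punctuation` parameter, and the exact regex are assumptions made for the example. It shows a pattern compiled once when the object is created and then reused to insert a space before punctuation that directly follows a word, so that a whitespace-based vocabulary lookup sees `Hello` and `,` rather than the out-of-vocabulary `Hello,`.

```python
import re


class PunctuationSpacer:
    """Hypothetical helper; the name and regex are illustrative assumptions,
    not the tokenizer code changed in the commit."""

    def __init__(self, punctuation=".,;:!?"):
        # Compile the pattern once at instantiation so that repeated calls
        # do not rebuild it for every input string.
        self._pattern = re.compile(
            r"(?<=\S)([{}])".format(re.escape(punctuation))
        )

    def __call__(self, text):
        # Insert a space before any punctuation mark that immediately
        # follows a non-whitespace character.
        return self._pattern.sub(r" \1", text)


spacer = PunctuationSpacer()
print(spacer("Hello, world!"))
```

Note that this simple pattern also splits numbers such as `3.14`; a real tokenizer would likely need to treat those cases separately.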