Unverified Commit 1fbaa3c1 authored by Anthony MOI, committed by GitHub

Fix tokenizers training in notebook (#10110)

parent 85395e49
@@ -229,7 +229,7 @@
 "\n",
 "# We initialize our trainer, giving him the details about the vocabulary we want to generate\n",
 "trainer = BpeTrainer(vocab_size=25000, show_progress=True, initial_alphabet=ByteLevel.alphabet())\n",
-"tokenizer.train(trainer, [\"big.txt\"])\n",
+"tokenizer.train(files=[\"big.txt\"], trainer=trainer)\n",
 "\n",
 "print(\"Trained vocab size: {}\".format(tokenizer.get_vocab_size()))"
 ]
...
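For reference, a minimal sketch of the notebook cell after this fix, updated for the tokenizers API where training files are passed first and the trainer as a keyword argument. Only the trainer configuration and the train call come from the diff above; the tokenizer setup lines (a BPE model with a byte-level pre-tokenizer) and the presence of a "big.txt" corpus are assumptions based on the surrounding notebook, not part of this hunk.

from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import ByteLevel
from tokenizers.trainers import BpeTrainer

# Assumed setup: a BPE model with a byte-level pre-tokenizer
# (the actual initialization lives in earlier notebook cells not shown in this hunk).
tokenizer = Tokenizer(BPE())
tokenizer.pre_tokenizer = ByteLevel()

# Trainer configured as in the diff: target vocabulary of 25,000 tokens,
# seeded with the byte-level alphabet.
trainer = BpeTrainer(vocab_size=25000, show_progress=True, initial_alphabet=ByteLevel.alphabet())

# Updated call: files are passed via the `files` argument and the trainer as a keyword.
tokenizer.train(files=["big.txt"], trainer=trainer)

print("Trained vocab size: {}".format(tokenizer.get_vocab_size()))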