- 21 Dec, 2021 3 commits
-
-
Sylvain Gugger authored
-
Stas Bekman authored
* [examples/summarization] deal with None in data records * rewrite to use a simpler (slower) variant
-
Patrick von Platen authored
* up * load up * up
-
- 17 Dec, 2021 1 commit
-
-
Patrick von Platen authored
* up * add tokenizer * improve more * finish tokenizer * finish * adapt speech recognition script * adapt convert * more fixes * more fixes * update phonemizer wav2vec2 * better naming * fix more tests * more fixes swedish * correct tests * finish * improve script * remove file * up * lets get those 100 model architectures until the end of the month * make fix-copies * correct more * correct script * more fixes * more fixes * add to docs * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* replace assert * fix copies * fix docs * new try docs * boom boom * update * add phonemizer to audio tests * make fix-copies * up * upload models * some changes * Update tests/test_tokenization_wav2vec2_phoneme.py Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* more fixes * remove @
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
-
- 15 Dec, 2021 2 commits
- 13 Dec, 2021 1 commit
-
-
Josué Nascimento authored
-
- 09 Dec, 2021 2 commits
- 08 Dec, 2021 1 commit
-
-
Gaurang Tandon authored
* fix: verify jsonl in run_translation (#14660) * fix(run_translation.py): json/jsonl validation Both json and jsonl are to be accepted as valid jsonlines file extension * fix(run_translation.py): make black happy * Ran make style
-
- 06 Dec, 2021 2 commits
-
-
Julien Chaumond authored
* Replace outdated model tags with their now-canonical pipeline types * spam the CI till it's green
-
Kamal Raj authored
-
- 05 Dec, 2021 1 commit
-
-
(Bill) Yuchen Lin authored
-
- 22 Nov, 2021 2 commits
-
-
Nicholas Broad authored
* remove sum for list flattening * change to chain(*) * make chain object a list * delete empty lines per sgugger's suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Nicholas Broad <nicholas@nmbroad.com>
-
Stas Bekman authored
* add test for --config_overrides * remove unneeded parts of the test
-
- 18 Nov, 2021 2 commits
-
-
Patrick von Platen authored
Add more XLS-R training runs to the official examples
-
William Held authored
-
- 12 Nov, 2021 1 commit
-
-
Patrick von Platen authored
* improve some stuff * finish * correct last
-
- 09 Nov, 2021 1 commit
-
-
karthikrangasai authored
* Update postprocessing accordingly to use SQuAD metric. * Update assets accordingly based on SQuAD metrics. * Fix function naming error.
-
- 05 Nov, 2021 1 commit
-
-
Sylvain Gugger authored
-
- 01 Nov, 2021 1 commit
-
-
NielsRogge authored
-
- 28 Oct, 2021 5 commits
-
-
Patrick von Platen authored
-
Lysandre authored
-
Lysandre authored
-
Anton Lozhkov authored
-
Patrick von Platen authored
-
- 27 Oct, 2021 2 commits
-
-
Patrick von Platen authored
-
Patrick von Platen authored
* up * up * fix * up * Update examples/pytorch/test_xla_examples.py * correct labels * up * up * up * up * up * up
-
- 26 Oct, 2021 6 commits
-
-
Emanuel Huber authored
Updated masked-language modeling examples in pytorch with convention defined by #12789
-
Matthew Goldey authored
* specify the text column name in the error message * pluralize the word fields
-
Jangwon Park authored
-
Patrick von Platen authored
-
Patrick von Platen authored
[Speech Recognition] - Distributed training: Make sure vocab file removal and creation don't interfere (#14161) * up * better
-
Patrick von Platen authored
-
- 25 Oct, 2021 3 commits
-
-
Patrick von Platen authored
-
Patrick von Platen authored
-
karthikrangasai authored
* Add seq2seq example for QnA on SQuAD Dataset. * Changes from review - Fixing styling mistakes. * Added how to example in README, simplified the access to dataset's preprocess function. * Added tests for the seq2seq QA example. * Change dataset column name to fix tests. * Fix test command mistake. * Add missing argument 'ignore_pad_token_for_loss' from DataTrainingArguments. * Add missing argument 'num_beams' from DataTrainingArguments. * Fix processing of output predicted token ids so that tokenizer decode gets appropriate input. Updated assertion conditions on the tests.
-
- 21 Oct, 2021 3 commits
-
-
lee1jun authored
last line: "# limitations under the License." is missing
-
Anton Lozhkov authored
* Update SEW integration test tolerance * Add audio classification notebooks
-
Patrick von Platen authored
-