- 20 Sep, 2021 4 commits
-
Ayaka Mikazuki authored
* Fix MT5 documentation. The abstract is incomplete.
* MT5 -> mT5
-
Chengjiang Li authored
-
Gunjan Chhablani authored
* Init FNet
* Update config
* Fix config
* Update model classes
* Update tokenizers to use sentencepiece
* Fix errors in model
* Fix defaults in config
* Remove position embedding type completely
* Fix typo and take only real numbers
* Fix type vocab size in configuration
* Add projection layer to embeddings
* Fix position ids bug in embeddings
* Add minor changes
* Add conversion script and remove CausalLM vestiges
* Fix conversion script
* Fix conversion script
* Remove CausalLM Test
* Update checkpoint names to dummy checkpoints
* Add tokenizer mapping
* Fix modeling file and corresponding tests
* Add tokenization test file
* Add PreTraining model test
* Make style and quality
* Make tokenization base tests work
* Update docs
* Add FastTokenizer tests
* Fix fast tokenizer special tokens
* Fix style and quality
* Remove load_tf_weights vestiges
* Add FNet to main README
* Fix configuration example indentation
* Comment tokenization slow test
* Fix style
* Add changes from review
* Fix style
* Remove bos and eos tokens from tokenizers
* Add tokenizer slow test, TPU transforms, NSP
* Add scipy check
* Add scipy availability check to test
* Fix tokenizer and use correct inputs
* Remove remaining TODOs
* Fix tests
* Fix tests
* Comment Fourier Test
* Uncomment Fourier Test
* Change to google checkpoint
* Add changes from review
* Fix activation function
* Fix model integration test
* Add more integration tests
* Add comparison steps to MLM integration test
* Fix style
* Add masked tokenization fix
* Improve mask tokenization fix
* Fix index docs
* Add changes from review
* Fix issue
* Fix failing import in test
* some more fixes
* correct fast tokenizer
* finalize
* make style
* Remove additional tokenization logic
* Set do_lower_case to False
* Allow keeping accents
* Fix tokenization test
* Fix FNet Tokenizer Fast
* fix tests
* make style
* Add tips to FNet docs
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
-
Suraj Patil authored
-
- 17 Sep, 2021 9 commits
-
calpt authored
-
Lysandre Debut authored
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Alessandro Suglia authored
Co-authored-by: Alessandro Suglia <asuglia@fb.com>
-
Alex Hedges authored
-
Matt authored
* Removed misfiring warnings
* Revert "Removed misfiring warnings"
  This reverts commit cea90de325056b9c1cbcda2bd2613a785c1639ce.
* Retain the warning, but only when the user actually overrides things
* Fix accidentally breaking just about every model on the hub simultaneously
* Style pass
-
Li-Huai (Allan) Lin authored
* Fix special tokens not correctly tokenized
* Add testing
* Fix
* Fix
* Use user workflows instead of directly assigning variables
* Enable test of fast tokenizers
* Update test of canine tokenizer
-
Patrick von Platen authored
* finish
* add test
* push
* remove unnecessary code
* up
* correct test
* Update src/transformers/training_args.py
-
Ibraheem Moosa authored
* Optimize Token Classification models for TPU
  According to the XLA documentation, XLA cannot handle masked indexing well, so the token classification models for BERT and others use an implementation based on `torch.where`, which works well on TPU. The ALBERT token classification model uses masked indexing, which causes performance issues on TPU. This PR fixes the issue by following the BERT implementation.
* Same fix for ELECTRA
* Same fix for LayoutLM
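
The difference between the two approaches described in this commit message can be sketched as follows. This is a minimal illustration and not the code from the PR: the function names `masked_indexing_loss` and `where_based_loss`, the tensor shapes, and the toy inputs are all assumptions made for the example. It shows that replacing boolean-mask indexing with `torch.where` plus the loss function's `ignore_index` keeps tensor shapes fixed while producing the same loss value.

```python
import torch
from torch.nn import CrossEntropyLoss


def masked_indexing_loss(logits, labels, attention_mask, num_labels):
    # Boolean-mask indexing: the size of the selected tensors depends on
    # how many positions are active, so the shapes vary with the data.
    loss_fct = CrossEntropyLoss()
    active = attention_mask.view(-1) == 1
    active_logits = logits.view(-1, num_labels)[active]
    active_labels = labels.view(-1)[active]
    return loss_fct(active_logits, active_labels)


def where_based_loss(logits, labels, attention_mask, num_labels):
    # torch.where variant: every tensor keeps its full, fixed shape.
    # Padded positions get the label `ignore_index`, which the loss skips.
    loss_fct = CrossEntropyLoss()
    active = attention_mask.view(-1) == 1
    ignore = torch.tensor(loss_fct.ignore_index).type_as(labels)
    active_labels = torch.where(active, labels.view(-1), ignore)
    return loss_fct(logits.view(-1, num_labels), active_labels)


# Hypothetical toy batch: 2 sequences of length 4 with 3 label classes.
torch.manual_seed(0)
num_labels = 3
logits = torch.randn(2, 4, num_labels)
labels = torch.randint(0, num_labels, (2, 4))
attention_mask = torch.tensor([[1, 1, 1, 0], [1, 1, 0, 0]])

loss_a = masked_indexing_loss(logits, labels, attention_mask, num_labels)
loss_b = where_based_loss(logits, labels, attention_mask, num_labels)
```

Both functions average the cross-entropy over the same active positions, so `loss_a` and `loss_b` should agree up to floating-point error. Only the second avoids data-dependent shapes, which is the property the commit message identifies as important for TPU.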
-
- 16 Sep, 2021 11 commits
-
Benjamin Davidson authored
* made tokenizer fully picklable
* remove whitespace
* added testcase
-
Sylvain Gugger authored
* Properly use test_fetcher for examples
* Fake example modification
* Fake modeling file modification
* Clean fake modifications
* Run example tests for any modification.
-
Stas Bekman authored
* [deepspeed] replaced deprecated init arg
* Trigger CI
-
Patrick von Platen authored
* correct
* add tests
* Update src/transformers/feature_extraction_sequence_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Matt authored
* Fix issue when labels are supplied as Numpy array instead of list
* Fix issue when labels are supplied as Numpy array instead of list
* Fix same issue in the `TokenClassification` data collator
* Style pass
-
Sylvain Gugger authored
-
Lysandre Debut authored
-
Matt authored
* Fix issue when labels are supplied as Numpy array instead of list
* Fix issue when labels are supplied as Numpy array instead of list
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Stas Bekman authored
-
- 15 Sep, 2021 4 commits
-
Patrick von Platen authored
* finish
* delete bogus file
* correct some stuff
* finish
* finish
-
elishowk authored
-
Suraj Patil authored
Update GPT Neo ONNX config to match the changes implied by the simplification of the local attention
Co-authored-by: Michael Benayoun <michael@huggingface.co>
-
Bhadresh Savani authored
-
- 14 Sep, 2021 8 commits
-
Sylvain Gugger authored
* Fix test_fetcher when setup is updated
* Remove example
-
elishowk authored
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Add checks to build cleaner model cards
* Address review comments
-
Bhadresh Savani authored
* added initial files
* fixes pipeline
* fixes style and quality
* fixes doc issue and positional encoding
* fixes layer norm and test
* fixes quality issue
* fixes code quality
* removed extra layer norm
* added layer norm back in encoder and decoder
* added more code copy quality checks
* update tests
* Apply suggestions from code review
* fix import
* fix test
Co-authored-by: patil-suraj <surajp815@gmail.com>
-
Suraj Patil authored
-
Sylvain Gugger authored
* Push to hub when saving checkpoints
* Add model card
* Revert partial model card
* Small fix for checkpoint
* Add tests
* Add documentation
* Fix tests
* Bump huggingface_hub
* Fix test
-
Avital Oliver authored
* Add long-overdue link to the Google TRC project
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
-
- 13 Sep, 2021 4 commits
-
Lysandre Debut authored
* Nightly CI torch
* Version
* Reformat
* Only subset Fix
* Revert
* Better formatting
* New channel
-
Patrick von Platen authored
-
SaulLu authored
* add imports
* Update docs/source/perplexity.rst
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
* [tokenizer] use use_auth_token for config
* args order
-