- 31 May, 2021 1 commit
-
-
Lysandre authored
-
- 28 May, 2021 3 commits
-
-
Lysandre Debut authored
-
Jayendra authored
* Added logic to return attention from flax-bert model and added test cases to check that * Added new line at the end of file to test_modeling_flax_common.py * fixing code style * Fixing Roberta and Electra models too from copying bert * Added temporary hack to not run test_attention_outputs for FlaxGPT2 * Returning attention weights from GPT2 and changed the tests accordingly. * last fixes * bump flax dependency Co-authored-by:
jayendra <jayendra@infocusp.in> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
Bhadresh Savani authored
* seq classification changes * fix tests
-
- 27 May, 2021 3 commits
-
-
Nicolas Patry authored
* Adding new argument `max_new_tokens` for generate. This is a proposal to add a new argument `max_new_tokens` to `generate`. This includes a `MaxNewTokensCriteria` that enables callers that don't know the token length ahead of time (like pipeline callers) to manage the length of their generated output more easily. * Adding a test for the user warning when both `max_length` and `max_new_tokens` are used together. * Removed redundant `no_grad`.
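The idea behind a "max new tokens" stopping rule can be sketched in plain Python. This is an illustration only, not the library's `MaxNewTokensCriteria`: the class `MaxNewTokensSketch` and the helper `toy_generate` are hypothetical names, and the loop appends a placeholder token instead of sampling from a model.

```python
class MaxNewTokensSketch:
    """Hypothetical stopping rule: stop once `max_new_tokens` tokens were added."""

    def __init__(self, start_length, max_new_tokens):
        # The caller only states how many tokens to add; the absolute
        # limit is derived from the prompt length.
        self.max_length = start_length + max_new_tokens

    def __call__(self, current_length):
        return current_length >= self.max_length


def toy_generate(prompt_ids, max_new_tokens):
    """Toy generation loop (assumption: token 0 stands in for a sampled token)."""
    criterion = MaxNewTokensSketch(len(prompt_ids), max_new_tokens)
    output = list(prompt_ids)
    while not criterion(len(output)):
        output.append(0)  # placeholder for a model-sampled token
    return output


generated = toy_generate([11, 12, 13], max_new_tokens=4)
print(len(generated))
```

The point of the design is that the caller never needs to know the prompt length in tokens, which is the situation the commit message describes for pipeline callers.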
-
Josh Tanner authored
* rebuild deepspeed config for hyperparameter search * reformat code to fix style issues
-
Patrick von Platen authored
-
- 26 May, 2021 7 commits
-
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * add * indexing * correct a couple of tests * fix tests * add logits processor * finish top_k, top_p, temp * add docs * correct flax prng key default * improve generate * add generation docs * add docs * make style * revert model outputs change * make style * correct typo * fix tests * fix slow test * add raise * finish generation Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
Avital Oliver authored
-
joerenner authored
* changing find_batch_size to work with tokenizer outputs trainer_pt_utils.find_batch_size does not recognize the batch size of BatchEncoding objects. This can cause an error when a trainer relies on find_batch_size to report the number of observed examples in the evaluation loop. * Trigger CI Co-authored-by: jrenner <joseph.renner@inria.fr>
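A minimal sketch of the kind of lookup described above, assuming the batch is a nested structure of lists, tuples, mapping-like containers and array-like leaves with a `shape` attribute. The function `find_batch_size_sketch` and the `FakeTensor` class are hypothetical stand-ins, not the code in `trainer_pt_utils`; a `UserDict` stands in for a tokenizer output.

```python
from collections import UserDict
from collections.abc import Mapping


class FakeTensor:
    """Hypothetical stand-in for a tensor: it only carries a shape."""

    def __init__(self, shape):
        self.shape = shape


def find_batch_size_sketch(batch):
    """Return the first dimension of the first array-like leaf, or None."""
    if isinstance(batch, (list, tuple)):
        for item in batch:
            result = find_batch_size_sketch(item)
            if result is not None:
                return result
    elif isinstance(batch, Mapping):
        # Covers plain dicts and dict-like wrappers such as UserDict subclasses.
        for value in batch.values():
            result = find_batch_size_sketch(value)
            if result is not None:
                return result
    elif hasattr(batch, "shape") and len(batch.shape) >= 1:
        return batch.shape[0]
    return None


encoding = UserDict({"input_ids": FakeTensor((8, 128))})
print(find_batch_size_sketch(encoding))
```

Checking against `Mapping` rather than `dict` is what lets dict-like wrappers be recognized in this sketch.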
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * change dataclasses to flax ones * fix typo * fix jitted tests * fix bert & electra
-
talkhaldi authored
* Correcting comments to reflect correct tuple order In order to match the actual order (line 513 and 516, and as accessed in 968), I've changed the order mentioned in comments L962 and L966-967. * Update modeling_t5.py Updating another comment as well * Removing extra space * Fixing style and quality * style & quality * Update src/transformers/models/t5/modeling_t5.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
Daniel Stancl authored
* Fix Bart * Fix Blenderbot{,_small} * Fix LED * Fix Marian * Fix MBart * Fix Pegasus * Fix T5 * Add test for generation with head_mask * Add a common TF test * Override a test for the LED model as head masking is not yet properly implemented * Remove all head_masks from input preparation for LED * Drop masking for T5 as it needs a bit of refactor
-
francescorubbo authored
The feature extractor does not create tensors on the appropriate device, so we call `ensure_tensor_on_device` before feeding the processed inputs to the model.
-
- 25 May, 2021 9 commits
-
-
Ahmet Akkoç authored
-
Stas Bekman authored
* create custom model on the fly * better wording * add update_from_string * cleanup * cleanup * Update src/transformers/configuration_utils.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * more bool options * style * fix logger * add test * add the doc * assert on conflict of options Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
* fix overflow in perplexity calc * use inf * fix
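One way to guard a perplexity computation against overflow is sketched below. It assumes perplexity is computed as `exp(loss)`; the helper name `safe_perplexity` is hypothetical and the actual change in the commit may differ.

```python
import math


def safe_perplexity(eval_loss):
    """Return exp(eval_loss), falling back to infinity on overflow."""
    try:
        return math.exp(eval_loss)
    except OverflowError:
        # math.exp raises OverflowError for very large inputs;
        # report an infinite perplexity instead of crashing.
        return float("inf")


print(safe_perplexity(0.0))
print(safe_perplexity(1000.0))
```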
-
Patrick von Platen authored
* first try * finish
-
Sylvain Gugger authored
* Add option to log only once in multinode training * Use an alternate property
-
Wang Ran (汪然) authored
-
Shiro T authored
-
Lysandre Debut authored
-
Lysandre Debut authored
-
- 24 May, 2021 7 commits
-
-
Sylvain Gugger authored
* [Trainer] Report both steps and num samples per second * Fix batch number * Update src/transformers/trainer_utils.py Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * Address review comments Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com>
-
Nick Lane-Smith authored
* typo2 * fix typo
-
Teven authored
* fixing flos bug/typo in non-distributed setting * storing flos every logging_interval
-
Sylvain Gugger authored
* Switch mem metrics flag * Update src/transformers/training_args.py Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com>
-
Sylvain Gugger authored
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * change pytorch import to flax import
-
Lysandre Debut authored
-
- 22 May, 2021 1 commit
-
-
ctheodoris authored
get_length_grouped_indices() in LengthGroupedSampler and DistributedLengthGroupedSampler is prohibitively slow for a large number of megabatches (in a test case it takes hours for ~270k megabatches with 100 items each) due to slow list concatenation with sum(megabatches, []). Resolves: #11795 Co-authored-by: ctheodoris <cvtheodo@ds.dfci.harvard.edu>
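The slowdown comes from `sum(megabatches, [])` building a new list on every addition, which makes the total cost quadratic in the number of items. A minimal sketch of a linear-time alternative follows; the function names are hypothetical, and `itertools.chain.from_iterable` is one common replacement, not necessarily the one used in the commit.

```python
from itertools import chain


def flatten_with_sum(megabatches):
    # Quadratic: each "+" copies the whole accumulated list again.
    return sum(megabatches, [])


def flatten_with_chain(megabatches):
    # Linear: every item is visited exactly once.
    return list(chain.from_iterable(megabatches))


megabatches = [[0, 1, 2], [3, 4], [5]]
print(flatten_with_chain(megabatches) == flatten_with_sum(megabatches))
```

Both functions return the same flat list; only the cost of producing it differs as the number of megabatches grows.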
-
- 21 May, 2021 7 commits
-
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * add flax glue link
-
Stas Bekman authored
* support zero.Init in from_config * no need for eval test
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * correct best seed for flax fine-tuning Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
Sylvain Gugger authored
-
yujun authored
-
Lysandre Debut authored
-
Patrick von Platen authored
* speed up flax glue * remove unnecessary line * remove folder * remove run in loop Co-authored-by: Patrick von Platen <patrick@huggingface.co>
-
- 20 May, 2021 2 commits
-
-
Keren Fuentes authored
* add separator for windows * fixes test_is_copy_consistent on Windows * fixing writing encoding issue on extended test (for Windows) * resolving comments
-
Michael Benayoun authored
Cleaner and more scalable implementation of symbolic tracing with torch.fx, with support for new architectures: ALBERT, DistilBERT, MobileBERT, MegatronBERT, GPT2, GPT Neo. Co-authored-by: Michael Benayoun <michael@huggingface.co>
-