- 26 Apr, 2021 1 commit

Patrick von Platen authored
- 23 Apr, 2021 2 commits

Sylvain Gugger authored
* Initial support for upload to hub
* push -> upload
* Fixes + examples
* Fix torchhub test
* Torchhub test I hate you
* push_model_to_hub -> push_to_hub
* Apply mixin to other pretrained models
* Remove ABC inheritance
* Add tests
* Typo
* Run tests
* Install git-lfs
* Change approach
* Add push_to_hub to all
* Staging test suite
* Typo
* Maybe like this?
* More deps
* Cache
* Adapt name
* Quality
* MOAR tests
* Put it in testing_utils
* Docs + torchhub last hope
* Styling
* Wrong method
* Typos
* Update src/transformers/file_utils.py
* Address review comments
* Apply suggestions from code review

Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
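In practice the new mixin boils down to a one-liner; a minimal sketch, assuming you are logged in via `huggingface-cli login` and using an illustrative repo name (the same `push_to_hub` method is mixed into tokenizers and configs as well):
```
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-cased")
model.push_to_hub("my-finetuned-bert")  # creates/updates the repo on the Hub
```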
Teven authored
* Fixed trainer total_flos reloading in distributed mode
* Logging flos at the end of training
- 22 Apr, 2021 1 commit

Sylvain Gugger authored
* Fix Trainer with remove_unused_columns=False
* Typo
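For context, a hedged sketch of the flag this fix touches: with `remove_unused_columns=False` the Trainer passes every dataset column through instead of dropping the ones the model's `forward()` does not accept.
```
from transformers import TrainingArguments

# Keep all dataset columns (useful when a custom data collator needs extra
# fields that would otherwise be filtered out).
args = TrainingArguments(output_dir="out", remove_unused_columns=False)
```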
- 21 Apr, 2021 1 commit

Stas Bekman authored
This PR fixes a bug that was most likely exposed (not caused) by https://github.com/huggingface/transformers/pull/11318; surprisingly, the same test worked just fine before that other PR.
- 20 Apr, 2021 2 commits

Sylvain Gugger authored
* Update to use the datasets remove_columns method
* Quality
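The `datasets` method the Trainer now delegates to; dataset and column names below are illustrative:
```
from datasets import load_dataset

ds = load_dataset("glue", "mrpc", split="train")
ds = ds.remove_columns(["idx"])  # drops the column outright, rather than hiding it via set_format
```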
Sylvain Gugger authored
- 19 Apr, 2021 2 commits

Sylvain Gugger authored

Stas Bekman authored
* fix the placement on device with fp16_full_eval
* deepspeed never goes on device
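A hedged sketch of the flag involved: `fp16_full_eval` casts the model to fp16 for evaluation, and after this fix the Trainer leaves device placement to DeepSpeed when DeepSpeed is in charge.
```
from transformers import TrainingArguments

# Run evaluation entirely in fp16; under DeepSpeed the engine, not the Trainer,
# moves the model to the device.
args = TrainingArguments(output_dir="out", fp16_full_eval=True)
```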
- 16 Apr, 2021 1 commit

Sylvain Gugger authored
* Bulk of the work
* Polish and tests
* Update QA Trainer
* Avoid breaking the predict method
* Deprecation warnings
* Store real eval dataloader
* Get eval dataset reference before wrap
- 15 Apr, 2021 1 commit

Sylvain Gugger authored
- 14 Apr, 2021 1 commit

Sylvain Gugger authored
* IterableDatasetShard
* Test and integration in Trainer
* Update src/transformers/trainer_pt_utils.py
* Style

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
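A minimal sketch of how the shard wrapper is meant to be used (constructor arguments as of this PR; see `trainer_pt_utils` for the authoritative signature): each process consumes the same stream but keeps only its own slice of every global batch.
```
from torch.utils.data import IterableDataset
from transformers.trainer_pt_utils import IterableDatasetShard

class Stream(IterableDataset):
    def __iter__(self):
        yield from range(32)

# Process 0 of 2: keeps the first batch_size samples of every
# batch_size * num_processes chunk of the stream.
shard = IterableDatasetShard(Stream(), batch_size=4, num_processes=2, process_index=0)
print(list(shard))
```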
- 08 Apr, 2021 4 commits

Stas Bekman authored
* make fairscale and deepspeed setup extras
* fix default
* Apply suggestions from code review
* no reason not to ask for the good version
* update the CIs

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
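With these extras in place, the matching dependency versions can be pulled in directly, e.g. `pip install transformers[deepspeed]` or `pip install transformers[fairscale]` (extras names as added by this PR).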
Stas Bekman authored
* solve "scheduler before optimizer step" warning
* style
* correct the state evaluation test
Stas Bekman authored
* synced gpus
* fix
* need to use t5-small for quality tests
* notes
* complete merge
* fix a disappearing std stream problem
* start zero3 tests
* wip
* tune params
* sorting out the pre-trained model loading
* reworking generate loop wip
* style
* fix tests
* split the tests
* refactor tests
* parameterized
* work out the resume from non-ds checkpoint pass + test
* cleanup
* remove no longer needed code
* split getter/setter functions
* complete the docs
* suggestions
* gpus and their compute capabilities link
* Apply suggestions from code review
* remove invalid paramgd
* automatically configure zero3 params that rely on hidden size
* make _get_resized_embeddings zero3-aware
* add test exercising resize_token_embeddings()
* add docstring

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
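A hedged sketch of the hidden-size-derived values mentioned in the last bullets (formulas as documented for the integration around this time; treat them as illustrative, not the code itself):
```
h = 1024  # e.g. model.config.hidden_size

# ZeRO-3 settings the integration can now fill in automatically:
ds_zero3 = {
    "zero_optimization": {
        "stage": 3,
        "reduce_bucket_size": h * h,
        "stage3_prefetch_bucket_size": int(0.9 * h * h),
        "stage3_param_persistence_threshold": 10 * h,
    }
}
```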
Jannis Born authored
* fix: docstrings in prediction_step
* ci: satisfy line length requirements
* ci: character length requirements
- 31 Mar, 2021 2 commits

Sylvain Gugger authored
* Replace is_sagemaker_distributed_available
* Merge SageMakerTrainer into Trainer
* Test with shorter condition
* Put back deleted line
* Deprecate SageMakerTrainer and SageMakerTrainingArguments
* Apply suggestions from code review

Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
Sylvain Gugger authored
* First third
* Styling and fix mistake
* Quality
* All the rest
* Treat %s and %d
* Typo
* Missing )
* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
- 29 Mar, 2021 1 commit

pcuenca authored
A new argument `length_column_name` has been added to `TrainingArguments`, with default value `"length"`. If this column exists and `group_by_length` is `True`, the train sampler will use it for grouping rather than computing the lengths before training starts. This optimization lets the user prepare data for fast processing, avoiding sequential access to the dataset as described in issue #10909.
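A short sketch under the new argument (the column name shown is the default; the `map` calls are an illustrative way to precompute it):
```
from datasets import load_dataset
from transformers import AutoTokenizer, TrainingArguments

tok = AutoTokenizer.from_pretrained("bert-base-cased")
ds = load_dataset("glue", "mrpc", split="train")
ds = ds.map(lambda ex: tok(ex["sentence1"]))
ds = ds.map(lambda ex: {"length": len(ex["input_ids"])})  # precomputed once

args = TrainingArguments(
    output_dir="out",
    group_by_length=True,
    length_column_name="length",  # the default value, shown for clarity
)
```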
- 24 Mar, 2021 1 commit

imzhengzx authored
The original code in line 246 is:
```
tokenizer: Optional["PreTrainedTokenizerBase"] = None,
```
It should be:
```
tokenizer: Optional[PreTrainedTokenizerBase] = None,
```
- 23 Mar, 2021 1 commit

Bhadresh Savani authored
- 22 Mar, 2021 2 commits

Ruan Chaves authored
* Modify the _hp_search_setup method on the Trainer class to handle the wandb argument passed by Ray Tune to the model config.
* Reformat single quotes as double quotes.
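For context, the code path being fixed is reached through `Trainer.hyperparameter_search` with the Ray backend; a hedged sketch (trial budget and search space are illustrative, and `trainer` is assumed to have been built with `model_init=...`):
```
from ray import tune

# Ray Tune adds bookkeeping keys such as "wandb" to the trial config; after this
# fix, _hp_search_setup skips them instead of trying to set them on model.config.
best = trainer.hyperparameter_search(
    backend="ray",
    n_trials=4,
    hp_space=lambda _: {"learning_rate": tune.loguniform(1e-6, 1e-4)},
)
```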
Sidd Karamcheti authored
Add a simple one-character fix so that on_step_begin and on_step_end are called at the right times (#10839)
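The two hooks in question, shown on a minimal custom callback (a plain `TrainerCallback` subclass, not code from the PR itself):
```
from transformers import TrainerCallback

class StepLogger(TrainerCallback):
    def on_step_begin(self, args, state, control, **kwargs):
        # fired once per optimization step, before the forward pass
        pass

    def on_step_end(self, args, state, control, **kwargs):
        # fired after the optimizer update, so state.global_step is current
        if state.global_step % 100 == 0:
            print(f"reached step {state.global_step}")
```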
- 18 Mar, 2021 1 commit

Sylvain Gugger authored
* Fix distributed evaluation
* Use logger
- 17 Mar, 2021 3 commits

Mansi Mane authored
* Added debug prints and config
* Added extra samples to SequentialDistributedSampler and updated its call
* Made predictions and labels a multiple of batch size
* Updated number of microbatches
* Removed extra debug prints
* Made start_remainder similar to DistributedSamplerWithLoop
* Minor spacing update
* Test and styling
* Rename test

Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>
Stas Bekman authored

Stas Bekman authored
* deepspeed checkpoint loading code plus tests
* style
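Usage-wise this surfaces through the standard resume path; a hedged example with an illustrative checkpoint directory, assuming `trainer` was configured with a DeepSpeed config:
```
# Resume a DeepSpeed run from a Trainer checkpoint folder; the engine state
# (optimizer, scheduler, fp16 state) is restored alongside the weights.
trainer.train(resume_from_checkpoint="output/checkpoint-500")
```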
- 16 Mar, 2021 2 commits

Cheng Li authored
* pass hf optimizer and scheduler to deepspeed if not specified in ds config
* update
* make init_deepspeed support config dict
* fix docstring formatting
* clean up trainer's comments
* add new tests
* fix type
* composite argparse doesn't work
* style
* add a new test, rename others
* document new functionality
* complete tests, add docs
* correct level
* Apply suggestions from code review
* add new methods to the doc
* must tell DS we are using a non-native optimizer
* add protection against cpu_offload + HF optimizer combo
* fix the cli overrides
* sync docs + tests
* restore AdamW
* better docs
* need new version
* no longer needed
* remove outdated information
* refactor duplicated code

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
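A hedged sketch of the new behaviour (config values illustrative): when the DeepSpeed config omits the `optimizer` and `scheduler` sections, the integration now hands DeepSpeed the Trainer's own ones, and per the bullets above the config may be supplied as a dict rather than only a JSON file path.
```
from transformers import TrainingArguments

ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 2},
    # no "optimizer"/"scheduler" keys: the Trainer's AdamW and LR schedule
    # are passed through to DeepSpeed instead.
}

args = TrainingArguments(output_dir="out", deepspeed=ds_config)
```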
Sylvain Gugger authored
* Add DistributedSamplerWithLoop
* Fix typo
* Test and small fix
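The idea behind the "loop", sketched outside the class itself: indices are topped up from the start of the (shuffled) index list until the count divides evenly, instead of padding with repeats of the final batch.
```
indices = list(range(10))                         # dataset indices for this epoch
per_step = 4 * 2                                  # batch_size * num_replicas
total = -(-len(indices) // per_step) * per_step   # round up to a multiple
indices += indices[: total - len(indices)]        # wrap around to the beginning
assert len(indices) % per_step == 0
```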
- 15 Mar, 2021 2 commits

Sylvain Gugger authored
* Handle save differently
* Missing imports
* Fix typo
* Adapt to recent changes in save_pretrained
* Forgotten brackets
* Optimizer load
* Fix world size
* Deal with None
* Remove needless self
Sylvain Gugger authored
- 12 Mar, 2021 1 commit

Sylvain Gugger authored
* Add auto_wrap option in fairscale integration
* Style
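Hedged usage (option spelling as documented for the fairscale integration around this release): `auto_wrap` is combined with one of the ZeRO-DP modes in `sharded_ddp`.
```
from transformers import TrainingArguments

# Wrap submodules automatically for fully sharded data parallelism.
args = TrainingArguments(output_dir="out", sharded_ddp="zero_dp_3 auto_wrap")
```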
- 11 Mar, 2021 1 commit

Sylvain Gugger authored
- 10 Mar, 2021 1 commit

Philipp Schmid authored
* renamed logging to hf_logging
* changed logging from hf_logging to logging and loggin to native_logging
* removed everything trying to fix the import Trainer error
* adding imports again
* added custom add_handler function to logging.py
* make style
* added remove_handler
* added another conditional to assert
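A minimal sketch of the helpers this adds, assuming the `add_handler`/`remove_handler` names from the bullets above act on the library's root logger:
```
import logging as py_logging
from transformers.utils import logging as hf_logging

handler = py_logging.FileHandler("transformers.log")
hf_logging.add_handler(handler)     # route library logs to a file as well
# ... run training ...
hf_logging.remove_handler(handler)
```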
- 09 Mar, 2021 1 commit

Sylvain Gugger authored
* Hotfix fairscale FSDP
* Evaluation works
* Save on process zero
- 08 Mar, 2021 4 commits

Sylvain Gugger authored
* Check layer types for Optimizer construction
* Duplicate class
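An illustration of the check (not the Trainer's exact code): parameters are grouped for weight decay by inspecting module types, so `LayerNorm` layers are exempt regardless of how their parameters are named.
```
import torch.nn as nn

def decay_groups(model: nn.Module):
    decay, no_decay = [], []
    for module in model.modules():
        for name, param in module.named_parameters(recurse=False):
            # skip decay for normalization layers and all biases
            if isinstance(module, nn.LayerNorm) or name.endswith("bias"):
                no_decay.append(param)
            else:
                decay.append(param)
    return [{"params": decay, "weight_decay": 0.01},
            {"params": no_decay, "weight_decay": 0.0}]
```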
Sylvain Gugger authored
This reverts commit b35e7b68.
Sylvain Gugger authored

Stas Bekman authored
- 05 Mar, 2021 1 commit

Joakim Warholm authored