- 23 May, 2022 1 commit
-
-
Loubna Ben Allal authored
* average loss over batches and accumulated steps for tracking
* fix layernorm weight decay
* use AdamW from Pytorch instead of Transformers
* add shuffling of sequences inside the batches
* add shuffling of sequences inside the batches
* add logging dir and reformat code
* fix lr tracking
* remove Mistral scaling
* keep Mistral scaling
* reformat code
* fix error
* fix error
* use shuffling function from Pytorch
* remove argument for shuffling batch sequences as it isn't optional
* update package versions and install accelerate from source
* remove unused package
* Update loss average over accumulated steps
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update loss average over accumulated steps
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* use one shuffle buffer argument
* compute avg_loss in one line
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
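The layernorm weight-decay fix and the switch to torch.optim.AdamW in this commit come down to parameter grouping. A minimal sketch of the pattern; the parameter-name patterns and the stand-in model are illustrative, not the exact ones used in the CodeParrot training script:

```python
import torch
from torch.optim import AdamW

def get_grouped_params(model, weight_decay=0.1, no_decay=("bias", "ln_", "norm")):
    """Put biases and LayerNorm weights in a group with weight_decay=0.0."""
    decay_params, no_decay_params = [], []
    for name, param in model.named_parameters():
        if any(pattern in name.lower() for pattern in no_decay):
            no_decay_params.append(param)
        else:
            decay_params.append(param)
    return [
        {"params": decay_params, "weight_decay": weight_decay},
        {"params": no_decay_params, "weight_decay": 0.0},
    ]

model = torch.nn.TransformerEncoderLayer(d_model=64, nhead=4)  # stand-in model
optimizer = AdamW(get_grouped_params(model), lr=5e-4)
```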
-
- 19 May, 2022 1 commit
-
-
ddobokki authored
-
- 18 May, 2022 3 commits
-
-
Zachary Mueller authored
Fix metric calculation in examples and setup tests to run on multi-gpu for no_trainer scripts (#17331)
* Fix length in no_trainer examples
* Add setup and teardown
* Use new accelerator config generator to automatically make tests able to run based on environment
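The "fix length" part concerns distributed evaluation: the sampler may duplicate samples in the last batch so every process gets the same count, and the gathered duplicates have to be dropped before scoring. A rough sketch of that kind of loop, with illustrative names (`metric`, `eval_dataset_len`, and the function signature are assumptions, not the scripts' exact code):

```python
import torch
from accelerate import Accelerator

accelerator = Accelerator()

def evaluate(model, eval_dataloader, metric, eval_dataset_len):
    model.eval()
    samples_seen = 0
    for step, batch in enumerate(eval_dataloader):
        with torch.no_grad():
            outputs = model(**batch)
        predictions = outputs.logits.argmax(dim=-1)
        predictions, references = accelerator.gather((predictions, batch["labels"]))
        # The last batch may contain duplicated samples added for even sharding;
        # truncate to the true dataset length before scoring.
        if accelerator.num_processes > 1 and step == len(eval_dataloader) - 1:
            predictions = predictions[: eval_dataset_len - samples_seen]
            references = references[: eval_dataset_len - samples_seen]
        samples_seen += references.shape[0]
        metric.add_batch(predictions=predictions, references=references)
    return metric.compute()
```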
-
Sylvain Gugger authored
-
mraunak authored
* Add information gain filtration algorithm
* Complying with black requirements
* Added author
* Fixed import order
* flake8 corrections
Co-authored-by: Javier Turek <javier.turek@intel.com>
-
- 17 May, 2022 1 commit
-
-
regisss authored
- Add --ignore_mismatched_sizes argument to classification examples
- Expand the error message when loading a model whose head dimensions are different from expected dimensions
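The new argument is forwarded to `from_pretrained`, which already supports it; a short illustration (the checkpoint and label count are made up):

```python
from transformers import AutoModelForSequenceClassification

# Loading a checkpoint whose classification head was trained with a different
# number of labels normally raises a size-mismatch error; with
# ignore_mismatched_sizes=True the mismatched head is re-initialized instead.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",  # illustrative checkpoint
    num_labels=10,
    ignore_mismatched_sizes=True,
)
```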
-
- 16 May, 2022 3 commits
-
-
Loubna Ben Allal authored
* add pretokenization arguments
* add pretokenization script
* add support for pretokenized data
* reformat code
* fix run command for training
* fix model call from config
* remove a package
* add comments on pretokenization in the readme
* remove explicit parallelization
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme - remove username
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme - remove username
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* keep data parallelization
* reformat code
* reformat code
* update readme
* reformat code
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
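Pretokenization here means running the tokenizer once over the whole corpus and saving the result, so training only has to stream token ids. A minimal sketch of the idea; the data file, column name, and output path are assumptions, not the actual CodeParrot configuration:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # illustrative tokenizer
dataset = load_dataset("json", data_files="code_files.jsonl", split="train")

def tokenize(examples):
    # Keep only the token ids; the raw text columns are dropped via remove_columns.
    return {"input_ids": tokenizer(examples["content"], truncation=False)["input_ids"]}

tokenized = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
tokenized.save_to_disk("tokenized-data")
```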
-
Loubna Ben Allal authored
* add new preprocessing arguments
* add new filters
* add new filters to readme
* fix config and test count, update function names and docstrings
* reformat code
* update readme
* Update readme
* rename config_test filter
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename few_assignments filter
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename tokenizer in arguments
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename functions and add limit_line argument for config_test filter
* update threshold for config_test filter
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
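The config_test and few_assignments filters are heuristics over raw source files. The sketch below shows the general shape of such a filter; the keywords, the `limit_line` default, and the threshold are illustrative, not the values used in the CodeParrot preprocessing script:

```python
def is_config_or_test(example, limit_line=5, assignment_threshold=0.05):
    """Heuristically flag files that look like configuration or tests.

    Only the first `limit_line` lines are scanned for keywords, and files with
    a very low ratio of assignment statements are also flagged.
    """
    lines = example["content"].splitlines()
    if not lines:
        return True
    head = " ".join(lines[:limit_line]).lower()
    if "config" in head or "test" in head:
        return True
    n_assignments = sum("=" in line for line in lines)
    return n_assignments / len(lines) < assignment_threshold

# Usage with datasets: dataset.filter(lambda ex: not is_config_or_test(ex))
```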
-
Kenneth Enevoldsen authored
* fixed bug in run_mlm_flax_stream.py
  Fixed a bug caused by an update to the tokenizer keys introduced in recent transformers versions (between `4.6.2` and `4.18.0`), where additional keys were added to the tokenizer output.
* Update run_mlm_flax_stream.py
* adding missing parenthesis
* formatted to black
* remove cols from dataset instead
* reformat to black
* moved rem. columns to map
* formatted to black
Co-authored-by: KennethEnevoldsen <kennethcenevolsen@gmail.com>
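The "remove cols from dataset instead / moved rem. columns to map" part amounts to dropping the raw columns inside `map`, so whatever extra keys a newer tokenizer emits don't leak into the batches. A hedged sketch, assuming a datasets version where `IterableDataset.map` accepts `remove_columns`; the dataset, tokenizer, and column names are illustrative:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")  # illustrative tokenizer
dataset = load_dataset("oscar", "unshuffled_deduplicated_en", split="train", streaming=True)

def tokenize(examples):
    return tokenizer(examples["text"], return_special_tokens_mask=True)

# Dropping the original columns in map keeps only what the data collator
# expects, regardless of which extra keys the tokenizer version returns.
tokenized = dataset.map(tokenize, batched=True, remove_columns=["text", "id"])
```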
-
- 12 May, 2022 2 commits
-
-
Sylvain Gugger authored
* Black preview
* Fixup too!
* Fix check copies
* Use the same version as the CI
* Bump black
-
Lysandre Debut authored
-
- 09 May, 2022 1 commit
-
-
Zachary Mueller authored
-
- 04 May, 2022 4 commits
-
-
Zachary Mueller authored
-
dependabot[bot] authored
Bumps [notebook](http://jupyter.org) from 6.4.1 to 6.4.10.
---
updated-dependencies:
- dependency-name: notebook
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
dependabot[bot] authored
Bumps [notebook](http://jupyter.org) from 6.4.1 to 6.4.10.
---
updated-dependencies:
- dependency-name: notebook
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
Thomas Wang authored
-
- 03 May, 2022 1 commit
-
-
Pavel Belevich authored
-
- 02 May, 2022 3 commits
-
-
Zachary Mueller authored
* Update all examples to properly calculate progress bar
-
Zachary Mueller authored
* Propagate and fix imports
-
yujun authored
* add torch.no_grad when in eval mode
* make style quality
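The change wraps the forward pass of the evaluation loop in `torch.no_grad()`. A minimal sketch of the pattern; the loop and variable names are illustrative:

```python
import torch

def evaluate(model, dataloader):
    model.eval()
    losses = []
    for batch in dataloader:
        # No gradients are needed at evaluation time; this saves memory
        # and avoids building the autograd graph.
        with torch.no_grad():
            outputs = model(**batch)
        losses.append(outputs.loss.detach())
    return torch.stack(losses).mean()
```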
-
- 28 Apr, 2022 2 commits
-
-
Zachary Mueller authored
-
conan1024hao authored
* Add parameter --config_overrides for run_mlm_wwm.py
* linter
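In the other run_mlm scripts this argument is applied with `PretrainedConfig.update_from_string`; presumably the same mechanism is used here. A short sketch of how such an override is consumed (checkpoint and override values are illustrative):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("bert-base-uncased")  # illustrative checkpoint
# --config_overrides takes a comma-separated list of key=value pairs;
# update_from_string applies them to the loaded config.
config.update_from_string("hidden_dropout_prob=0.2,attention_probs_dropout_prob=0.2")
```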
-
- 27 Apr, 2022 5 commits
-
-
Zachary Mueller authored
* Fixup all examples
-
Sylvain Gugger authored
* Fix multiple deletions of the same files in save_pretrained
* Add is_main_process argument
-
Leonid Boytsov authored
1. Fixes evaluation errors popping up when you train/eval on SQuAD v2 (one was newly encountered, and one was previously reported in "Running SQuAD 1.0 sample command raises IndexError #15401" but not completely fixed).
2. Removes boolean arguments that don't use store_true. Please don't use these: ANY non-empty string is converted to True in this case, which is clearly not the desired behavior (and it creates a LOT of confusion).
3. All no-trainer test scripts now save metric values in the same way (with the right `eval_` prefix), which is consistent with the trainer-based versions.
4. Adds forgotten model.eval() in the no-trainer versions. This improved some results, but not all (see the F1 scores and the discussion below).
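Point 2 is the classic argparse pitfall; a small illustration of why `type=bool` misbehaves (the `--buggy_flag` name is made up, `--version_2_with_negative` is the SQuAD v2 switch the example scripts use):

```python
import argparse

parser = argparse.ArgumentParser()
# Anti-pattern: argparse calls bool() on the raw string, and any non-empty
# string ("False", "0", "no") is truthy, so the flag silently becomes True.
parser.add_argument("--buggy_flag", type=bool, default=False)
# Preferred: an explicit on/off switch.
parser.add_argument("--version_2_with_negative", action="store_true")

args = parser.parse_args(["--buggy_flag", "False"])
print(args.buggy_flag)               # True, not False
print(args.version_2_with_negative)  # False unless the flag is passed
```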
-
NielsRogge authored
* Add first draft
* Improve script and README
* Improve README
* Apply suggestions from code review
* Improve script, add link to resulting model
* Add corresponding test
* Adjust learning rate
-
Anton Lozhkov authored
* Avoid repeated per-lang filtering
* Language groups and logits preprocessing
* Style
-
- 25 Apr, 2022 2 commits
-
-
-
Sanchit Gandhi authored
-
- 21 Apr, 2022 1 commit
-
-
Loubna Ben Allal authored
* add tflops logging and fix grad accumulation
* add accelerate tracking and checkpointing
* scale loss of last batch correctly
* fix typo
* compress loss computation
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* add resume from checkpoint argument
* add load_state accelerate from checkpoint, register lr scheduler and add tflops function
* reformat code
* reformat code
* add condition on path for resume checkpoint
* combine if conditions
  Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* add source for tflops formula
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
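The checkpointing side of this commit relies on Accelerate's state saving. A minimal sketch of save/resume with a registered scheduler; the paths and helper names are illustrative, not the CodeParrot script's own:

```python
import os
from accelerate import Accelerator

accelerator = Accelerator()
# model, optimizer, dataloader, lr_scheduler = accelerator.prepare(...)  # done elsewhere
# accelerator.register_for_checkpointing(lr_scheduler)  # if the scheduler isn't passed to prepare()

output_dir = "checkpoints"  # illustrative path

def save_checkpoint(step):
    accelerator.save_state(os.path.join(output_dir, f"step_{step}"))

def maybe_resume(resume_from_checkpoint):
    # Only resume when a checkpoint directory actually exists.
    if resume_from_checkpoint and os.path.isdir(resume_from_checkpoint):
        accelerator.load_state(resume_from_checkpoint)
```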
-
- 20 Apr, 2022 1 commit
-
-
Zachary Mueller authored
-
- 19 Apr, 2022 5 commits
-
-
Jeevesh Juneja authored
* Correct logging of eval metric to TensorBoard
  An empty dictionary ``eval_metrics`` was being logged; it is replaced by ``eval_metric``, which is the output dictionary of ``metric.compute()``.
* Remove unused variable
-
NielsRogge authored
* Add first draft
* Improve README and run fixup
* Make script aligned with other scripts, improve README
* Improve script and add test
* Remove print statement
* Apply suggestions from code review
* Add num_labels to make test pass
* Improve README
-
Wonjae Kim authored
-
Suraj Patil authored
* begin do_init
* add params_shape_tree
* raise error if params are accessed when do_init is False
* don't allow do_init=False when keys are missing
* make shape tree a property
* assign self._params at the end
* add test for do_init
* add do_init arg to all flax models
* fix param setting
* disable do_init for composite models
* update test
* add do_init in FlaxBigBirdForMultipleChoice
* better names and errors
* improve test
* style
* add a warning when do_init=False
* remove extra if
* set params after _required_params
* add test for from_pretrained
* do_init => _do_init
* change warning to info
* fix typo
* add params in init_weights
* add params to gpt neo init
* add params to init_weights
* update do_init test
* Trigger CI
* Apply suggestions from code review
  Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* update template
* trigger CI
* style
* style
* fix template
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
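The new flag lets Flax models load without materializing randomly initialized weights. As far as the commit titles suggest, the resulting usage pattern looks roughly like this; the checkpoint name and input shape are illustrative:

```python
import jax.numpy as jnp
from transformers import FlaxBertModel

# With _do_init=False, from_pretrained returns the module and the parameters
# separately and skips the random initialization of any missing weights.
model, params = FlaxBertModel.from_pretrained("bert-base-uncased", _do_init=False)

# Parameters are then handled explicitly and passed at call time.
input_ids = jnp.ones((1, 8), dtype="i4")
outputs = model(input_ids, params=params)
```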
-
NielsRogge authored
* Add first draft from previous PR
* First draft
* Improve README and remove num_labels
* Make script more aligned with other scripts
* Improve README and apply suggestion from code review
-
- 15 Apr, 2022 1 commit
-
-
NielsRogge authored
-
- 14 Apr, 2022 1 commit
-
-
NielsRogge authored
* Improve README
* Make dataset_name argument optional
* Improve local data
* Fix bug
* Improve README some more
* Apply suggestions from code review
* Improve README
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
-
- 13 Apr, 2022 2 commits
-
-
Zachary Mueller authored
* Change tracking to store_true
* Remove step param and use it in the log dictionary directly
* use vars(args) when passing args to init_trackers
* Include tracking tests since tensorboard is already a dep
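These bullets map onto Accelerate's built-in trackers. A hedged sketch of the pattern; the project name, logging directory, and logged values are illustrative, and the `logging_dir` argument name has varied between accelerate versions:

```python
import argparse
from accelerate import Accelerator

parser = argparse.ArgumentParser()
parser.add_argument("--with_tracking", action="store_true")  # store_true, as in the commit
parser.add_argument("--learning_rate", type=float, default=5e-5)
args = parser.parse_args()

accelerator = (
    Accelerator(log_with="tensorboard", logging_dir="runs")
    if args.with_tracking
    else Accelerator()
)

if args.with_tracking:
    # vars(args) turns the namespace into a plain dict of hyperparameters.
    accelerator.init_trackers("example_project", config=vars(args))
    accelerator.log({"train_loss": 0.0, "epoch": 0}, step=0)
```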
-
Tu Vu authored
* Add self-training code for text-classification
* Add self-training code for text-classification
* Add self-training code for text-classification
* Add self-training code for text-classification
* Add self-training code for text-classification
* Delete strata
-