- 06 Jun, 2024 1 commit
-
-
dependabot[bot] authored
Bump transformers in /examples/research_projects/codeparrot Bumps [transformers](https://github.com/huggingface/transformers) from 4.19.0 to 4.38.0. - [Release notes](https://github.com/huggingface/transformers/releases) - [Commits](https://github.com/huggingface/transformers/compare/v4.19.0...v4.38.0 ) --- updated-dependencies: - dependency-name: transformers dependency-type: direct:production ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 01 May, 2024 1 commit
-
-
dependabot[bot] authored
Bump torch in /examples/research_projects/codeparrot Bumps [torch](https://github.com/pytorch/pytorch) from 1.11.0 to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.11.0...v1.13.1 ) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by:
dependabot[bot] <support@github.com> Co-authored-by:
dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
-
- 21 Jun, 2022 1 commit
-
-
Jia LI authored
* deduplication draft * update style * update style test * dummy test main * rename modules * rename functions * return extremes in deduplicate_clusters * update style * cast str for gzip * update doc string * time processing * use dataset map to compute minhash * fill value for short token * remove da map method * update style * use share object to multiprocess * update style * use f-string and minor fix Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by:
Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com> * update style * use module parameters * change ds_dedup to ds_filter * save ds_dedup * mv test to script tests * make jaccard threshold a parameter of deduplicate_dataset * update style * add doc strings * update style * add doc string for DuplicationIndex * save files into data dir * update readme * Update examples/research_projects/codeparrot/README.md Co-authored-by:
Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com> * make near deduplication optional * move near deduplication in README * Update examples/research_projects/codeparrot/README.md Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * use f string Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by:
Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>
-
- 23 May, 2022 1 commit
-
-
Loubna Ben Allal authored
* average loss over batches and accumulated steps for tracking * fix layernorm weight decay * use AdamW from Pytorch instead of Transformers * add shuffling of sequences inside the batches * add shuffling of sequences inside the batches * add logging dir and reformat code * fix lr tracking * remove Mistral scaling * keep Mistral scaling * reformat code * fix error * fix error * use shuffling function from Pytorch * remove argument for shuffling batch sequences as it isn't optional * update package versions and install accelerate from source * remove unused package * Update loss average over accumulated steps Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * Update loss average over accumulated steps Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * use one shuffle buffer argument * compute avg_loss in one line Co-authored-by:
Loubna ben allal <loubnabenallal@gmail.com> Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com>
-
- 11 Apr, 2022 1 commit
-
-
Jia LI authored
* add simple multi gpu complet * add human_eval_multi_gpu * use copy strategy to distribute across gpu, to avoid padding * add doc string * update code style * use task id to arrange output * truncate input to avoid zero pad * Stop the copy mechanism * update style * restore copies to scale better in distributed mode * update style * replace human eval * Apply suggestions from code review 1. Tokenize all input at the same time 2. use attention_mask to get the input length 3. other small fixes Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * correct typo and update docstring * update code style * remove num sample division constraint * remove max len calculation * use accelerator.gather once to speed up * use accelerate set_seed; update accelerate version * correct gather bug Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com>
-
- 23 Dec, 2021 1 commit
-
-
Leandro von Werra authored
-
- 13 Dec, 2021 1 commit
-
-
Nathan Cooper authored
* Add some nicety flags for better controlling evaluation. * Fix dependency issue with outdated requirement * Add additional flag to example to ensure eval is done * Wrap code into main function for accelerate launcher to find * Fix valid batch size flag in readme * Add note to install git-lfs when initializing/training the model * Update examples/research_projects/codeparrot/scripts/arguments.py Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * Update examples/research_projects/codeparrot/README.md Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * Revert "Wrap code into main function for accelerate launcher to find" This reverts commit ff11df1c810d4df198d04b827538eb4572147ba3. * Fix formatting issue * Move git-lfs instructions to installation section * Add a quick check before code generation for code evaluation * Fix styling issue * Update examples/research_projects/codeparrot/scripts/human_eval.py Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> * Make iterable dataset use passed in tokenizer rather than globally defined one Co-authored-by:
Leandro von Werra <lvwerra@users.noreply.github.com> Co-authored-by:
ncoop57 <nac33@students.uwf.edu>
-
- 02 Dec, 2021 1 commit
-
-
Leandro von Werra authored
* add readme skeleton * update readme * add initialization script * add deduplication script * add codeparrot training script * add code generation evaluation * add validation loss script * add requirements * update readme * tweak readme * make style * add highlights to readme * add CLIs to scripts * add tokenizer training script * add docstring to constant length dataset * fix defaults in arguments * update readme with cli * move image to hub * tweaks of readme * fix cli commands * add author * explain env variables * fix formatting * Update examples/research_projects/codeparrot/README.md Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com> * Apply suggestions from code review Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com> * replace generic with gpt2 tokenizer Co-authored-by:
lewtun <lewis.c.tunstall@gmail.com>
-