- 16 Feb, 2024 1 commit
Lysandre Debut authored
* Script & Manual edition
* Update
-
- 27 Jul, 2022 1 commit
Loubna Ben Allal authored
* add info about megatron training
* upload models and datasets from CodeParrot organization
* upload models and datasets from CodeParrot organization
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* fix typo and add comment about codeparrot vs megatron
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
-
- 21 Jun, 2022 1 commit
Jia LI authored
* deduplication draft
* update style
* update style test
* dummy test main
* rename modules
* rename functions
* return extremes in deduplicate_clusters
* update style
* cast str for gzip
* update doc string
* time processing
* use dataset map to compute minhash
* fill value for short token
* remove da map method
* update style
* use share object to multiprocess
* update style
* use f-string and minor fix
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>
* update style
* use module parameters
* change ds_dedup to ds_filter
* save ds_dedup
* mv test to script tests
* make jaccard threshold a parameter of deduplicate_dataset
* update style
* add doc strings
* update style
* add doc string for DuplicationIndex
* save files into data dir
* update readme
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>
* make near deduplication optional
* move near deduplication in README
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* use f string
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna Ben Allal <44069155+loubnabnl@users.noreply.github.com>
-
- 23 May, 2022 1 commit
Loubna Ben Allal authored
* average loss over batches and accumulated steps for tracking
* fix layernorm weight decay
* use AdamW from Pytorch instead of Transformers
* add shuffling of sequences inside the batches
* add shuffling of sequences inside the batches
* add logging dir and reformat code
* fix lr tracking
* remove Mistral scaling
* keep Mistral scaling
* reformat code
* fix error
* fix error
* use shuffling function from Pytorch
* remove argument for shuffling batch sequences as it isn't optional
* update package versions and install accelerate from source
* remove unused package
* Update loss average over accumulated steps
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update loss average over accumulated steps
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* use one shuffle buffer argument
* compute avg_loss in one line
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
-
- 16 May, 2022 2 commits
Loubna Ben Allal authored
* add pretokenization arguments
* add pretokenization script
* add support for pretokenized data
* reformat code
* fix run command for training
* fix model call from config
* remove a package
* add comments on pretokenization in the readme
* remove explicit parallelization
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme -remove username
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* update readme -remove username
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* keep data parallelization
* reformat code
* reformat code
* update readme
* reformat code
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
-
Loubna Ben Allal authored
* add new preprocessing arguments
* add new filters
* add new filters to readme
* fix config and test count, update function names and docstrings
* reformat code
* update readme
* Update readme
* rename config_test filter
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename few_assignments filter
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename tokenizer in arguments
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* rename functions and add limit_line argument for config_test filter
* update threshold for config_test filter
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: Loubna ben allal <loubnabenallal@gmail.com>
-
- 12 May, 2022 1 commit
Sylvain Gugger authored
* Black preview
* Fixup too!
* Fix check copies
* Use the same version as the CI
* Bump black
-
- 21 Apr, 2022 1 commit
Loubna Ben Allal authored
* add tflops logging and fix grad accumulation
* add accelerate tracking and checkpointing
* scale loss of last batch correctly
* fix typo
* compress loss computation
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* add resume from checkpoint argument
* add load_state accelerate from checkpoint, register lr scheduler and add tflops function
* reformat code
* reformat code
* add condition on path for resume checkpoint
* combine if conditions
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* add source for tflops formula
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
-
- 13 Dec, 2021 1 commit
Nathan Cooper authored
* Add some nicety flags for better controlling evaluation.
* Fix dependency issue with outdated requirement
* Add additional flag to example to ensure eval is done
* Wrap code into main function for accelerate launcher to find
* Fix valid batch size flag in readme
* Add note to install git-lfs when initializing/training the model
* Update examples/research_projects/codeparrot/scripts/arguments.py
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Revert "Wrap code into main function for accelerate launcher to find" This reverts commit ff11df1c810d4df198d04b827538eb4572147ba3.
* Fix formatting issue
* Move git-lfs instructions to installation section
* Add a quick check before code generation for code evaluation
* Fix styling issue
* Update examples/research_projects/codeparrot/scripts/human_eval.py
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Make iterable dataset use passed in tokenizer rather than globally defined one
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
Co-authored-by: ncoop57 <nac33@students.uwf.edu>
-
- 02 Dec, 2021 1 commit
Leandro von Werra authored
* add readme skeleton
* update readme
* add initialization script
* add deduplication script
* add codeparrot training script
* add code generation evaluation
* add validation loss script
* add requirements
* update readme
* tweak readme
* make style
* add highlights to readme
* add CLIs to scripts
* add tokenizer training script
* add docstring to constant length dataset
* fix defaults in arguments
* update readme with cli
* move image to hub
* tweaks of readme
* fix cli commands
* add author
* explain env variables
* fix formatting
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* Apply suggestions from code review
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
* replace generic with gpt2 tokenizer
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>