- 17 Apr, 2024 1 commit
-
-
Shane A authored
* Add OLMo using add-new-model-like with Llama
* Fix incorrect tokenizer for OLMo
* Copy-paste relevant OLMo methods and their imports
* Add OLMo config
* Modify OLMo config to follow HF conventions
* Remove unneeded Llama code from OLMo model
* Add ability for OLMo model to output attentions
* Add OLMoPreTrainedModel and OLMoModel
* Add OLMoForCausalLM
* Minor fixes to OLMo model for style and missing functions
* Implement OLMo tokenizer
* Implement OLMo to HF conversion script
* Add tests for OLMo model
* Add tests for OLMo fast tokenizer
* Add auto-generated dummy objects
* Remove unimplemented OLMo classes from auto and init classes and re-format
* Add README and associated auto-generated files
* Use OLMo names for common properties
* Run make fixup
* Remove `|` from OLMo typing
* Remove unneeded tokenization_olmo.py
* Revert model, config and converter to add-new-model-like Llama
* Move logic for adding bos/eos token into GPTNeoXTokenizerFast
* Change OLMoConfig defaults to match OLMo-7B
* Use GPTNeoXTokenizerFast in OLMo tokenizer tests
* Modify auto-generated OLMoModelTests to work for OLMo
* Add non-parametric layer norm OLMoLayerNorm
* Update weight conversion script for OLMo
* Fix __init__ and auto structure for OLMo
* Fix errors from make fixup
* Remove OLMoTokenizerFast from documentation
* Add missing 'Copied from' for OLMoModel._update_causal_mask
* Run make fix-copies
* Rearrange string replacements in OLMoForCausalLM Copied from
* Move OLMo and Llama CausalLM.forward example into global constants
* Fix OLMO_GENERATION_EXAMPLE doc string typo
* Add option for qkv clipping to OLMo
* Rearrange OLMoConfig kwargs in convert_olmo_weights_to_hf
* Add clip_qkv to OLMoConfig in convert_olmo_weights_to_hf
* Fix OLMo tokenization bug using conversion script
* Keep model in full precision after conversion
* Do not add eos token automatically
* Update references to OLMo model in HF Hub
* Do not add eos token during encoding by default
* Fix Llama generation example
* Run make fixup
* OLMo 7B integration test fix
* Remove unneeded special case for OLMoConfig
* OLMo 7B Twin 2T integration test fix
* Fix test_model_7b_greedy_generation
* Remove test_compile_static_cache
* Fix OLMo and Llama generation example
* Run make fixup
* Revert "OLMo 7B integration test fix" This reverts commit 4df56a4b150681bfa559846f40e9b7b7f97d7908.
* Revert "OLMo 7B Twin 2T integration test fix" This reverts commit 9ff65a4a294ace89ab047b793ca55e623a9ceefc.
* Ungate 7B integration tests and fix greedy generation test
* Add retries for flaky test_eager_matches_sdpa_generate
* Fix output of doc example for OLMoForCausalLM.forward
* Downsize OLMo doc test for OLMoForCausalLM.forward to 1B model
* Try fix incorrect characters in OLMoForCausalLM.forward doc test
* Try fix incorrect characters in OLMoForCausalLM.forward doc test using end quotes
* Remove pretraining_tp from OLMo config and model
* Add missing 'Copied from' instances
* Remove unneeded causal_mask from OLMoModel
* Revert Llama changes
* Ignore copy for OLMoForCausalLM.forward
* Change 'OLMo' to 'Olmo' in classes
* Move minimal OLMo tokenization tests to model tests
* Add missed 'Copied from' for repeat_kv
-
- 11 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
* Reorganize example folder
* Continue reorganization
* Change requirements for tests
* Final cleanup
* Finish regroup with tests all passing
* Copyright
* Requirements and readme
* Make a full link for the documentation
* Address review comments
* Apply suggestions from code review Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Add symlink
* Reorg again
* Apply suggestions from code review Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
* Adapt title
* Update to new structure
* Remove test
* Update READMEs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
-
- 20 Jun, 2020 1 commit
-
-
Kevin Canwen Xu authored
* Add BERT Loses Patience (Patience-based Early Exit)
* update model archive
* update format
* sort import
* flake8
* Add results
* full results
* align the table
* refactor to inherit
* default per gpu eval = 1
* Formatting
* Formatting
* isort
* modify readme
* Add check
* Fix format
* Fix format
* Doc strings
* ALBERT & BERT for sequence classification don't inherit from the original anymore
* Remove incorrect comments
* Remove incorrect comments
* Remove incorrect comments
* Sync up with new code
* Sync up with new code
* Add a test
* Add a test
* Add a test
* Add a test
* Add a test
* Add a test
* Finishing up!
-
- 03 Mar, 2020 1 commit
-
-
Sam Shleifer authored
* Rename and improve example
* Add test
* slightly faster test
* style
* This breaks remy prolly
* shorter test string
* no slow
* newdir structure
* New tree
* Style
* shorter
* docs
* clean
* Attempt future import
* more import hax
-
- 06 Jan, 2020 2 commits
-
-
alberduris authored
-
alberduris authored
-
- 22 Dec, 2019 1 commit
-
-
Aymeric Augustin authored
-
- 26 Sep, 2019 1 commit
-
-
thomwolf authored
-
- 05 Jul, 2019 1 commit
-
-
thomwolf authored
-
- 02 Jul, 2019 1 commit
-
-
thomwolf authored
-