- 21 Feb, 2024 1 commit
-
-
Ekaterina Aidova authored
* support SDPA Attention in stablelm * add integration test * add fallback for output_attentions * Update src/transformers/models/stablelm/modeling_stablelm.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/stablelm/test_modeling_stablelm.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/stablelm/modeling_stablelm.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * handle non-contiguous states --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
- 14 Feb, 2024 1 commit
-
-
Jonathan Tow authored
* Add `StableLM` * fix(model): re-create from `huggingface-cli add-new-model-like persimmon` * fix: re-add changes to address comments * fix(readme): add links to paper * fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref * fix(tests): re-add `@slow` decorator to integration tests * fix(tests): import slow... * fix(readme_hd): remove whitespace edit * fix(tokenizer): auto tokenizer tuple * skip doctests for `modeling_stablelm`
-
- 07 Dec, 2023 2 commits
-
-
fxmarty authored
fix device of mask in tests
-
Alex McKinney authored
* Copies `modeling_flax_gpt_neo.py` to start * MLP Block. WIP Attention and Block * Adds Flax implementation of `LlamaMLP` Validated with in-file test. Some slight numeric differences, but assuming it isn't an issue * Adds `FlaxLlamaRMSNorm` layer `flax.linen` includes `RMSNorm` layer but not necessarily in all versions. Hence, we add in-file. * Adds FlaxLlamaAttention Copied from GPT-J as it has efficient caching implementation as well as rotary embeddings. Notice numerically different, but not by a huge amount. Needs investigating * Adds `FlaxLlamaDecoderLayer` numerically inaccurate, debugging.. * debugging rotary mismatch gptj uses interleaved whilst llama uses contiguous i think they match now but still final result is wrong. maybe drop back to just debugging attention layer? * fixes bug with decoder layer still somewhat numerically inaccurate, but close enough for now * adds markers for what to implement next the structure here diverges a lot from the PT version. not a big fan of it, but just get something working for now * implements `FlaxLlamaBlockCollection`] tolerance must be higher than expected, kinda disconcerting * Adds `FlaxLlamaModule` equivalent PyTorch model is `LlamaModel` yay! a language model
馃 * adds `FlaxLlamaForCausalLMModule` equivalent to `LlamaForCausalLM` still missing returning dict or tuple, will add later * start porting pretrained wrappers realised it probably needs return dict as a prereq * cleanup, quality, style * readds `return_dict` and model output named tuples * (tentatively) pretrained wrappers work馃敟 * fixes numerical mismatch in `FlaxLlamaRMSNorm` seems `jax.lax.rsqrt` does not match `torch.sqrt`. manually computing `1 / jax.numpy.sqrt` results in matching values. * [WIP] debugging numerics * numerical match I think issue was accidental change of backend. forcing CPU fixes test. We expect some mismatch on GPU. * adds in model and integration tests for Flax Llama summary of failing: - mul invalid combination of dimensions - one numerical mismatch - bf16 conversion (maybe my local backend issue) - params are not FrozenDict * adds missing TYPE_CHECKING import and `make fixup` * adds back missing docstrings needs review on quality of docstrings, not sure what is required. Furthermore, need to check if `CHECKPOINT_FOR_DOC` is valid. See TODO * commenting out equivalence test as can just use common * debugging * Fixes bug where mask and pos_ids were swapped in pretrained models This results in all tests passing now馃敟 * cleanup of modeling file * cleanup of test file * Resolving simpler review comments * addresses more minor review comments * fixing introduced pytest errors from review * wip additional slow tests * wip tests need to grab a GPU machine to get real logits for comparison otherwise, slow tests should be okay * `make quality`, `make style` * adds slow integration tests - checking logits - checking hidden states - checking generation outputs * `make fix-copies` * fix mangled function following `make fix-copies` * adds missing type checking imports * fixes missing parameter checkpoint warning * more finegrained 'Copied from' tags avoids issue of overwriting `LLAMA_INPUTS_DOCSTRING` * swaps import guards ??? how did these get swapped initially? * removing `inv_freq` again as pytorch version has now removed * attempting to get CI to pass * adds doc entries for llama flax models * fixes typo in __init__.py imports * adds back special equivalence tests these come from the gpt neo flax tests. there is special behaviour for these models that needs to override the common version * overrides tests with dummy to see if CI passes need to fill in these tests later * adds my contribution to docs * `make style; make quality` * replaces random masking with fixed to work with flax version * `make quality; make style` * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * updates `x`->`tensor` in `rotate_half` * addresses smaller review comments * Update docs/source/en/model_doc/llama.md Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * adds integration test class * adds `dtype` to rotary embedding to cast outputs * adds type to flax llama rotary layer * `make style` * `make fix-copies` * Apply suggestions from code review Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * applies suggestions from review * Update modeling_flax_llama.py * `make fix-copies` * Update tests/models/llama/test_modeling_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fixes shape mismatch in FlaxLlamaMLP * applies some suggestions from reviews * casts attn output logits to f32 regardless of dtype * adds attn bias using `LlamaConfig.attention_bias` * adds Copied From comments to Flax Llama test * mistral and persimmon test change -copy from llama * updates docs index * removes Copied from in tests it was preventing `make fix-copies` from succeeding * quality and style * ignores FlaxLlama input docstring * adds revision to `_CHECKPOINT_FOR_DOC` * repo consistency and quality * removes unused import * removes copied from from Phi test now diverges from llama tests following FlaxLlama changes * adds `_REAL_CHECKPOINT_FOR_DOC` * removes refs from pr tests * reformat to make ruff happy --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
- 31 Oct, 2023 1 commit
-
-
Hz, Ji authored
* device agnostic models testing * add decorator `require_torch_fp16` * make style * apply review suggestion * Oops, the fp16 decorator was misused
-
- 12 Oct, 2023 1 commit
-
-
Yih-Dar authored
* fix --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 11 Oct, 2023 1 commit
-
-
Yih-Dar authored
* copied statement for test files --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 02 Oct, 2023 1 commit
-
-
Arthur authored
* fix wav2vec2 * nit * stash * one more file to update * fix byt5 * vocab size is 256, don't change that! * use other revision * test persimon in smaller size * style * tests * nits * update add tokens from pretrained * test tokenization * nits * potential fnet fix? * more nits * nits * correct test * assert close * udpate * ouch * fix it * some more nits * FINALLU * use `adept` checkpoints * more adept checkpoints * that was invlved!
-
- 29 Sep, 2023 1 commit
-
-
Yih-Dar authored
skip Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 12 Sep, 2023 1 commit
-
-
Arthur authored
* intiial commit * updates * nits * update conversion script * update conversion script * use path to load * add tips etc * some modeling logic * modeling update * more nits * nits * normal layer norm * update config and doc * nits * update doc remove unused * update * fix inits and stuff * fixup * revert wrong changes * updates * more nits * add default config values to the configuration file * fixup happy * update * 2 tests left * update readmes * more nits * slow test and more documentation * update readme * fix licences * styling * use fast if possible when saving tokenizer * remove todo * remove tokenization tests * small last nits * Apply suggestions from code review Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com> * nits to skip the timout doctest * fix integration test * fix test * update eos token * update to allow fast tokenization * styling * fix codeLlama as well for the update post processor * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add more copied from statements * update * doc passes doctest * remove `# final layer norm?` * change docstring prompot * update * Update README.md Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * don't doctest the conversion script as it requires more packages * don't init a model in the config * oups * fix doctest --------- Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 25 Aug, 2023 1 commit
-
-
Arthur authored
* add all * Revert "Delete .github directory" This reverts commit 9b0ff7b052e2b20b629a26fb13606b78a42944d1. * make conversion script backward compatible * fixup * more styling * copy to llama changes * fix repo consistency * nits * document correct classes * updates * more fixes * nits * update auto mappings * add readmes * smallupdates * llama-code replace with llama_code * make fixup * updates to the testsing suite * fix fast nits * more small fixes * fix decode * fix template processing * properly reset the normalizer * nits processor * tokenization tests pass * styling * last tests * additional nits * one test is left * nits Co-authored-by faabian <faabian@users.noreply.github.com> * update failing test * fixup * remove decode infilling users should handle it on their onw after generation, padding can be a problem * update * make test slow and more meaningfull * fixup * doc update * fixup * Apply suggestions from code review * add kwargs doc * tokenizer requires `requires_backend` * type requires_backends * CodeLlama instead of LlamaCode * more name cahnges * nits * make doctests happy * small pipeline nits * last nit * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * update * add codellama to toctree --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 02 Aug, 2023 1 commit
-
-
Yih-Dar authored
* CI with layers=2 --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 21 Jul, 2023 1 commit
-
-
Arthur authored
* pad token should be None by default * fix tests * nits
-
- 18 Jul, 2023 1 commit
-
-
Arthur authored
* add llama * add other readmes * update padding id in readme * add link to paper * fix paths and tokenizer * more nits * styling * fit operation in 2 lines when possible * nits * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add form * update reademe * update readme, we don't have a default pad token * update test and tokenization * LLaMA instead of Llama * nits * add expected text * add greeedy output * styling * Update src/transformers/models/llama/modeling_llama.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * sequential device map * skip relevant changes --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 13 Jul, 2023 1 commit
-
-
Joao Gante authored
* add rope_scaling * tmp commit * add gptneox * add tests * GPTNeoX can now handle long inputs, so the pipeline test was wrong * Update src/transformers/models/open_llama/configuration_open_llama.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove ntk * remove redundant validation --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 22 Mar, 2023 1 commit
-
-
Joao Gante authored
* tmp commit * beef up llama tests
-
- 17 Mar, 2023 1 commit
-
-
lewtun authored
* Add LlamaForSequenceClassification * Update src/transformers/models/llama/modeling_llama.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Add docstring * Add test * Add input embedding getter and setter * Remove dead code --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
- 16 Mar, 2023 1 commit
-
-
Jason Phang authored
* LLaMA * sharding and docs * tweak * black * inits * ruff * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP * init * no checkpoint * docs * ruff * type_vocab_size * tokenizer fixes * tokenizer fixes * Update tokenization_llama.py * Update tokenization_llama.py * Update configuration_llama.py * Update modeling_llama.py * tokenizer add_bos by default * licenses * remove decoder * norms and mlp * rope overhaul * tweaks * black * mention OPT implementation * off-by-one naming * typo * fix * tokenization fix and slicing bug * padding config * cleanup * black * update tests * undo typo * fix vocab caching logic * ruff * docbuilder * attn fix from BlackSamorez * initial feedback * typo * docs * llama case * llama case * load checkpoint docs * comment about tokenizer * tokenizer defaults * clear past_key_values if use_cache=False * last tweaks * last tweaks * last tweaks * last tweaks --------- Co-authored-by:Stella Biderman <stellabiderman@gmail.com>
-
- 28 Feb, 2023 1 commit
-
-
Yih-Dar authored
* Add PipelineTesterMixin * remove class PipelineTestCaseMeta * move validate_test_components * Add for ViT * Add to SPECIAL_MODULE_TO_TEST_MAP * style and quality * Add feature-extraction * update * raise instead of skip * add tiny_model_summary.json * more explicit * skip tasks not in mapping * add availability check * Add Copyright * A way to diable irrelevant tests * update with main * remove disable_irrelevant_tests * skip tests * better skip message * better skip message * Add all pipeline task tests * revert * Import PipelineTesterMixin * subclass test classes with PipelineTesterMixin * Add pipieline_model_mapping * Fix import after adding pipieline_model_mapping * Fix style and quality after adding pipieline_model_mapping * Fix one more import after adding pipieline_model_mapping * Fix style and quality after adding pipieline_model_mapping * Fix test issues * Fix import requirements * Fix mapping for MobileViTModelTest * Update * Better skip message * pipieline_model_mapping could not be None * Remove some PipelineTesterMixin * Fix typo * revert tests_fetcher.py * update * rename * revert * Remove PipelineTestCaseMeta from ZeroShotAudioClassificationPipelineTests * style and quality * test fetcher for all pipeline/model tests --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 06 Feb, 2023 1 commit
-
-
Sylvain Gugger authored
* Result of black 23.1 * Update target to Python 3.7 * Switch flake8 to ruff * Configure isort * Configure isort * Apply isort with line limit * Put the right black version * adapt black in check copies * Fix copies
-
- 09 Nov, 2022 1 commit
-
-
Joao Gante authored
* move generation_*.py src files into generation/*.py * populate generation.__init__ with lazy loading * move imports and references from generation.xxx.object to generation.object
-
- 03 May, 2022 1 commit
-
-
Yih-Dar authored
* move test model folders (TODO: fix imports and others) * fix (potentially partially) imports (in model test modules) * fix (potentially partially) imports (in tokenization test modules) * fix (potentially partially) imports (in feature extraction test modules) * fix import utils.test_modeling_tf_core * fix path ../fixtures/ * fix imports about generation.test_generation_flax_utils * fix more imports * fix fixture path * fix get_test_dir * update module_to_test_file * fix get_tests_dir from wrong transformers.utils * update config.yml (CircleCI) * fix style * remove missing imports * update new model script * update check_repo * update SPECIAL_MODULE_TO_TEST_MAP * fix style * add __init__ * update self-scheduled * fix add_new_model scripts * check one way to get location back * python setup.py build install * fix import in test auto * update self-scheduled.yml * update slack notification script * Add comments about artifact names * fix for yolos Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 23 Feb, 2022 1 commit
-
-
Lysandre Debut authored
* Per-folder tests reorganization Co-authored-by:
sgugger <sylvain.gugger@gmail.com> Co-authored-by:
Stas Bekman <stas@stason.org>
-
- 06 Jan, 2022 1 commit
-
-
Tavin Turner authored
-
- 21 Jul, 2021 1 commit
-
-
Lysandre Debut authored
* Expose get_config() on ModelTesters * Typo
-
- 13 Apr, 2021 1 commit
-
-
Sylvain Gugger authored
* Replace error by warning when loading an architecture in another * Style * Style again * Add a test * Adapt old test
-
- 08 Feb, 2021 1 commit
-
-
demSd authored
* claiming this issue * Integration test for BertGeneration(Encoder and Decoder) * fix code quality
-
- 23 Dec, 2020 1 commit
-
-
Suraj Patil authored
* add past_key_values * add use_cache option * make mask before cutting ids * adjust position_ids according to past_key_values * flatten past_key_values * fix positional embeds * fix _reorder_cache * set use_cache to false when not decoder, fix attention mask init * add test for caching * add past_key_values for Roberta * fix position embeds * add caching test for roberta * add doc * make style * doc, fix attention mask, test * small fixes * adress patrick's comments * input_ids shouldn't start with pad token * use_cache only when decoder * make consistent with bert * make copies consistent * add use_cache to encoder * add past_key_values to tapas attention * apply suggestions from code review * make coppies consistent * add attn mask in tests * remove copied from longformer * apply suggestions from code review * fix bart test * nit * simplify model outputs * fix doc * fix output ordering
-
- 07 Dec, 2020 1 commit
-
-
Sylvain Gugger authored
* Add copyright everywhere missing * Style
-
- 16 Nov, 2020 1 commit
-
-
Sylvain Gugger authored
* Use the CI to identify failing tests * Remove from all examples and tests * More default switch * Fixes * More test fixes * More fixes * Last fixes hopefully * Use the CI to identify failing tests * Remove from all examples and tests * More default switch * Fixes * More test fixes * More fixes * Last fixes hopefully * Run on the real suite * Fix slow tests
-
- 03 Nov, 2020 1 commit
-
-
Patrick von Platen authored
* first draft * show design proposition for new generate method * up * make better readable * make first version * gpt2 tests pass * make beam search for gpt2 work * add first encoder-decoder code * delete typo * make t5 work * save indermediate * make bart work with beam search * finish beam search bart / t5 * add default kwargs * make more tests pass * fix no bad words sampler * some fixes and tests for all distribution processors * fix test * fix rag slow tests * merge to master * add nograd to generate * make all slow tests pass * speed up generate * fix edge case bug * small fix * correct typo * add type hints and docstrings * fix typos in tests * add beam search tests * add tests for beam scorer * fix test rag * finish beam search tests * move generation tests in seperate file * fix generation tests * more tests * add aggressive generation tests * fix tests * add gpt2 sample test * add more docstring * add more docs * finish doc strings * apply some more of sylvains and sams comments * fix some typos * make fix copies * apply lysandres and sylvains comments * final corrections on examples * small fix for reformer
-
- 10 Sep, 2020 1 commit
-
-
Patrick von Platen authored
* add conversion script * improve conversion script * make style * add tryout files * fix * update * add causal bert * better names * add tokenizer file as well * finish causal_bert * fix small bugs * improve generate * change naming * renaming * renaming * renaming * remove leftover files * clean files * add fix tokenizer * finalize * correct slow test * update docs * small fixes * fix link * adapt check repo * apply sams and sylvains recommendations * fix import * implement Lysandres recommendations * fix logger warn
-