  1. 06 Aug, 2024 1 commit
  2. 22 May, 2024 1 commit
  3. 20 May, 2024 1 commit
  4. 08 Apr, 2024 1 commit
    • [`StableLm`] Add QK normalization and Parallel Residual Support (#29745) · 2f12e408
      Jonathan Tow authored
      * init: add StableLm 2 support
      
      * add integration test for parallel residual and qk layernorm
      
      * update(modeling): match qk norm naming for consistency with phi/persimmon
      
      * fix(tests): run fwd/bwd on random init test model to jitter norm weights off identity
      
      * `use_parallel_residual`: add copy pointer to `GPTNeoXLayer.forward`
      
      * refactor: rename head states var in `StableLmLayerNormPerHead`
      
      * tests: update test model and add generate check
  5. 29 Mar, 2024 1 commit
  6. 28 Mar, 2024 1 commit
  7. 21 Feb, 2024 1 commit
  8. 14 Feb, 2024 1 commit
    • Add `StableLM` (#28810) · de6029a0
      Jonathan Tow authored
      * Add `StableLM`
      
      * fix(model): re-create from `huggingface-cli add-new-model-like persimmon`
      
      * fix: re-add changes to address comments
      
      * fix(readme): add links to paper
      
      * fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref
      
      * fix(tests): re-add `@slow` decorator to integration tests
      
      * fix(tests): import slow...
      
      * fix(readme_hd): remove whitespace edit
      
      * fix(tokenizer): auto tokenizer tuple
      
      * skip doctests for `modeling_stablelm`