Commits · c75969ee28eb293388bd587669e414c5a4f8d79f · chenpangpang / transformers

18 Jul, 2024 1 commit

Add torch.compile Support For Mamba (#31247) · c75969ee

Longjie Zheng authored Jul 18, 2024

* modify mamba cache

* set up cache

* add test

* [run-slow] mamba

* [run-slow] mamba

* address comments

* [run-slow] mamba

* use_cache_position

* [run-slow] mamba

* [run-slow] mamba

* [run-slow] mamba

* [run-slow] mamba

* fix

* cache in generate

* [run-slow] mamba

* address comments

* [run-slow] mamba

* [run-slow] mamba

* address comments

* [run-slow] mamba

* fix

* [run-slow] mamba

* fix

* [run-slow] mamba

* fix cache name

* [run-slow] mamba

c75969ee

10 Jul, 2024 1 commit
- fix: Removed `duplicate` field definitions in some classes (#31888) · da79b180
  Sai-Suraj-27 authored Jul 10, 2024
```
Removed duplicate field definitions in classes.
```
  da79b180
19 Jun, 2024 1 commit
- Mamba: add generative tests (#31478) · 83259e40
  Joao Gante authored Jun 19, 2024
  
  83259e40
06 Jun, 2024 1 commit
- Make mamba use cache (#31116) · 7729b774
  Raushan Turganbay authored Jun 06, 2024
```
* make mamba use cache

* uss cache naming as in mamba

* fix musicgen
```
  7729b774
27 Mar, 2024 1 commit

Mamba `slow_forward` gradient fix (#29563) · cefb819f

Anton Vlasjuk authored Mar 27, 2024

* FIX: Cached slow forward in mamba
- additionally added mamba cached test
- added unused test (mamba causal lm forward and backward)
- fixed typo: "causl" --> "causal"

* formatting

* fix: use real `slow_forward` call instead of torch module's

* add shape assertion for mixer block test

* adjust shape assertion

cefb819f

11 Mar, 2024 1 commit
- [`Mamba doc`] Post merge updates (#29472) · 4f27ee93
  Arthur authored Mar 11, 2024
```
* post merge update

* nit

* oups
```
  4f27ee93
05 Mar, 2024 1 commit

[`Add Mamba`] Adds support for the `Mamba` models (#28094) · fb1c62e9

Arthur authored Mar 05, 2024



* initial-commit

* start cleaning

* small nits

* small nits

* current updates

* add kernels

* small refactoring little step

* add comments

* styling

* nit

* nits

* Style

* Small changes

* Push dummy mambda simple slow

* nit

* Use original names

* Use original names and remove norm

* Updates for inference params

* Style nd updates

* nits

* Match logits

* Add a test

* Add expected generated text

* nits doc, imports and styling

* style

* oups

* dont install kernels, invite users to install the required kernels

* let use use the original packages

* styling

* nits

* fix some copieds

* update doc

* fix-copies

* styling done

* nits

* fix import check

* run but wrong cuda ress

* mamba CUDA works :)

* fix the fast path

* config naming nits

* conversion script is not required at this stage

* finish fixing the fast path: generation make sense now!

* nit

* Let's start working on the CIs

* style

* better style

* more nits

* test nit

* quick fix for now

* nits

* nit

* nit

* nit

* nits

* update test rest

* fixup

* update test

* nit

* some fixes

* nits

* update test values

* fix styling

* nit

* support peft

* integrations tests require torchg

* also add slow markers

* styling

* chose forward wisely

* nits

* update tests

* fix gradient checkpointing

* fixup

* nit

* fix doc

* check copies

* fix the docstring

* fix some more tests

* style

* fix beam search

* add init schene

* update

* nit

* fix

* fixup the doc

* fix the doc

* fixup

* tentative update but slow is no longer good

* nit

* should we always use float32?

* nits

* revert wrong changes

* res in float32

* cleanup

* skip fmt for now

* update generation values

* update test values running original model

* fixup

* update tests + rename inference_params to cache_params + make sure training does not use cache_params

* small nits

* more nits

* fix final CIs

* style

* nit doc

* I hope final doc nits

* nit

* 🫠

* final touch!

* fix torch import

* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>

* Apply suggestions from code review

* fix fix and fix

* fix base model prefix!

* nit

* Update src/transformers/models/mamba/__init__.py

* Update docs/source/en/model_doc/mamba.md
Co-authored-by: Lysandre Debut <hi@lysand.re>

* nit

---------
Co-authored-by: Lysandre Debut <hi@lysand.re>

fb1c62e9