Commits · 40e534a7f6cc751a7b111df37b04b885d2cef402 · gaoqiong / flash-attention

11 Jul, 2024 3 commits
- Implement cache_leftpad · 40e534a7
  Tri Dao authored Jul 11, 2024
  
  40e534a7
- [CI] Compile with pytorch 2.4.0.dev20240514 · 116b05f9
  Tri Dao authored Jul 11, 2024
  
  116b05f9
- Bump v2.6.0 · da11d1b8
  Tri Dao authored Jul 10, 2024
  
  da11d1b8
10 Jul, 2024 8 commits
- Relax dropout_fraction test · d0787acc
  Tri Dao authored Jul 10, 2024
  
  d0787acc
- Don't support softcap and dropout at the same time · dca6d89d
  Tri Dao authored Jul 10, 2024
```
These tests are failing so I'm just disabling this case for now
```
  dca6d89d
- More typo fixes · 81e01efd
  Tri Dao authored Jul 10, 2024
  
  81e01efd
- Fix typo with softcapping · 72e27c63
  Tri Dao authored Jul 10, 2024
  
  72e27c63
- Only test backward if there's no softcapping · 3d41db3e
  Tri Dao authored Jul 10, 2024
  
  3d41db3e
- Split into more .cu files to speed up compilation · 908511b2
  Tri Dao authored Jul 10, 2024
  
  908511b2
- Minor cleanup of softcapping · 1d536d7d
  Tri Dao authored Jul 09, 2024
  
  1d536d7d
- Drop support for pytorch 1.12, 1.13, and python 3.7 · beb2bf2a
  Tri Dao authored Jul 09, 2024
  
  beb2bf2a
09 Jul, 2024 1 commit
- missing commas and backwards return arguments (#1032) · f4628b43
  Phil Wang authored Jul 09, 2024
```
* missing commas

* another fix
```
  f4628b43
08 Jul, 2024 2 commits

Implement softcapping. (#1025) · 8f873cc6
Nicolas Patry authored Jul 08, 2024
```
* Softcap v2 (fwd only).

* Some missing interface + remove overrides in tests.
```
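For readers unfamiliar with the term, softcapping bounds attention scores smoothly with a tanh before the softmax. The sketch below is a plain-Python illustration, not the CUDA kernel from this commit; the helper name `softcap_scores` is hypothetical, and treating a cap of 0 as "disabled" is an assumption made here for the example.

```python
import math

def softcap_scores(scores, softcap):
    """Squash each score into (-softcap, softcap) via softcap * tanh(score / softcap).

    Assumption for this sketch: softcap <= 0 means softcapping is disabled.
    """
    if softcap <= 0.0:
        return list(scores)
    return [softcap * math.tanh(s / softcap) for s in scores]

# Large scores saturate near the cap; small scores pass through almost unchanged.
capped = softcap_scores([-100.0, 0.0, 0.5, 100.0], softcap=30.0)
```

Because tanh is close to the identity near zero, the cap mostly affects outlier scores.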
8f873cc6

Add the return_softmax_lse parameter to the flash_attn_with_kvcache function... · 4e8d6006

Jianwei Dong authored Jul 08, 2024

Add the return_softmax_lse parameter to the flash_attn_with_kvcache function to allow returning the logsumexp of the attention scores. (#989)
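As a reference for what "the logsumexp of the attention scores" means, here is a minimal plain-Python sketch for a single query row. It does not call `flash_attn_with_kvcache`; the helper name `attention_row_lse` and the toy inputs are assumptions for illustration only.

```python
import math

def attention_row_lse(q, keys, scale):
    """Return log(sum_j exp(scale * <q, k_j>)) for one query vector."""
    scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
    m = max(scores)  # subtract the max before exponentiating, for numerical stability
    return m + math.log(sum(math.exp(s - m) for s in scores))

q = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
lse = attention_row_lse(q, keys, scale=1.0)

# The softmax probabilities can be recovered from the raw scores and the LSE.
probs = [math.exp(s - lse) for s in (1.0, 0.0)]
```

Returning the LSE alongside the output is useful when partial attention results over different key ranges need to be combined afterwards.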

4e8d6006

03 Jul, 2024 1 commit
- Fix the varlen deterministic test (#1023) · 6df7e0a0
  muoshuosha authored Jul 04, 2024
```
Co-authored-by: moshuosha <moshuosha@qq.com>
```
  6df7e0a0
01 Jul, 2024 5 commits

Fix typos of comments about shape. (#837) · 9486635c
66RING authored Jul 01, 2024

9486635c

Fix KeyError handling for non-existing key in state_dict.pop() (#898) · 0d810cfb

JDKWangGuan authored Jun 30, 2024

Update handling for KeyError in state_dict.pop() for non-existing keys.
Changed state_dict.pop(f"h.{d}.attn.bias") to state_dict.pop(f"h.{d}.attn.bias", None) to prevent KeyError exceptions.


The following code reproduces the issue:
```
from transformers import AutoTokenizer, GPT2Model, GPT2Config
from flash_attn.models.gpt import GPTLMHeadModel, GPTModel

# >>> transformers.__version__
# '4.38.2'

model_path = 'gpt2'
output_model_path = 'gpt2_model'
config = GPT2Config.from_pretrained(model_path, output_hidden_states=True)
model = GPT2Model.from_pretrained(model_path, from_tf=False, config=config)
'''
model fine-tuning here
'''
# dump the fine-tuned model
model.save_pretrained(output_model_path)

# load the fine-tuned model
config = GPT2Config.from_pretrained(output_model_path, output_hidden_states=True)
model = GPTModel.from_pretrained(output_model_path, config=config, strict=True)  # failed due to KeyError: 'h.0.attn.bias'
model = GPTLMHeadModel.from_pretrained(output_model_path, config=config, strict=True)  # failed due to KeyError: 'h.0.attn.bias'

```
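The fix described in this commit relies on the default argument of `dict.pop`. A minimal sketch with a toy dictionary (the keys and values are made up for illustration; a real state dict holds tensors):

```python
# Toy state dict: only layer 0 carries the "attn.bias" buffer.
state_dict = {"h.0.attn.c_attn.weight": "W0", "h.0.attn.bias": "mask0",
              "h.1.attn.c_attn.weight": "W1"}
n_layer = 2

# Without a default, popping a missing key raises KeyError.
try:
    dict(state_dict).pop("h.1.attn.bias")
    raised = False
except KeyError:
    raised = True

# With a default of None, missing keys are skipped silently.
for d in range(n_layer):
    state_dict.pop(f"h.{d}.attn.bias", None)
```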

0d810cfb

fix typo (#974) · 6a2a16e9
cao lei authored Jun 30, 2024

6a2a16e9

Fixing argument checking when using `seqlenq_ngroups_swapped`. (#976) · 5bf20196

Nicolas Patry authored Jul 01, 2024

When the user passes `out` as a parameter of the function, together with
parameters that trigger `seqlenq_ngroups_swapped`,
the CHECK_SHAPE is incorrect (since the shape of q is modified).

5bf20196

remove swizzle part of `sV.data()` to get a completely non-swizzle `sVtNoSwizzle` (#984) · ab59ec35
Liang authored Jul 01, 2024
```
Co-authored-by: zl <zl@deepseek.com>
```
ab59ec35

27 Jun, 2024 1 commit

Support unpadded LSE layout (#970) · f816dee6

Grigory Sizov authored Jun 27, 2024



* Support unpadded LSE layout.

* Cleanup

* Fix unpadded LSE on split-kv path

* Fix formatting and comments

* Fix inline vs forceinline

---------
Co-authored-by: Xinfeng Xie <xfxie.ceca@gmail.com>
Co-authored-by: Jianyu Huang <hjyahead@gmail.com>
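To illustrate the idea of an unpadded LSE layout, the sketch below converts a padded array into a packed one using cumulative sequence lengths. The shapes are assumptions for this example: padded as `[batch][nheads][max_seqlen]` and unpadded as `[nheads][total_q]`. The helper `unpad_lse` is hypothetical and uses nested lists rather than tensors.

```python
def unpad_lse(padded_lse, cu_seqlens):
    """Pack padded_lse[b][h][s] into unpadded[h][t], dropping padding slots.

    cu_seqlens holds cumulative sequence lengths, so sequence b occupies
    positions cu_seqlens[b] .. cu_seqlens[b + 1] - 1 in the packed layout.
    """
    batch = len(cu_seqlens) - 1
    nheads = len(padded_lse[0])
    unpadded = [[] for _ in range(nheads)]
    for b in range(batch):
        seqlen = cu_seqlens[b + 1] - cu_seqlens[b]
        for h in range(nheads):
            unpadded[h].extend(padded_lse[b][h][:seqlen])
    return unpadded

PAD = float("inf")          # filler value for padding slots in this toy example
cu_seqlens = [0, 2, 5]      # two sequences, of lengths 2 and 3
padded = [
    [[0.1, 0.2, PAD]],      # batch 0, one head, one padding slot
    [[0.3, 0.4, 0.5]],      # batch 1, one head, no padding
]
unpadded = unpad_lse(padded, cu_seqlens)
```

The packed layout stores exactly `total_q` values per head, so no memory is spent on padding when sequence lengths vary widely.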

f816dee6

26 May, 2024 7 commits
- Update citation · 320fb594
  Tri Dao authored May 26, 2024
  
  320fb594
- Limit to MAX_JOBS=1 with CUDA 12.2 · e2e4333c
  Tri Dao authored May 26, 2024
  
  e2e4333c
- Bump to 2.5.9 · ce735035
  Tri Dao authored May 26, 2024
  
  ce735035
- Update to Cutlass 3.5 · d732be1e
  Tri Dao authored May 26, 2024
  
  d732be1e
- [CI] Compile for pytorch 2.4.0.dev20240407 (for nvcr 24.05) · af627063
  Tri Dao authored May 26, 2024
  
  af627063
- Update for python3.12 (#870) · 40e66723
  Wongboo authored May 27, 2024
  
  40e66723
- add exception to Timeout Error (#963) · beb8b8ba
  Corey James Levinson authored May 26, 2024
```
When the connection times out, you get URLError: <urlopen error timed out>. In that case, build it from source.
```
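The fallback described above can be sketched as follows. This is not the project's actual `setup.py`; `install_wheel_or_build` and the two callables are hypothetical stand-ins for "download a prebuilt wheel" and "compile from source".

```python
import urllib.error

def install_wheel_or_build(download, build_from_source):
    """Try the download first; fall back to a source build on network errors."""
    try:
        return download()
    except urllib.error.URLError:
        # HTTPError is a subclass of URLError, so HTTP failures and
        # connection timeouts both end up here.
        return build_from_source()

def failing_download():
    # Simulates the timeout case from the commit message.
    raise urllib.error.URLError("timed out")

result = install_wheel_or_build(failing_download, lambda: "built from source")
```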
  beb8b8ba
23 May, 2024 1 commit
- remove an unused import (#960) · 22339db1
  lancerts authored May 23, 2024
  
  22339db1
06 May, 2024 1 commit
- Move packaging and ninja from install_requires to setup_requires (#937) · 9c0e9ee8
  Wei Ji authored May 07, 2024
```
Set `packaging` and `ninja` as build time dependencies rather than runtime dependencies.
```
  9c0e9ee8
26 Apr, 2024 3 commits
- Bump to v2.5.8 · 9a11f440
  Tri Dao authored Apr 26, 2024
  
  9a11f440
- [CI] Compile for pytorch 2.2.2 and 2.3.0 · 35060e74
  Tri Dao authored Apr 26, 2024
  
  35060e74
- [CrossEntropy] Change ignored_index -> ignore_index · ec6d2214
  Tri Dao authored Apr 26, 2024
  
  ec6d2214
08 Apr, 2024 4 commits
- Bump to v2.5.7 · 85881f54
  Tri Dao authored Apr 07, 2024
  
  85881f54
- [CI] Compile with torch 2.3.0.dev20240207 · 2aea958f
  Tri Dao authored Apr 07, 2024
  
  2aea958f
- Use Cute's local_tile to get gQ, gK, gV · 656daef4
  Tri Dao authored Apr 07, 2024
  
  656daef4
- Transpose out when swapping seqlen_q and num_groups · 9eb3d099
  Tri Dao authored Apr 07, 2024
  
  9eb3d099
05 Apr, 2024 1 commit

Fix spurious re-compilations of `rotary_kernel` (#911) · f692b98d

Ivan Komarov authored Apr 05, 2024

All integer parameters are specialized by default, so the two parameters
removed in this commit could lead to kernel re-compilation, even if
they were completely unused.
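A toy model may help show why an unused integer parameter can still cause re-compilation. The sketch below is not Triton code; `ToyJit` and `specialization_key` are hypothetical, and the specialization rule (integers equal to 1 or divisible by 16 get their own variant) is an assumption about the JIT's default behavior.

```python
def specialization_key(int_args):
    """Cache key built from properties of each integer argument (assumed rule)."""
    return tuple((v == 1, v % 16 == 0) for v in int_args)

class ToyJit:
    """Compiles once per distinct specialization key and caches the result."""
    def __init__(self):
        self.cache = {}
        self.compilations = 0

    def launch(self, *int_args):
        key = specialization_key(int_args)
        if key not in self.cache:
            self.compilations += 1
            self.cache[key] = f"binary#{self.compilations}"
        return self.cache[key]

with_unused = ToyJit()      # kernel signature still includes the unused integer
without_unused = ToyJit()   # unused integer removed from the signature

for seqlen, unused in [(128, 16), (128, 17), (128, 1)]:
    with_unused.launch(seqlen, unused)
    without_unused.launch(seqlen)
```

In this model the kernel that keeps the unused argument compiles three times, while the trimmed one compiles once, even though the work performed is identical.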

f692b98d

28 Mar, 2024 2 commits
- Add the option for the macro and note (#893) · 23e8fa5a
  Driss Guessous authored Mar 27, 2024
  
  23e8fa5a
- Minor fix in compute_attn_1rowblock_splitkv (#900) · 3e9414f1
  ljss authored Mar 28, 2024
  
  3e9414f1