- 08 Aug, 2024 14 commits
-
-
Guang Yang authored
Workaround the export issue in torch 2.4
Co-authored-by: Guang Yang <guangyang@fb.com>
-
Pablo Montalvo authored
* I think inputs_embeds has ndim == 3
* fix sequence length catch
* add generate test
* [run-slow] olmo, persimmon, gemma, gemma2, qwen2, llama
* skip whisper
* fix bart test
* more fixes
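For reference, a minimal sketch of the path the new test covers: decoder-only generation from 3-D `inputs_embeds` (batch, sequence, hidden) instead of `input_ids`. `gpt2` is only a stand-in for the models listed above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

enc = tok("Hello, my name is", return_tensors="pt")
# embed the ids ourselves: the resulting tensor has ndim == 3
embeds = model.get_input_embeddings()(enc.input_ids)

with torch.no_grad():
    out = model.generate(
        inputs_embeds=embeds,
        attention_mask=enc.attention_mask,
        max_new_tokens=10,
    )
print(tok.decode(out[0]))  # only the newly generated tokens are returned
```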
-
SeungAhSon authored
* docs: ko: quantization/bitsandbytes.md
* feat: nmt draft
* fix: minor typos
* fix: manual edits
* fix: manual edits
* fix: resolve suggestions
Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
* fix: resolve suggestions
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: wony617 <49024958+Jwaminju@users.noreply.github.com>
Co-authored-by: YONGSANG <71686691+4N3MONE@users.noreply.github.com>
Co-authored-by: Woojun Jung <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
SeungYoun Lee authored
* docs: ko: fsdp.md
* feat: nmt draft
* fix: manual edits
* Apply suggestions from code review
Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com>
Co-authored-by: Minki Kim <100768622+1kmmk1@users.noreply.github.com>
* fix: resolve suggestions
* Update docs/source/ko/fsdp.md
Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com>
* Update docs/source/ko/fsdp.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com>
Co-authored-by: Minki Kim <100768622+1kmmk1@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
HyeokJun SHIN authored
* docs: ko: quantization/eetq.md
* feat: nmt draft
* fix docs: ko: quantization/eetq.md
* fix docs: ko: quantization/eetq.md
* fix: resolve suggestions
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
* fix: resolve suggestions
* fix: resolve suggestions
---------
Co-authored-by: Jiwook Han <33192762+mreraser@users.noreply.github.com>
-
Chulhwa (Evan) Han authored
* docs: ko: ko-trainer
* feat: nmt draft
* fix: manual edits
* fix: manual edits
* fix: glossary
* fix: glossary
* Apply suggestions from code review
Co-authored-by: Jinuk <45095330+JinukHong@users.noreply.github.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
---------
Co-authored-by: Jinuk <45095330+JinukHong@users.noreply.github.com>
Co-authored-by: SeongWooChoi <46990061+nuatmochoi@users.noreply.github.com>
-
010kim authored
* docs: ko: llm_tutorial_optimization.md
* feat: nmt draft
* fix: manual edits
* Update docs/source/ko/llm_tutorial_optimization.md
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
* Update docs/source/ko/llm_tutorial_optimization.md
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
* fix: resolve suggestions - 1
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>
Co-authored-by: boyunJang <gobook1234@naver.com>
* fix: resolve suggestions - 2
Co-authored-by: boyunJang <gobook1234@naver.com>
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>
---------
Co-authored-by: Chaewon Song <chaewon1019@ewhain.net>
Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>
Co-authored-by: boyunJang <gobook1234@naver.com>
-
Ekaterina Aidova authored
* filter flash_attn optional imports loading remote code
* improve pattern
* fix code style
* Update src/transformers/dynamic_module_utils.py
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
---------
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
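A minimal sketch of the idea (not the exact implementation in `dynamic_module_utils.py`): when collecting required imports for remote code, imports wrapped in try/except are optional (e.g. `flash_attn`) and should be dropped.

```python
import re

def get_required_imports(source: str) -> list[str]:
    # strip try/except blocks so guarded optional imports are not collected
    source = re.sub(r"\s*try\s*:.*?except.*?:", "", source, flags=re.DOTALL)
    return re.findall(r"^\s*(?:import|from)\s+(\w+)", source, flags=re.MULTILINE)

code = "import torch\ntry:\n    import flash_attn\nexcept ImportError:\n    pass\n"
print(get_required_imports(code))  # ['torch'] -- flash_attn is not required
```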
-
Yunfei Chu authored
* add qwen2audio
* Update check_repo.py
* fix style
* fix test
* fix style
* add model size
* Qwen2AudioEncoderModel->Qwen2AudioEncoder; add copy info
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* switch the attention_mask and the feature_attention_mask
* add to PRIVATE_MODELS in check_repo.py; add to MODEL_NAMES_TO_IGNORE in check_table.py
* fix initialization
* update chat_template
* fix consistency issue after copy
* add docstrings to _merge_input_ids_with_audio_features
* add copied from to prepare_inputs_for_generation
* add more details to docs
* rm comment
* add init_std
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* Update src/transformers/models/qwen2_audio/modeling_qwen2_audio.py
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
* update
* Update docs/source/en/model_doc/qwen2_audio.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* update tests
* rm ignore_index
* update processor
* rm ffmpeg_read
* Update tests/models/qwen2_audio/test_modeling_qwen2_audio.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2_audio.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2_audio.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update docs/source/en/model_doc/qwen2_audio.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* update
* typo
* [run_slow] qwen2_audio
* [run_slow] qwen2_audio
* [run_slow] qwen2_audio
* fix quality
* [run_slow] qwen2_audio
* [run_slow] qwen2_audio
* [run_slow] qwen2_audio
* add official model
---------
Co-authored-by: Yoach Lacombe <52246514+ylacombe@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
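A hedged usage sketch for the new model. The checkpoint name comes from the official release mentioned above; the `audios` argument and the `<|audio_bos|><|AUDIO|><|audio_eos|>` prompt format follow the model card and may differ from the final API.

```python
import numpy as np
from transformers import AutoProcessor, Qwen2AudioForConditionalGeneration

processor = AutoProcessor.from_pretrained("Qwen/Qwen2-Audio-7B-Instruct")
model = Qwen2AudioForConditionalGeneration.from_pretrained("Qwen/Qwen2-Audio-7B-Instruct")

# dummy 1-second clip at the feature extractor's sampling rate
sr = processor.feature_extractor.sampling_rate
audio = np.zeros(sr, dtype=np.float32)

prompt = "<|audio_bos|><|AUDIO|><|audio_eos|>Describe this audio."
inputs = processor(text=prompt, audios=audio, sampling_rate=sr, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```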
-
Pablo Montalvo authored
* handle (processor_class, None) returned by ModelPatterns
* handle (slow, fast) image processors in add model
* handle old image processor case
-
Sangbum Daniel Choi authored
* fix typo
* uniform kwargs
* make style
* add comments
* remove return_tensors
* remove common_kwargs from processor since it propagates
* make style
* return_token_type_ids to True
* revert the default imagekwargs since it does not accept any value in the image processor
* revert processing_utils.py
* make style
* add molbap's commit
* fix typo
* fix common processor
* remain
* Revert "add molbap's commit"
This reverts commit a476c6ee88318ce40d73ea31e2dc2d4faa8ae410.
* add unsync PR
* revert
* make CI happy
* nit
* import annotationformat
-
Wonseok Lee (Jack) authored
* Change `_supports_sdpa` to True
* add phi3 to sdpa support list
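With `_supports_sdpa = True`, Phi-3 can be loaded with PyTorch's scaled-dot-product-attention backend. A short sketch; the checkpoint name is an assumption:

```python
from transformers import AutoModelForCausalLM

# request the SDPA attention implementation explicitly
model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    attn_implementation="sdpa",
)
```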
-
doomdagadiggiedahdah authored
Update llm_tutorial.md: remove a comma (re: issue 32518, https://github.com/huggingface/transformers/issues/32518)
-
Tom Aarsen authored
Hello!

## Pull Request overview
* Fix typo

## Details
This should speak for itself.

cc @itazap @ArthurZucker

- Tom Aarsen
-
- 07 Aug, 2024 15 commits
-
-
Francisco Kurucz authored
-
Jiyoon authored
* docs: ko: chat_templating.md
* feat: nmt draft
* fix: manual edits
* Update docs/source/ko/chat_templating.md
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
* Update docs/source/ko/chat_templating.md
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
* fix: apply suggestions from code review - anchor
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
* fix: manual edits
Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com>
Co-authored-by: Minki Kim <100768622+1kmmk1@users.noreply.github.com>
* fix: manual edits
* fix: delete 'default template' section
---------
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com>
Co-authored-by: Minki Kim <100768622+1kmmk1@users.noreply.github.com>
-
Sai-Suraj-27 authored
Fixed WhisperModel.forward’s docstring link.
-
Francisco Kurucz authored
-
Jiwook Han authored
* docs: ko: tasks/images_feature_extraction.md
* feat: nmt draft
* fix: manual edits
* fix: manual edits
* fix: manual edits
* fix: manual edits
* feat: manual edits
* Update docs/source/ko/tasks/image_feature_extraction.md
Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
* Update docs/source/ko/tasks/image_feature_extraction.md
Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
* fix: manual edits
---------
Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
-
Sungmin Oh authored
* docs: ko: quantization/quanto.md
* feat: nmt draft
* fix: resolve suggestions
Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com>
Co-authored-by: Minki Kim <100768622+1kmmk1@users.noreply.github.com>
Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com>
* fix: resolve suggestions
Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com>
---------
Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com>
Co-authored-by: Minki Kim <100768622+1kmmk1@users.noreply.github.com>
Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com>
-
Chaewon Song authored
* docs: ko: tasks/prompting.md
* feat: nmt-draft
* fix: update translation in prompting.md
* fix: update toctree.yml
* fix: manual edits
* fix: toctree edits
* fix: resolve suggestions
Co-authored-by: boyunJang <gobook1234@naver.com>
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>
Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>
---------
Co-authored-by: boyunJang <gobook1234@naver.com>
Co-authored-by: Harheem Kim <49297157+harheem@users.noreply.github.com>
Co-authored-by: timdalxx <48753785+jeongiin@users.noreply.github.com>
-
Minki Kim authored
* fix: manual edits
* fix: manual edits2
* fix: delete files
* fix: resolve suggestions
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com>
Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com>
* fix: resolve suggestions
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
---------
Co-authored-by: Sungmin Oh <fabxoe.kor@gmail.com>
Co-authored-by: SeungYoun Lee <84276596+win2dvp21@users.noreply.github.com>
Co-authored-by: 김준재 <55151385+junejae@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
-
Joao Gante authored
* logits
* words
-
Jonathan Rahn authored
In https://huggingface.co/docs/transformers/en/main_classes/pipelines#transformers.TextGenerationPipeline.__call__, the docstring reads: "generate_kwargs (dict, optional) — Additional keyword arguments to pass along to the generate method of the model (see the generate method corresponding to your framework here)." The "here" link doesn't work.
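For context, a short sketch of the behaviour that docstring describes: generation keyword arguments passed at call time are forwarded to `model.generate()`.

```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
# max_new_tokens and do_sample are forwarded to model.generate()
out = generator("Once upon a time", max_new_tokens=20, do_sample=False)
print(out[0]["generated_text"])
```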
-
Aymeric Roucher authored
* Allow optional use of grammars to constrain generation
-
Bill Zhou authored
-
append-only authored
* enable xla fsdp
* add accelerate version check for xla fsdp
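A sketch of the kind of version guard this adds; the exact minimum version is an assumption, not the value from the patch.

```python
import importlib.metadata
from packaging import version

MIN_ACCELERATE = version.parse("0.33.0")  # hypothetical floor for XLA FSDP

def check_accelerate_for_xla_fsdp() -> None:
    installed = version.parse(importlib.metadata.version("accelerate"))
    if installed < MIN_ACCELERATE:
        raise ImportError(f"XLA FSDP requires accelerate >= {MIN_ACCELERATE}, found {installed}")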
-
Raushan Turganbay authored
* gemma2 fallback to dynamic cache
* Update src/transformers/models/gemma2/modeling_gemma2.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Update src/transformers/models/gemma2/modeling_gemma2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* raise error and don't fallback to dynamic cache
* prev will break most forward calls/tests
* Update src/transformers/models/gemma2/modeling_gemma2.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update
* fix copies
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
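A sketch of the new behaviour described above (the helper name is hypothetical): instead of silently falling back to `DynamicCache`, Gemma2 now raises when handed a cache its sliding-window attention cannot use.

```python
from transformers.cache_utils import HybridCache

def _check_cache(past_key_values) -> None:
    # Gemma2's sliding-window layers assume a HybridCache
    if past_key_values is not None and not isinstance(past_key_values, HybridCache):
        raise ValueError(
            f"Gemma2 requires a `HybridCache`, got {type(past_key_values).__name__}."
        )
```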
-
Raushan Turganbay authored
* draft bart with new cache
* add cache for decoder-only models
* revert utils
* modify docstring
* revert bart
* minor fixes
* fix copies (not related)
* revert tests
* remove enc-dec related code
* remove bloom
* remove opt (enc-dec)
* update docstring
* git, codegen, gpt_neo, gpt_neox, gptj
* clean up
* copied from statements
* revert
* tmp
* update warning msg
* forgot git
* add more flags
* run-slow git, codegen, gpt_neo, gpt_neox, gptj
* add cache flag to VLMs
* remove files
* style
* video LLMs also need a flag
* style
* llava will go in another PR
* style
* [run-slow] codegen, falcon, git, gpt_neo, gpt_neox, gptj, idefics
* Update src/transformers/models/gpt_neo/modeling_gpt_neo.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* copy from
* deprecate until v4.45 and warn if not training
* nit
* fix test
* test static cache
* add more tests and fix models
* fix copies
* return sliding window mask
* run slow tests & fix + codestyle
* one more falcon fix for alibi
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
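A sketch of the cache format these models migrate to: an explicit `DynamicCache` object instead of the legacy tuple-of-tuples `past_key_values`. `gpt_neo` is one of the models named above.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, DynamicCache

tok = AutoTokenizer.from_pretrained("EleutherAI/gpt-neo-125m")
model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-neo-125m")

inputs = tok("The new cache format", return_tensors="pt")
past = DynamicCache()  # replaces the legacy tuple format
with torch.no_grad():
    out = model(**inputs, past_key_values=past, use_cache=True)
print(out.past_key_values.get_seq_length())  # number of cached tokens
```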
-
- 06 Aug, 2024 11 commits
-
-
HyunJi Shin authored
* docs: ko: tasks/image_to_image.md * feat: nmt draft * fix: manual edits * fix: resolve suggestions Co-authored-by:
Jihun Lim <31366038+heuristicwave@users.noreply.github.com> Co-authored-by:
Jiwook Han <33192762+mreraser@users.noreply.github.com> * fix: handle remaining suggestions Co-authored-by:
Jiwook Han <33192762+mreraser@users.noreply.github.com> --------- Co-authored-by:
Jihun Lim <31366038+heuristicwave@users.noreply.github.com> Co-authored-by:
Jiwook Han <33192762+mreraser@users.noreply.github.com>
-
boyunJang authored
* docs: ko: tasks/idefics.md * feat: nmt draft * fix: manual edits * fix: resolve suggestions Co-authored-by:
Chaewon Song <chaewon1019@ewhain.net> Co-authored-by:
Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by:
timdalxx <48753785+jeongiin@users.noreply.github.com> --------- Co-authored-by:
Chaewon Song <chaewon1019@ewhain.net> Co-authored-by:
Harheem Kim <49297157+harheem@users.noreply.github.com> Co-authored-by:
timdalxx <48753785+jeongiin@users.noreply.github.com>
-
timdalxx authored
* docs: ko: tasks/mask_generation.md * feat: nmt draft * fix : toc local * fix : manual edits * fix : ko-toctree * fix: resolve suggestions Co-authored-by:
boyunJang <gobook1234@naver.com> Co-authored-by:
Chaewon Song <chaewon1019@ewhain.net> * fix: resolve suggestions Co-authored-by:
boyunJang <gobook1234@naver.com> Co-authored-by:
Chaewon Song <chaewon1019@ewhain.net> * fix: resolve suggestions * fix: resolve suggestions * fix: resolve suggestions --------- Co-authored-by:
boyunJang <gobook1234@naver.com> Co-authored-by:
Chaewon Song <chaewon1019@ewhain.net>
-
Matthew Douglas authored
Revert "fixes to properly shard FSDP across cpu and meta for cpu_effcient_loading for prequantized 4bit (#32276)" (#32477) * Revert "fixes to properly shard FSDP across cpu and meta for cpu_efficient_loading for prequantized 4bit (#32276)" This reverts commit 62c60a30 . We uncovered an issue with this change that caused our training runs to hang. * `is_torchdynamo_compiling` -- cast a wide exception net (#32476) * cast a wide net * make fix-copies with a few manual changes * add copied from --------- Co-authored-by:
Joao Gante <joaofranciscocardosogante@gmail.com>
-
Joao Gante authored
* cast a wide net
* make fix-copies with a few manual changes
* add copied from
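A sketch of the "wide net" pattern the title refers to (not the exact code): detect dynamo compilation without assuming a particular torch version or private module layout.

```python
def is_torchdynamo_compiling() -> bool:
    try:
        import torch

        return torch.compiler.is_compiling()
    except Exception:  # any failure means "not compiling" rather than a crash
        return False
```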
-
Arthur Zucker authored
-
Chris Toukmaji authored
Update nllb.md
-
Zach Mueller authored
* Migrate import checks to secondary accelerate calls
* better errs too
* Revert, just keep the import checks + remove accelerate-specific things
* Rm extra'
* Empty commit for ci
* Small nits
* Final
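A sketch of a lightweight import check in this spirit (helper name and message are hypothetical): verify `accelerate` is importable up front and defer anything accelerate-specific to accelerate itself.

```python
import importlib.metadata
import importlib.util

def require_accelerate() -> None:
    # cheap presence check, no version-specific logic here
    if importlib.util.find_spec("accelerate") is None:
        raise ImportError("This feature requires accelerate: `pip install accelerate`")
    print("found accelerate", importlib.metadata.version("accelerate"))
```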
-
Pablo Montalvo authored
* add new model like
* draft cuda forward - mismatched keys (sharding on conv1)
* match keys successfully
* fix split
* get generation/forward running (wrong gens, norm?)
* update
* some refactoring
* fixes
* works up until copy to cache
* fix
* update
* NON WORKING VERSION
* version that works?
* nit
* fix config
* fix conversion script
* working cuda forward
* nit
* update
* simplification
* make mamba slow simple work
* no einops
* todo
* fix style
* no einops
* update fix no einsum
* nit
* remove einops
* bug: scan_output differs strongly
* add rms norm option
* fix fast + slow generation with and w/o cache ✔
* draft integration tests
* remove a big chunk of the einsum
* fix slow, fast generations, without any einsum
* fix copies
* fix structure
* fix up modeling and tests
* fix tests
* clamping is indeed worse
* recover mamba2 cache test
* fix copies
* no cache position (yet)
* fix tf tests
* fix matmul for generate
* fixup
* skip cache tests for now
* [run-slow] mamba2
* tune out hidden states for padding
* test batched generation
* propagate attention mask changes
* fix past length
* fix integration test
* style
* address comments
* update readme
* add mamba2 version check
* fix tests
* [run-slow] mamba2
* skip edge tests
* [run-slow] mamba2
* last fixup
* [run-slow] mamba2
* update README
---------
Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
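A hedged usage sketch for the new Mamba2 architecture; the checkpoint name refers to the Codestral Mamba release and is an assumption (it may require accepting the license on the Hub).

```python
from transformers import AutoTokenizer, Mamba2ForCausalLM

tok = AutoTokenizer.from_pretrained("mistralai/Mamba-Codestral-7B-v0.1")
model = Mamba2ForCausalLM.from_pretrained("mistralai/Mamba-Codestral-7B-v0.1")

inputs = tok("def fibonacci(n):", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(tok.decode(out[0]))
```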
-
Joao Gante authored
-
Ao Tang authored
* Add nemotron support
* fix inference
* add unit test
* add layernorm1p as a class to avoid meta device mismatch
* test fixed
* Add copied_from statements
* remove pretraining_tp args
* remove nemotronlayernorm
* force LN computation done in FP32
* remove nemotrontokenizer and use llamatokenizer
* license update
* add option for kv_channels for minitron8b
* remove assert
* o_proj fixed
* o_proj reshape
* add gated_proj option
* typo
* remove todos
* fix broken test after merging latest main
* remove nezha/nat after merging main
* change default config to 15b model
* add nemo conversion script
* rename conversion script
* remove gate_proj option
* pr comment resolved
* fix unit test
* rename kv_channels to head_dim
* resolve PR issue
* add nemotron md
* fix broken tests
* refactor rope for nemotron
* test fix
* remove linearscaling
* whitespace and import
* fix some copied-from
* code style fix
* reformatted
* add position_embedding to nemotronattention
* rope refactor to only use config, copied-from fix
* format
* Run make fix-copies
* nemotron md with autodoc
* doc fix
* fix order
* pass check_config_docstrings.py
* fix config_attributes
* remove all llama BC related code
* Use PreTrainedTokenizerFast
* ruff check examples
* conversion script update
* add nemotron to toctree
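A hedged usage sketch for the new Nemotron support; the checkpoint name is an assumption based on the Minitron mention above.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("nvidia/Minitron-8B-Base")
model = AutoModelForCausalLM.from_pretrained("nvidia/Minitron-8B-Base")

inputs = tok("Nemotron is", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=20)
print(tok.decode(out[0], skip_special_tokens=True))
```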
-