Commits · 995a7ce9a80b80062ccfe0b2d78857fb17351e27 · chenpangpang / transformers

11 Jan, 2024 2 commits

Fix broken link on page (#28451) · 995a7ce9

Hankyeol Kyung authored Jan 12, 2024



* [docs] Fix broken link
Signed-off-by: Hankyeol Kyung <kghnkl0103@gmail.com>

* [docs] Use shorter domain
Signed-off-by: Hankyeol Kyung <kghnkl0103@gmail.com>

---------
Signed-off-by: Hankyeol Kyung <kghnkl0103@gmail.com>

995a7ce9

Doc (#28431) · 19e83d17

jiqing-feng authored Jan 12, 2024

* update version for cpu training

* update docs for cpu training

* fix readme

* fix readme

19e83d17

10 Jan, 2024 2 commits
- Fix load correct tokenizer in Mixtral model documentation (#28437) · 3724156b
  Francisco Kurucz authored Jan 10, 2024
  
  3724156b
- update docs to add the `phi-2` example (#28392) · fff8ca8e
  Susnato Dhar authored Jan 10, 2024
```
* update docs

* added Tip
```
  fff8ca8e
08 Jan, 2024 2 commits

Add SigLIP (#26522) · 3b742ea8

NielsRogge authored Jan 08, 2024



* Add first draft

* Use appropriate gelu function

* More improvements

* More improvements

* More improvements

* Convert checkpoint

* More improvements

* Improve docs, remove print statements

* More improvements

* Add link

* remove unused masking function

* begin tokenizer

* do_lower_case

* debug

* set split_special_tokens=True

* Remove script

* Fix style

* Fix rebase

* Use same design as CLIP

* Add fast tokenizer

* Add SiglipTokenizer to init, remove extra_ids

* Improve conversion script

* Use smaller inputs in conversion script

* Update conversion script

* More improvements

* Add processor to conversion script

* Add tests

* Remove print statements

* Add tokenizer tests

* Fix more tests

* More improvements related to weight initialization

* More improvements

* Make more tests pass

* More improvements

* More improvements

* Add copied from

* Add canonicalize_text

* Enable fast tokenizer tests

* More improvements

* Fix most slow tokenizer tests

* Address comments

* Fix style

* Remove script

* Address some comments

* Add copied from to tests

* Add more copied from

* Add more copied from

* Add more copied from

* Remove is_flax_available

* More updates

* Address comment

* Remove SiglipTokenizerFast for now

* Add caching

* Remove umt5 test

* Add canonicalize_text inside _tokenize, thanks Arthur

* Fix image processor tests

* Skip tests which are not applicable

* Skip test_initialization

* More improvements

* Compare pixel values

* Fix doc tests, add integration test

* Add do_normalize

* Remove causal mask and leverage ignore copy

* Fix attention_mask

* Fix remaining tests

* Fix dummies

* Rename temperature and bias

* Address comments

* Add copied from to tokenizer tests

* Add SiglipVisionModel to auto mapping

* Add copied from to image processor tests

* Improve doc

* Remove SiglipVisionModel from index

* Address comments

* Improve docs

* Simplify config

* Add first draft

* Make it like mistral

* More improvements

* Fix attention_mask

* Fix output_attentions

* Add note in docs

* Convert multilingual model

* Convert large checkpoint

* Convert more checkpoints

* Add pipeline support, correct image_mean and image_std

* Use padding=max_length by default

* Make processor like llava

* Add code snippet

* Convert more checkpoints

* Set keep_punctuation_string=None as in OpenCLIP

* Set normalized=False for special tokens

* Fix doc test

* Update integration test

* Add figure

* Update organization

* Happy new year

* Use AutoModel everywhere

---------
Co-authored-by: patil-suraj <surajp815@gmail.com>

3b742ea8

Add segmentation map processing to SAM Image Processor (#27463) · 73c88012

Rosie Wood authored Jan 08, 2024



* add segmentation map processing to sam image processor

* fixup

* add tests

* reshaped_input_size is shape before padding

* update tests for size/shape outputs

* fixup

* add code snippet to docs

* Update docs/source/en/model_doc/sam.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add missing backticks

* add `segmentation_maps` as arg for SamProcessor.__call__()

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

73c88012

05 Jan, 2024 1 commit
- chore: Fix typo s/exclusivelly/exclusively/ (#28361) · 4ab5fb89
  hugo-syn authored Jan 05, 2024
  
  4ab5fb89
04 Jan, 2024 1 commit

README: install transformers from conda-forge channel (#28313) · 5d36025c

Kevin Herro authored Jan 04, 2024

Switch to the conda-forge channel for transformer installation,
as the huggingface channel does not offer the latest version.

Fixes #28248

5d36025c

03 Jan, 2024 2 commits

Add FastSpeech2Conformer (#23439) · d83ff5ee

Connor Henderson authored Jan 03, 2024

* start - docs, SpeechT5 copy and rename

* add relevant code from FastSpeech2 draft, have tests pass

* make it an actual conformer, demo ex.

* matching inference with original repo, includes debug code

* refactor nn.Sequentials, start more desc. var names

* more renaming

* more renaming

* vocoder scratchwork

* matching vocoder outputs

* hifigan vocoder conversion script

* convert model script, rename some config vars

* replace postnet with speecht5's implementation

* passing common tests, file cleanup

* expand testing, add output hidden states and attention

* tokenizer + passing tokenizer tests

* variety of updates and tests

* g2p_en pckg setup

* import structure edits

* docstrings and cleanup

* repo consistency

* deps

* small cleanup

* forward signature param order

* address comments except for masks and labels

* address comments on attention_mask and labels

* address second round of comments

* remove old unneeded line

* address comments part 1

* address comments pt 2

* rename auto mapping

* fixes for failing tests

* address comments part 3 (bart-like, train loss)

* make style

* pass config where possible

* add forward method + tests to WithHifiGan model

* make style

* address arg passing and generate_speech comments

* address Arthur comments

* address Arthur comments pt2

* lint  changes

* Sanchit comment

* add g2p-en to doctest deps

* move up self.encoder

* onnx compatible tensor method

* fix is symbolic

* fix paper url

* move models to espnet org

* make style

* make fix-copies

* update docstring

* Arthur comments

* update docstring w/ new updates

* add model architecture images

* header size

* md wording update

* make style

d83ff5ee

fix documentation for zero_shot_object_detection (#28267) · 6eba901d
lain authored Jan 03, 2024
```
remove broken space
```
6eba901d

02 Jan, 2024 1 commit
- Update docs around mixing hf scheduler with deepspeed optimizer (#28223) · cad9f5c6
  Dean Wyatte authored Jan 02, 2024
```
update docs around mixing hf scheduler with deepspeed optimizer
```
  cad9f5c6
22 Dec, 2023 3 commits

Fixing visualization code for object detection to support both types of bounding box. (#27842) · 74d9d0ce

Anindyadeep authored Dec 22, 2023



* fix: minor enhancement and fix in bounding box visualization example

The example that was trying to visualize the bounding box was not considering an edge case,
where the bounding box can be un-normalized. So using the same set of code, we can not get
results with a different dataset with un-normalized bounding box. This commit fixes that.

* run make clean

* add an additional note on the scenarios where the box viz code works

---------
Co-authored-by: Anindyadeep <anindya@pop-os.localdomain>

74d9d0ce
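
The edge case described in 74d9d0ce (a dataset whose bounding boxes are un-normalized, so the same visualization code cannot be reused) can be illustrated with a minimal sketch. This is not the code from the commit: the helper name `to_pixel_box`, the `(x_min, y_min, x_max, y_max)` layout, and the rule "all four values within [0, 1] means normalized" are assumptions made only for illustration.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]


def to_pixel_box(box: Sequence[float], image_width: int, image_height: int) -> Box:
    """Return (x_min, y_min, x_max, y_max) in pixel units.

    Assumption (hypothetical, not taken from the commit): a box whose four
    values all lie in [0, 1] is treated as normalized and is scaled by the
    image size; any other box is assumed to be in pixel units already.
    """
    x_min, y_min, x_max, y_max = (float(v) for v in box)
    values = (x_min, y_min, x_max, y_max)
    if min(values) >= 0.0 and max(values) <= 1.0:
        # Normalized box: x values scale with width, y values with height.
        return (
            x_min * image_width,
            y_min * image_height,
            x_max * image_width,
            y_max * image_height,
        )
    # Un-normalized box: pass through unchanged.
    return values


# A normalized box and a pixel-space box drawn on a 200x100 image.
normalized = to_pixel_box((0.25, 0.5, 0.75, 1.0), image_width=200, image_height=100)
absolute = to_pixel_box((10, 20, 110, 70), image_width=200, image_height=100)
```

A heuristic like this misclassifies a pixel-space box that happens to fit inside the unit square, which is presumably why the commit also adds a note on the scenarios where the visualization code works.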

Update `docs/source/en/perf_infer_gpu_one.md` (#28198) · 71f46057
Yih-Dar authored Dec 22, 2023
```
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
```
71f46057
[`Docs`] Add 4-bit serialization docs (#28182) · 3a8769f6
Younes Belkada authored Dec 22, 2023
```
* add 4-bit serialization docs

* up

* up
```
3a8769f6

20 Dec, 2023 3 commits

Generate: fix speculative decoding (#28166) · 45b70384
Joao Gante authored Dec 20, 2023
```
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
```
45b70384
[docs] Trainer docs (#28145) · 01c081d1
Steven Liu authored Dec 20, 2023
```
* fsdp, debugging, gpu selection

* fix hfoption

* fix
```
01c081d1

Fix FA2 integration (#28142) · def581ef

Sourab Mangrulkar authored Dec 20, 2023



* fix fa2

* fix FA2 for popular models

* improve warning and add Younes as co-author
Co-Authored-By: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix the warning

* Add Tip

* typo fix

* nit

---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

def581ef

19 Dec, 2023 1 commit
- [docs] Fix mistral link in mixtral.md (#28143) · 38611086
  Aaron Jimenez authored Dec 19, 2023
```
Fix mistral link in mixtral.md
```
  38611086
18 Dec, 2023 4 commits
- [Doc] Fix token link in What 🤗 Transformers can do (#28123) · 4edffda6
  Aaron Jimenez authored Dec 18, 2023
```
Fix token link
```
  4edffda6
- [docs] General doc fixes (#28087) · a52e180a
  Steven Liu authored Dec 18, 2023
```
* doc fix friday

* deprecated objects

* update not_doctested

* update toctree
```
  a52e180a
- Fix indentation error - semantic_segmentation.md (#28117) · 08a6e7a7
  Rockerz authored Dec 18, 2023
```
Update semantic_segmentation.md
```
  08a6e7a7
- Spelling correction (#28110) · 7f2a8f92
  Aeneas Stankowski authored Dec 18, 2023
```
Update mixtral.md

correct minor typo in overview
```
  7f2a8f92
15 Dec, 2023 3 commits
- [docs] MPS (#28016) · ebfdb9ca
  Steven Liu authored Dec 15, 2023
```
* mps docs

* toctree
```
  ebfdb9ca
- [docs] Trainer (#27986) · 0d63d177
  Steven Liu authored Dec 15, 2023
```
* first draft

* add to toctree

* edits

* feedback
```
  0d63d177
- Fix Vip-llava docs (#28085) · 1faeff85
  Younes Belkada authored Dec 15, 2023
```
* Update vipllava.md

* Update modeling_vipllava.py
```
  1faeff85
14 Dec, 2023 1 commit
- [Seamless] Fix links in docs (#27905) · 52c37882
  Sanchit Gandhi authored Dec 14, 2023
```
* [Seamless] Fix links in docs

* apply suggestions from code review
```
  52c37882
13 Dec, 2023 2 commits

[Doc] Spanish translation of glossary.md (#27958) · 815ea8e8

Aaron Jimenez authored Dec 13, 2023

* Add glossary to es/_toctree.yml

* Add glossary.md to es/

* A section translated

* B and C section translated

* Fix typo in en/glossary.md C section

* D section translated | Add a extra line in en/glossary.md

* E and F section translated | Fix typo in en/glossary.md

* Fix words preentrenado

* H and I section translated | Fix typo in en/glossary.md

* L section translated

* M and N section translated

* P section translated

* R section translated

* S section translated

* T section translated

* U and Z section translated | Fix TensorParallel link in both files

* Fix word

815ea8e8

Adds VIP-llava to transformers (#27932) · c7f076a0

Younes Belkada authored Dec 13, 2023

* v1

* add-new-model-like

* revert

* fix forward and conversion script

* revert

* fix copies

* fixup

* fix

* Update docs/source/en/index.md

* Apply suggestions from code review

* push

* fix

* fixes here and there

* up

* fixup and fix tests

* Apply suggestions from code review

* add docs

* fixup

* fixes

* docstring

* add docstring

* fixup

* docstring

* fixup

* nit

* docs

* more copies

* fix copies

* nit

* update test

c7f076a0

12 Dec, 2023 1 commit
- [doc] fix typo (#27981) · 99361430
  Stas Bekman authored Dec 12, 2023
  
  99361430
11 Dec, 2023 7 commits

fixed typos (issue 27919) (#27920) · e6604247

Anthony Susevski authored Dec 11, 2023



* fixed typos (issue 27919)

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

e6604247

[docs] Fused AWQ modules (#27896) · 35478182
Steven Liu authored Dec 11, 2023
```
streamline
```
35478182
Update bounding box format everywhere (#27944) · 67b1335c
NielsRogge authored Dec 11, 2023
```
Update formats
```
67b1335c
Fix parameter count in readme for mixtral 45b (#27945) · 5cec306c
Timon Käch authored Dec 11, 2023
```
fix parameter count in readme
```
5cec306c

Docs for AutoBackbone & Backbone (#27456) · b911c1f1

Merve Noyan authored Dec 11, 2023



* Initial commit for AutoBackbone & Backbone

* Added timm and clarified out_indices

* Swapped the example to out_indices

* fix toctree

* Update autoclass_tutorial.md

* Update backbones.md

* Update autoclass_tutorial.md

* Add dummy torch input instead

* Add dummy torch input

* Update autoclass_tutorial.md

* Update backbones.md

* minor fix

* Update docs/source/en/main_classes/backbones.md
Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update docs/source/en/autoclass_tutorial.md
Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Added illustrations and explained backbone & neck

* Update docs/source/en/main_classes/backbones.md
Co-authored-by: Maria Khalusova <kafooster@gmail.com>

* Update backbones.md

---------
Co-authored-by: Maria Khalusova <kafooster@gmail.com>

b911c1f1

[`Add Mixtral`] Adds support for the Mixtral MoE (#27942) · accccdd0

Arthur authored Dec 11, 2023



* up

* up

* test

* logits ok

* up

* up

* few fixes

* conversion script

* up

* nits

* nits

* update

* nuke

* more updates

* nites

* fix many issues

* nit

* scatter

* nit

* nuke megablocks

* nits

* fix conversion script

* nit

* remove

* nits

* nit

* update

* oupsssss

* change

* nits device

* nits

* fixup

* update

* merge

* add copied from

* fix the copy mentions

* update tests

* more fixes

* nits

* conversion script

* add parts of the readme

* Update tests/models/mixtral/test_modeling_mixtral.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* new test + conversion script

* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Apply suggestions from code review

* fix

* fix copies

* fix copies

* ooops

* fix config

* Apply suggestions from code review

* fix nits

* nit

* add copies

* add batched tests

* docs

* fix flash attention

* let's add more verbose

* add correct outputs

* support router ouptus

* ignore copies where needed

* fix

* cat list if list is given for now

* nits

* Update docs/source/en/model_doc/mixtral.md

* finish router refactoring

* fix forward

* fix expected values

* nits

* fixup

* fix

* fix bug

* fix

* fix dtype mismatch

* fix

* grrr grrr I support item assignment

* fix CI

* docs

* fixup

* remove some copied form

* fix weird diff

* skip doctest fast on the config and modeling

* mark that is supports flash attention in the doc

* update

* Update src/transformers/models/mixtral/modeling_mixtral.py
Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update docs/source/en/model_doc/mixtral.md
Co-authored-by: Lysandre Debut <hi@lysand.re>

* revert router logits config issue

* update doc accordingly

* Update src/transformers/models/mixtral/convert_mixtral_weights_to_hf.py

* nits

* use torch testing asssert close

* fixup

* doc nits

---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>

accccdd0

[LLaVa] Some improvements (#27895) · 7ea21f1f
NielsRogge authored Dec 11, 2023
```
* More improvements

* Improve variable names

* Update READMEs, improve docs
```
7ea21f1f

08 Dec, 2023 3 commits

F.scaled_dot_product_attention support (#26572) · 80377eb0

fxmarty authored Dec 08, 2023



* add sdpa

* wip

* cleaning

* add ref

* yet more cleaning

* and more :)

* wip llama

* working llama

* add output_attentions=True support

* bigcode sdpa support

* fixes

* gpt-bigcode support, require torch>=2.1.1

* add falcon support

* fix conflicts falcon

* style

* fix attention_mask definition

* remove output_attentions from attnmaskconverter

* support whisper without removing any Copied from statement

* fix mbart default to eager renaming

* fix typo in falcon

* fix is_causal in SDPA

* check is_flash_attn_2_available in the models init as well in case the model is not initialized through from_pretrained

* add warnings when falling back on the manual implementation

* precise doc

* wip replace _flash_attn_enabled by config.attn_implementation

* fix typo

* add tests

* style

* add a copy.deepcopy on the config in from_pretrained, as we do not want to modify it inplace

* obey to config.attn_implementation if a config is passed in from_pretrained

* fix is_torch_sdpa_available when torch is not installed

* remove dead code

* Update src/transformers/modeling_attn_mask_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_attn_mask_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_attn_mask_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_attn_mask_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_attn_mask_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/bart/modeling_bart.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove duplicate pretraining_tp code

* add dropout in llama

* precise comment on attn_mask

* add fmt: off for _unmask_unattended docstring

* precise num_masks comment

* nuke pretraining_tp in LlamaSDPAAttention following Arthur's suggestion

* cleanup modeling_utils

* backward compatibility

* fix style as requested

* style

* improve documentation

* test pass

* style

* add _unmask_unattended tests

* skip meaningless tests for idefics

* hard_check SDPA requirements when specifically requested

* standardize the use if XXX_ATTENTION_CLASSES

* fix SDPA bug with mem-efficient backend on CUDA when using fp32

* fix test

* rely on SDPA is_causal parameter to handle the causal mask in some cases

* fix FALCON_ATTENTION_CLASSES

* remove _flash_attn_2_enabled occurences

* fix test

* add OPT to the list of supported flash models

* improve test

* properly test on different SDPA backends, on different dtypes & properly handle separately the pad tokens in the test

* remove remaining _flash_attn_2_enabled occurence

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/modeling_attn_mask_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update docs/source/en/perf_infer_gpu_one.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* remove use_attn_implementation

* fix docstring & slight bug

* make attn_implementation internal (_attn_implementation)

* typos

* fix tests

* deprecate use_flash_attention_2=True

* fix test

* add back llama that was removed by mistake

* fix tests

* remove _flash_attn_2_enabled occurences bis

* add check & test that passed attn_implementation is valid

* fix falcon torchscript export

* fix device of mask in tests

* add tip about torch.jit.trace and move bt doc below sdpa

* fix parameterized.expand order

* move tests from test_modeling_attn_mask_utils to test_modeling_utils as a relevant test class is already there

* update sdpaattention class with the new cache

* Update src/transformers/configuration_utils.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/bark/modeling_bark.py

* address review comments

* WIP torch.jit.trace fix. left: test both eager & sdpa

* add test for torch.jit.trace for both eager/sdpa

* fix falcon with torch==2.0 that needs to use sdpa

* fix doc

* hopefully last fix

* fix key_value_length that has no default now in mask converter

* is it flacky?

* fix speculative decoding bug

* tests do pass

* fix following #27907

---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

80377eb0

[Doc] Spanish translation of pad_truncation.md (#27890) · d6c3a3f1

Aaron Jimenez authored Dec 08, 2023

* Add pad_truncation to es/_toctree.yml

* Add pad_truncation.md to es/

* Translated first two paragraph

* Translated paddig argument section

* Translated truncation argument section

* Translated final paragraphs

* Translated table

* Fixed typo in the table of en/pad_truncation.md

* Run make style | Fix a word

* Add Padding (relleno) y el Truncation (truncamiento) in the final paragraphs

* Fix relleno and truncamiento words

d6c3a3f1

Generate: New `Cache` abstraction and Attention Sinks support (#26681) · 633215ba

Tom Aarsen authored Dec 08, 2023

* Draft version of new KV Caching

This should allow Attention Sinks (https://github.com/tomaarsen/attention_sinks)
/ StreamingLLM (https://arxiv.org/abs/2309.17453) to be easily implemented
in a third-party or in transformers directly

* Address numerous PR suggestions

1. Move layer_idx from cache to ...Attention. Removes confusing set_layer_idx magic.
2. Always convert past_key_values to Cache instance at the start of ...Attention, removes all other isinstance calls.
3. Remove __bool__ and __getitem__ magic as they're confusing.
4. past_key_values.update(key, value, idx) now returns key, value.
5. Add use_legacy_cache flag, defaults to None, i.e. Falsey. This breaks generate for now, until 1) the cache is used is generate() or 2) use_legacy_cache is defaulted to True in generate() until we change it in another PR.
6. Separate key_cache and value_cache.

Some work is still needed to see if the SinkCache can conveniently be implemented with just one update method.

* Implement the SinkCache through backward+forward rotations

* Integrate (Sink)Cache with Llama FA2

* Set use_legacy_cache=True as default, allows for test passes

* Move from/to_legacy_cache to ...Model class

* Undo unnecessary newline change

* Remove copy utility from deprecated OpenLlama

* Match import style

* manual rebase with main

* Cache class working with generate (#1)

* Draft version of new KV Caching

This should allow Attention Sinks (https://github.com/tomaarsen/attention_sinks)
/ StreamingLLM (https://arxiv.org/abs/2309.17453) to be easily implemented
in a third-party or in transformers directly

* Address numerous PR suggestions

Some work is still needed to see if the SinkCache can conveniently be implemented with just one update method.

* Integrate (Sink)Cache with Llama FA2

* Move from/to_legacy_cache to ...Model class

* Undo unnecessary newline change

* Match import style

* working generate

* Add tests; Simplify code; Apply changes to Mistral and Persimmon

* fix rebase mess

* a few more manual fixes

* last manual fix

* propagate changes to phi

* upgrade test

* add use_legacy_cache docstring; beef up tests

* reintroduce unwanted deletes

---------
Co-authored-by: Tom Aarsen <Cubiegamedev@gmail.com>

* move import

* add default to model_kwargs.get('use_legacy_cache')

* correct failing test

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* apply PR suggestions

* fix failing test

* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Tom Aarsen <37621491+tomaarsen@users.noreply.github.com>

* PR comments

* tmp commit

* add docstrings

* more tests, more docstrings, add to docs

* derp

* tmp commit

* tmp dbg

* more dbg

* fix beam search bug

* cache can be a list of tuples in some models

* fix group beam search

* all but sinkcache integration tests

* fix sink cache and add hard integration test

* now also compatible with input_embeds input

* PR comments

* add Cache support to Phi+FA2

* make fixup

---------
Co-authored-by: Joao Gante <joao@huggingface.co>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

633215ba
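
Two of the design points listed in 633215ba (`past_key_values.update(key, value, idx)` returning the key and value, and `key_cache` kept separate from `value_cache`) can be sketched as a toy class. This is not the `Cache` implementation from the PR: the class name `SketchCache` is hypothetical, plain Python lists stand in for tensors, and list concatenation stands in for concatenation along the sequence dimension.

```python
from typing import List, Tuple


class SketchCache:
    """Toy per-layer key/value cache (hypothetical; lists stand in for tensors)."""

    def __init__(self) -> None:
        # Keys and values are stored separately, one list per layer.
        self.key_cache: List[List[float]] = []
        self.value_cache: List[List[float]] = []

    def update(
        self, key: List[float], value: List[float], layer_idx: int
    ) -> Tuple[List[float], List[float]]:
        """Append new entries for one layer and return that layer's full cache."""
        # Grow the per-layer storage until layer_idx is a valid index.
        while len(self.key_cache) <= layer_idx:
            self.key_cache.append([])
            self.value_cache.append([])
        self.key_cache[layer_idx].extend(key)
        self.value_cache[layer_idx].extend(value)
        # Returning the accumulated key/value lets the caller use them directly.
        return self.key_cache[layer_idx], self.value_cache[layer_idx]


cache = SketchCache()
cache.update([1.0], [10.0], layer_idx=0)
keys, values = cache.update([2.0], [20.0], layer_idx=0)
cache.update([3.0], [30.0], layer_idx=1)
```

Because `update` hands back the accumulated key and value, the calling attention layer does not need to index into the cache itself, which matches the stated goal of removing the `__getitem__` magic.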

07 Dec, 2023 1 commit

Updates the distributed CPU training documentation to add instructions for... · 79b79ae2

Dina Suehiro Jones authored Dec 07, 2023

Updates the distributed CPU training documentation to add instructions for running on a Kubernetes cluster (#27780)

* Updates the Distributed CPU documentation to add a Kubernetes example

* Small edits

* Fixing link

* Adding missing new lines

* Minor edits

* Update to include Dockerfile snippet

* Add comment about tuning env var

* Updates based on review comments

79b79ae2