- 17 Mar, 2023 7 commits
-
-
lewtun authored
* Add LlamaForSequenceClassification * Update src/transformers/models/llama/modeling_llama.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Add docstring * Add test * Add input embedding getter and setter * Remove dead code --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Wang, Yi authored
* fix AutoTP in deepspeed not working for bloom Signed-off-by:
Wang, Yi A <yi.a.wang@intel.com> * add a method in BloomModel to build alibi Signed-off-by:
Wang, Yi A <yi.a.wang@intel.com> --------- Signed-off-by:
Wang, Yi A <yi.a.wang@intel.com>
-
Sylvain Gugger authored
* LLaMA house-keeping * Doc links
-
Maria Khalusova authored
* added doc to toc, auto tip with supported models, mention of task guide in model docs * make style * removed "see also" * minor fix
-
Yih-Dar authored
Use dash 2.8.1 for now Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
wangpeng authored
Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>
-
Kevin Turner authored
-
- 16 Mar, 2023 12 commits
-
-
Yih-Dar authored
fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* py38 + torch 2 * increment cache versions --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Susnato Dhar authored
* fixes a typo * .
-
Younes Belkada authored
* add `accelerate` support for XGLM * fix order
-
SatyaJandhyalaAtMS authored
* Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143 * Reduced column width * Fix formatting. * Revert "Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143" This reverts commit 6e95a108042118d204da447729f3834affa354fc. * Fix export error. * Revert "Fix formatting." This reverts commit 8310f60da10358edbdf77a2a2f3c83ee55066cb8. * Propagated changes made in SwinV2 to Swin2SR
-
Yih-Dar authored
* Update UNCONVERTIBLE_MODEL_ARCHITECTURES * Deal with 2 model tester classes in single test file * Deal with 2 model tester classes in single test file * Deal with 2 model tester classes in single test file * make style and quality --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Jason Phang authored
* LLaMA * sharding and docs * tweak * black * inits * ruff * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP * init * no checkpoint * docs * ruff * type_vocab_size * tokenizer fixes * tokenizer fixes * Update tokenization_llama.py * Update tokenization_llama.py * Update configuration_llama.py * Update modeling_llama.py * tokenizer add_bos by default * licenses * remove decoder * norms and mlp * rope overhaul * tweaks * black * mention OPT implementation * off-by-one naming * typo * fix * tokenization fix and slicing bug * padding config * cleanup * black * update tests * undo typo * fix vocab caching logic * ruff * docbuilder * attn fix from BlackSamorez * initial feedback * typo * docs * llama case * llama case * load checkpoint docs * comment about tokenizer * tokenizer defaults * clear past_key_values if use_cache=False * last tweaks * last tweaks * last tweaks * last tweaks --------- Co-authored-by:Stella Biderman <stellabiderman@gmail.com>
-
Jason Phang authored
* LLaMA * sharding and docs * tweak * black * inits * ruff * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP * init * no checkpoint * docs * ruff * type_vocab_size * tokenizer fixes * tokenizer fixes * Update tokenization_llama.py * Update tokenization_llama.py * Update configuration_llama.py * Update modeling_llama.py * tokenizer add_bos by default * licenses * remove decoder * norms and mlp * rope overhaul * tweaks * black * mention OPT implementation * off-by-one naming * typo * fix * tokenization fix and slicing bug * padding config * cleanup * black * update tests * undo typo * fix vocab caching logic * ruff * docbuilder * attn fix from BlackSamorez * initial feedback * typo * docs * llama case * llama case * load checkpoint docs * comment about tokenizer * tokenizer defaults * clear past_key_values if use_cache=False * last tweaks * last tweaks * last tweaks * last tweaks --------- Co-authored-by:Stella Biderman <stellabiderman@gmail.com>
-
Baelish03 authored
* Translation Italian: migration * Update migration.mdx minor fixes * Update _toctree.yml * Delete migration.mdx * Add Italian translation of migration.mdx * Update of migration.mdx translation and toctree
-
Yih-Dar authored
Update values Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Alara Dirik authored
Fix align docs typo
-
Yih-Dar authored
* Deal with torch-tensorrt --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 15 Mar, 2023 5 commits
-
-
Prathik Rao authored
* t5 remove data dependency * make style * make fix-copies --------- Co-authored-by: Prathik Rao <prathikrao@microsoft.com>
-
Anahita Bhiwandiwalla authored
* Use return_loss for BridgeTowerForContrastiveLearning, add example * fix tests * Update example in BridgeTowerForContrastiveLearning * Update test_modeling_bridgetower.py * update model output format * minor update * Update src/transformers/models/bridgetower/modeling_bridgetower.py * make style --------- Co-authored-by:
Tiep Le <97980157+tileintel@users.noreply.github.com> Co-authored-by:
Tiep Le <tiep.le@intel.com> Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Sylvain Gugger authored
* Fix regression in pipeline when device=-1 is passed * Add regression test
-
amyeroberts authored
Revert changes
-
浮躁的小螃蟹 authored
Fix: unfinished_sequences with correct device. The original code was causing errors when running torch.jit.trace because the tensor options were incorrect. I fixed this by using torch.ones to create a tensor with the correct device and dtype. This should resolve the issue with running torch.jit.trace.
-
- 14 Mar, 2023 13 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
Revert "Enforce same behavior as PyTorch 2.0 for older versions (#22136)" This reverts commit 1c801d65.
-
Stas Bekman authored
* [trainer] add --optim adamw_torch_fused * change optim default * deal with non-torch * revert default change; prep; add fp16/amp assert * typo * typo
-
amyeroberts authored
* Don't rescale if int and in range 0-255 * Raise value error if int values too large * Update tests/test_image_transforms.py * Update tests/test_image_transforms.py
-
Alara Dirik authored
* create MaskedImageCompletionOutput * fix bugs * fix bugs
-
Sylvain Gugger authored
* Fix big model inference for T5 models in float16 * Apply suggestions from code review Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Style * Trigger CI with latest release --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Nicola Procopio authored
* added translated files added perf_train_cpu and perf_train_cpu_many * updated toctree
-
Yih-Dar authored
update values Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Alara Dirik authored
* Add ConvNeXt V2 to transformers * TF model is separated from the PR to fix issues
-
Yih-Dar authored
* Move `is_pipeline_test_to_skip` to specific model test classes --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Arthur authored
* temp fix * temporary fix * update * fix tests * fixup * update based on review Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * update to fix tests * update docstring --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
- 13 Mar, 2023 3 commits
-
-
MichaelRipa authored
* Updated glossary with new terms, added abbreviations for certain terms and merged autoencoding models, autoregressive models and causal language modeling into encoder and decoder models * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Added link to 'Pipeline for inference' tutorial * Trigger CI * Update docs/source/en/glossary.mdx Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> * Added entry for self supervised learning, added deleted entries + fixed broken links * Update docs/source/en/glossary.mdx Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by:
Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Yih-Dar authored
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Patrick von Platen authored
* [Safetensors] Add explicit flag to from pretrained * add test * remove @ * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-