- 20 Mar, 2023 4 commits
-
-
heya5 authored
Update training_args.py
-
Nicola Procopio authored
* added translated files added perf_train_cpu and perf_train_cpu_many * updated toctree * updated toctree * added file perf_infer_cpu.medx * italian translation perf_infer_cpu.mdx
-
yesinkim authored
[Docs] fix typos Co-authored-by:yesinkim <yesinkim@yesinkimui-MacBookAir.local>
-
Pasquale Minervini authored
Update training_args.py A nightly install is not required anymore for `torch.compile`.
-
- 17 Mar, 2023 13 commits
-
-
Stas Bekman authored
[trainer] param count for zero3
-
Guangyuan Ma authored
push
-
Ali Hassani authored
* Add kernel size to NATTEN's QK arguments. The new NATTEN 0.14.5 supports PyTorch 2.0, but also adds an additional argument to the QK operation to allow optional RPBs. This ends up failing NATTEN tests. This commit adds NATTEN back to circleci and adds the arguments to get it working again. * Force NATTEN >= 0.14.5
-
Seb0 authored
fix(docs): task guide links in model docs
-
Maria Khalusova authored
removed .mdx extension
-
lewtun authored
* Add LlamaForSequenceClassification * Update src/transformers/models/llama/modeling_llama.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/models/llama/modeling_llama.py Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Add docstring * Add test * Add input embedding getter and setter * Remove dead code --------- Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
-
Wang, Yi authored
* fix AutoTP in deepspeed could not work for bloom Signed-off-by:
Wang, Yi A <yi.a.wang@intel.com> * add a method in BloomModel to build ailib Signed-off-by:
Wang, Yi A <yi.a.wang@intel.com> --------- Signed-off-by:
Wang, Yi A <yi.a.wang@intel.com>
-
Sylvain Gugger authored
* LLaMA house-keeping * Doc links
-
Maria Khalusova authored
* added doc to toc, auto tip with supported models, mention of task guide in model docs * make style * removed "see also" * minor fix
-
Yih-Dar authored
Use dash 2.8.1 for now Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
wangpeng authored
Co-authored-by:yue kun <yuekun.wp@alibaba-inc.com>
-
Kevin Turner authored
-
- 16 Mar, 2023 12 commits
-
-
Yih-Dar authored
fix Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* py38 + torch 2 * increment cache versions --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Susnato Dhar authored
* fixes a typo * .
-
Younes Belkada authored
* add `accelerate` support for XGLM * fix order
-
SatyaJandhyalaAtMS authored
* Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143 * Reduced column width * Fix formatting. * Revert "Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143" This reverts commit 6e95a108042118d204da447729f3834affa354fc. * Fix export error. * Revert "Fix formatting." This reverts commit 8310f60da10358edbdf77a2a2f3c83ee55066cb8. * Propagated changes made in SwinV2 to Swin2SR
-
Yih-Dar authored
* Update UNCONVERTIBLE_MODEL_ARCHITECTURES * Deal with 2 model tester classes in single test file * Deal with 2 model tester classes in single test file * Deal with 2 model tester classes in single test file * make style and quality --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Jason Phang authored
* LLaMA * sharding and docs * tweak * black * inits * ruff * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP * init * no checkpoint * docs * ruff * type_vocab_size * tokenizer fixes * tokenizer fixes * Update tokenization_llama.py * Update tokenization_llama.py * Update configuration_llama.py * Update modeling_llama.py * tokenizer add_bos by default * licenses * remove decoder * norms and mlp * rope overhaul * tweaks * black * mention OPT implementation * off-by-one naming * typo * fix * tokenization fix and slicing bug * padding config * cleanup * black * update tests * undo typo * fix vocab caching logic * ruff * docbuilder * attn fix from BlackSamorez * initial feedback * typo * docs * llama case * llama case * load checkpoint docs * comment about tokenizer * tokenizer defaults * clear past_key_values if use_cache=False * last tweaks * last tweaks * last tweaks * last tweaks --------- Co-authored-by:Stella Biderman <stellabiderman@gmail.com>
-
Jason Phang authored
* LLaMA * sharding and docs * tweak * black * inits * ruff * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP * init * no checkpoint * docs * ruff * type_vocab_size * tokenizer fixes * tokenizer fixes * Update tokenization_llama.py * Update tokenization_llama.py * Update configuration_llama.py * Update modeling_llama.py * tokenizer add_bos by default * licenses * remove decoder * norms and mlp * rope overhaul * tweaks * black * mention OPT implementation * off-by-one naming * typo * fix * tokenization fix and slicing bug * padding config * cleanup * black * update tests * undo typo * fix vocab caching logic * ruff * docbuilder * attn fix from BlackSamorez * initial feedback * typo * docs * llama case * llama case * load checkpoint docs * comment about tokenizer * tokenizer defaults * clear past_key_values if use_cache=False * last tweaks * last tweaks * last tweaks * last tweaks --------- Co-authored-by:Stella Biderman <stellabiderman@gmail.com>
-
Baelish03 authored
* Tranlstion Italian: migration * Update migration.mdx minor fixes * Update _toctree.yml * Delete migration.mdx * Add italian translation of migration.mdx * Update of migration.mdx translation and toctree
-
Yih-Dar authored
Update values Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Alara Dirik authored
Fix align docs typo
-
Yih-Dar authored
* Deal with torch-tensorrt --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
- 15 Mar, 2023 5 commits
-
-
Prathik Rao authored
* t5 remove data dependency * make style * make fix-copies --------- Co-authored-by:Prathik Rao <prathikrao@microsoft.com>
-
Anahita Bhiwandiwalla authored
* Use return_loss for BridgeTowerForContrastiveLearning, add example * fix tests * Update example in BridgeTowerForContrastiveLearning * Update test_modeling_bridgetower.py * update model output format * minor update * Update src/transformers/models/bridgetower/modeling_bridgetower.py * make style --------- Co-authored-by:
Tiep Le <97980157+tileintel@users.noreply.github.com> Co-authored-by:
Tiep Le <tiep.le@intel.com> Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com>
-
Sylvain Gugger authored
* Fix regression in pipeline when device=-1 is passed * Add regression test
-
amyeroberts authored
Revert changes
-
娴簛鐨勫皬铻冭煿 authored
Fix: unfinished_sequences with correct device The original code was causing errors when running torch.jit.trace due to the tensor options being incorrect. I fixed this by using torch.ones to create a tensor with the correct device and dtype. This should resolve the issue with running torch.jit.trace.
-
- 14 Mar, 2023 6 commits
-
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
Revert "Enforce same behavior as PyTorch 2.0 for older versions (#22136)" This reverts commit 1c801d65.
-
Stas Bekman authored
* [trainer] add --optim adamw_torch_fused * change optim default * deal with non-torch * revert default change; prep; add fp16/amp assert * typo * typo
-
amyeroberts authored
* Don't rescale if in and in range 0-255 * Raise value error if int values too large * Update tests/test_image_transforms.py * Update tests/test_image_transforms.py
-