"vscode:/vscode.git/clone" did not exist on "f8a922e96630a213f49eb75d39635d646981cc8a"
- 21 Mar, 2023 8 commits
-
-
Ali Hassani authored
-
Yanming W authored
-
Yih-Dar authored
* time to say goodbye, torch 1.7 and 1.8
* clean up torch_int_div
* clean up is_torch_less_than_1_8-9
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
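For context, a minimal sketch of the kind of cleanup this enables, assuming `torch_int_div` was a compatibility shim for integer floor division on old torch versions (the variable names are illustrative):

```python
import torch

a = torch.tensor([7, 9])
b = torch.tensor([2, 3])

# With torch >= 1.9 guaranteed, the old torch_int_div shim can be replaced
# by the built-in floor-division rounding mode.
result = torch.div(a, b, rounding_mode="floor")
```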
-
Davide Gazzè authored
Add translation
-
Yih-Dar authored
* fix more doctests
* fix style
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* all doctests
* Skip failed tests
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Gerald Cuder authored
* Make sure CvT can be trained using mixed precision
* Add test for keras-fit with mixed precision
* Update tests/models/cvt/test_modeling_tf_cvt.py
  Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
---------
Co-authored-by: gcuder <Gerald.Cuder@iacapps.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
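A minimal sketch of training CvT under Keras mixed precision, assuming the standard `tf.keras.mixed_precision` API (the checkpoint name and dataset are illustrative):

```python
import tensorflow as tf
from transformers import TFCvtForImageClassification

# Compute in float16 while keeping variables in float32.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

model = TFCvtForImageClassification.from_pretrained("microsoft/cvt-13")
model.compile(optimizer=tf.keras.optimizers.Adam(1e-5))
# model.fit(train_dataset, epochs=1)  # keras-fit now runs under mixed precision
```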
-
Andrei Panferov authored
* Fixed modules_to_not_convert default value
* Fixed modules_to_not_convert docstring
* Update src/transformers/utils/bitsandbytes.py
  Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/utils/bitsandbytes.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* ["lm_head"] if modules_to_not_convert is None
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
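The user-facing effect is that `lm_head` stays in full precision during 8-bit loading unless explicitly overridden. A hedged sketch using `BitsAndBytesConfig` (the model name is illustrative; `llm_int8_skip_modules` mirrors the internal `modules_to_not_convert` argument):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# With no explicit list, ["lm_head"] is skipped by default so the output
# projection keeps full precision; shown explicitly here for illustration.
quant_config = BitsAndBytesConfig(
    load_in_8bit=True,
    llm_int8_skip_modules=["lm_head"],
)
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m", quantization_config=quant_config
)
```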
-
- 20 Mar, 2023 12 commits
-
-
amyeroberts authored
* Add bool_masked_pos to forward docstrings
* Add note about mask ratio - videomae
* Fix up
* Fix indenting
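A hedged sketch of passing `bool_masked_pos` to VideoMAE, following the pattern the docstrings describe (shapes assume the default 16-frame, 224x224 configuration):

```python
import torch
from transformers import VideoMAEForPreTraining

model = VideoMAEForPreTraining.from_pretrained("MCG-NJU/videomae-base")
num_frames = 16
seq_len = (num_frames // model.config.tubelet_size) * (
    model.config.image_size // model.config.patch_size
) ** 2

pixel_values = torch.randn(1, num_frames, 3, 224, 224)
# VideoMAE uses a much higher mask ratio (~0.9) than image MAE.
bool_masked_pos = torch.rand(1, seq_len) < 0.9
outputs = model(pixel_values, bool_masked_pos=bool_masked_pos)
```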
-
Maria Khalusova authored
* added an example of pad_to_multiple_of
* make style
* addressed feedback
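A quick sketch of the documented option, assuming a standard tokenizer checkpoint:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Pad up to the next multiple of 8 so sequence lengths line up with
# tensor-core-friendly shapes on fp16/bf16 hardware.
batch = tokenizer(
    ["short", "a slightly longer example sentence"],
    padding=True,
    pad_to_multiple_of=8,
    return_tensors="pt",
)
print(batch["input_ids"].shape)  # sequence dimension is a multiple of 8
```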
-
Antoni Viros authored
Move torch.compile() wrapping after DDP/FSDP wrapping to ensure correct graph breaks during training (#22279)
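A minimal sketch of the resulting wrapping order (assumes a process group is already initialized; the model is illustrative):

```python
import torch
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# torch.distributed.init_process_group(...) is assumed to have run already.
model = nn.Linear(16, 16).cuda()
model = DDP(model)            # wrap for distributed training first...
model = torch.compile(model)  # ...then compile, so graph breaks fall around
                              # DDP's communication hooks rather than inside them
```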
-
amyeroberts authored
-
Sylvain Gugger authored
* Proper map location for optimizer load
* What happened to my code?
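A sketch of the fix's effect, assuming a plain PyTorch checkpoint (the path is illustrative):

```python
import torch

model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters())

# Map the saved optimizer state onto the device the model actually lives on,
# rather than the device the checkpoint was written from.
device = next(model.parameters()).device
state = torch.load("checkpoint/optimizer.pt", map_location=device)
optimizer.load_state_dict(state)
```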
-
Sylvain Gugger authored
* Update LLaMA conversion script
* Doc
* Fix the weight size for the 13B checkpoint
* Update src/transformers/models/llama/convert_llama_weights_to_hf.py
  Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
---------
Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
-
Sylvain Gugger authored
-
yqy2001 authored
fix grad ckpt bug of llama
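A hedged sketch of the pattern this fix concerns: gradient checkpointing conflicts with the generation cache, so `use_cache` should be disabled during training (the checkpoint path is illustrative):

```python
from transformers import LlamaForCausalLM

model = LlamaForCausalLM.from_pretrained("path/to/llama-hf")
model.gradient_checkpointing_enable()
model.config.use_cache = False  # caching conflicts with re-computed activations
```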
-
heya5 authored
Update training_args.py
-
Nicola Procopio authored
* added translated files perf_train_cpu and perf_train_cpu_many
* updated toctree
* updated toctree
* added file perf_infer_cpu.mdx
* Italian translation of perf_infer_cpu.mdx
-
yesinkim authored
[Docs] fix typos
Co-authored-by: yesinkim <yesinkim@yesinkimui-MacBookAir.local>
-
Pasquale Minervini authored
Update training_args.py
A nightly install is no longer required for `torch.compile`.
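On stable PyTorch 2.0 this now works out of the box, e.g.:

```python
import torch

model = torch.nn.Linear(8, 8)
compiled = torch.compile(model)  # no nightly build required on PyTorch >= 2.0
```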
-
- 17 Mar, 2023 13 commits
-
-
Stas Bekman authored
[trainer] param count for zero3
-
Guangyuan Ma authored
push
-
Ali Hassani authored
* Add kernel size to NATTEN's QK arguments.
  The new NATTEN 0.14.5 supports PyTorch 2.0, but also adds an additional argument to the QK operation to allow optional RPBs. This ends up failing NATTEN tests. This commit adds NATTEN back to circleci and adds the arguments to get it working again.
* Force NATTEN >= 0.14.5
-
Seb0 authored
fix(docs): task guide links in model docs
-
Maria Khalusova authored
removed .mdx extension
-
lewtun authored
* Add LlamaForSequenceClassification
* Update src/transformers/models/llama/modeling_llama.py
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Update src/transformers/models/llama/modeling_llama.py
  Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* Add docstring
* Add test
* Add input embedding getter and setter
* Remove dead code
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
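A brief usage sketch for the new head (the checkpoint path is illustrative, since LLaMA weights must be converted locally):

```python
from transformers import LlamaForSequenceClassification, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("path/to/llama-hf")
model = LlamaForSequenceClassification.from_pretrained(
    "path/to/llama-hf", num_labels=2
)

inputs = tokenizer("This movie was great!", return_tensors="pt")
logits = model(**inputs).logits  # shape: (1, num_labels)
```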
-
Wang, Yi authored
* fix AutoTP in DeepSpeed not working for BLOOM
  Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* add a method in BloomModel to build alibi
  Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
---------
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
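A hedged sketch of DeepSpeed AutoTP inference with BLOOM (the arguments follow the classic `init_inference` API; run under a multi-GPU launcher such as `deepspeed`):

```python
import torch
import deepspeed
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-560m", torch_dtype=torch.half
)
# AutoTP: let DeepSpeed shard the model across GPUs without kernel injection.
model = deepspeed.init_inference(
    model, mp_size=2, dtype=torch.half, replace_with_kernel_inject=False
)
```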
-
Sylvain Gugger authored
* LLaMA house-keeping
* Doc links
-
Maria Khalusova authored
* added doc to toc, auto tip with supported models, mention of task guide in model docs
* make style
* removed "see also"
* minor fix
-
Yih-Dar authored
Use dash 2.8.1 for now
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
wangpeng authored
Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>
-
Kevin Turner authored
-
- 16 Mar, 2023 7 commits
-
-
Yih-Dar authored
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Yih-Dar authored
* py38 + torch 2
* increment cache versions
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Susnato Dhar authored
* fixes a typo
* .
-
Younes Belkada authored
* add `accelerate` support for XGLM
* fix order
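With `accelerate` support, XGLM checkpoints can be dispatched across available devices, e.g.:

```python
from transformers import AutoModelForCausalLM

# device_map="auto" relies on accelerate to place layers across GPUs/CPU.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/xglm-564M", device_map="auto"
)
```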
-
SatyaJandhyalaAtMS authored
* Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143
* Reduced column width
* Fix formatting.
* Revert "Temporarily fix https://github.com/microsoft/onnx-converters-private/issues/143"
  This reverts commit 6e95a108042118d204da447729f3834affa354fc.
* Fix export error.
* Revert "Fix formatting."
  This reverts commit 8310f60da10358edbdf77a2a2f3c83ee55066cb8.
* Propagated changes made in SwinV2 to Swin2SR
-
Yih-Dar authored
* Update UNCONVERTIBLE_MODEL_ARCHITECTURES
* Deal with 2 model tester classes in single test file
* Deal with 2 model tester classes in single test file
* Deal with 2 model tester classes in single test file
* make style and quality
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
Jason Phang authored
* LLaMA
* sharding and docs
* tweak
* black
* inits
* ruff
* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP
* init
* no checkpoint
* docs
* ruff
* type_vocab_size
* tokenizer fixes
* tokenizer fixes
* Update tokenization_llama.py
* Update tokenization_llama.py
* Update configuration_llama.py
* Update modeling_llama.py
* tokenizer add_bos by default
* licenses
* remove decoder
* norms and mlp
* rope overhaul
* tweaks
* black
* mention OPT implementation
* off-by-one naming
* typo
* fix
* tokenization fix and slicing bug
* padding config
* cleanup
* black
* update tests
* undo typo
* fix vocab caching logic
* ruff
* docbuilder
* attn fix from BlackSamorez
* initial feedback
* typo
* docs
* llama case
* llama case
* load checkpoint docs
* comment about tokenizer
* tokenizer defaults
* clear past_key_values if use_cache=False
* last tweaks
* last tweaks
* last tweaks
* last tweaks
---------
Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
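A short usage sketch for the newly added model classes (the paths are illustrative: the weights are not hosted and must first be converted with the provided conversion script):

```python
from transformers import LlamaForCausalLM, LlamaTokenizer

tokenizer = LlamaTokenizer.from_pretrained("path/to/llama-7b-hf")
model = LlamaForCausalLM.from_pretrained("path/to/llama-7b-hf")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```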
-