- 01 May, 2023 5 commits
-
-
Zachary Mueller authored
* Deprecate xpu_backend for ddp_backend * Typo * Only do a minor deprecation, no need for major Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
IMvision12 authored
fix
-
Ashwin Mathur authored
* added BioGptForSequenceClassification * added source of copied code * typo * Format code with black * Update comments for copied code * Remove code copy comment * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fix failing tests * Update code copied from comments * Fix code quality * Update src/transformers/models/biogpt/modeling_biogpt.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Fix lint error * Update src/transformers/models/biogpt/modeling_biogpt.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Rename model to biogpt for consistency * Add PipelineTesterMixin to test_modeling_biogpt.py * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Resolve merge conflict --------- Co-authored-by:
Guillem García Subies <37592763+GuillemGSubies@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Xin Wen authored
-
Stephen Kaplan authored
Fix minor grammar issue
-
- 29 Apr, 2023 1 commit
-
-
Joao Gante authored
-
- 28 Apr, 2023 5 commits
-
-
Yih-Dar authored
* fix --------- Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Younes Belkada authored
* remove labels masking * add fix on blip tf
-
s-JoL authored
* update Open-Llama model * update * update format * update doc * update * update stable embedding test * update test case * update format * update readme * fix typo * update name * remove tokenizer and update format * remove convert_open_llama_weights_to_hf * update warning and doc_string --------- Co-authored-by:songliang.bayesian <songliang.bayesian@bytedance.com>
-
Shivam Shrirao authored
Cuda rng_state_all is used when saving in distributed mode so same should also be used when loading (#23045) cuda rng state should be all for distributed bc all were saved
-
Maxime Méloux authored
* Add Trainer support for ReduceLROnPlateau Fixes #16503 * Remove training argument and add default instance --------- Co-authored-by:mmeloux <maxime.meloux@loria.fr>
-
- 27 Apr, 2023 6 commits
-
-
Bartosz Szmelczynski authored
* switch np.random.permutation to jax.random.permutation * remove comments * remove leftover comment * skip similarity tests * modify indices_prng_key usage, add deterministic behaviour * update style * remove unused import * remove copy statement since classes are not identical * remove numpy import * revert removing copied from statements * make style from copied * remove copied from statement * update copied from statement to include only np.ndarray * add deterministic args, unittestskip equivalence tests
-
peter-sk authored
* added GPTNeoForTokenClassification * add to top-level init * fixup * test * more fixup * add to gpt_neo.mdx * repo consistency * dummy copy * fix copies * optax >= 0.1.5 assumes jax.Array exists - which it doesn't for jax <= 0.3.6 * merge with main made this superfluous * added classifier_dropout * remove legacy code * removed fmt:on/off removed expected_outputs * doc style fix * classifier_dropout is always in config --------- Co-authored-by:Prof. Peter Schneider-Kamp <jps@ordbogen.com>
-
peter-sk authored
* initial commit * added GPTNeoXForTokenClassification * typo * doc fixed extra comma that turned into a tuple * unifying variable names fixing forward call * classifier_dropout is in config Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by:
Prof. Peter Schneider-Kamp <jps@ordbogen.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Arthur authored
* add fast not use warning * properly check sequence_length vs chunk_size * fixup
-
Younes Belkada authored
fix pix2struct doctest
-
fxmarty authored
* fix mess * better documentation * typo * fix doc * update * add test * fix test * more tests * Update src/transformers/modeling_utils.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * move to utils * Apply suggestions from code review Co-authored-by:
Michael Benayoun <mickbenayoun@gmail.com> * nit --------- Co-authored-by:
younesbelkada <younesbelkada@gmail.com> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Michael Benayoun <mickbenayoun@gmail.com>
-
- 26 Apr, 2023 9 commits
-
-
Sylvain Gugger authored
Use default ignore index in Luke
-
Zachary Mueller authored
* Bring back deepspeed integration * Branchname * Self-scheduled * newline * Use deepspeed env var * Remove comment * Del env var after partialstate
-
Sylvain Gugger authored
-
Arthur authored
* update template processing for llama fast to add eos * style * update * address training from new issue * fix * update * special tokens can be given even if not used
-
Younes Belkada authored
* add hack fx * continue hacking * final changes * Test * Add a keys method * Fix keys method * revert unneeded changes * small nit --------- Co-authored-by:Michael Benayoun <mickbenayoun@gmail.com>
-
Younes Belkada authored
* multiple fixes - add `add_special_tokens` to `True` by default - remove label smoothing and labels masking * fix test
-
Javier de la Rosa authored
* Add gradient checkpointing to Whisper Flax * self.gradient_checkpointing only needed in nn.Module, removing unnecessary comments
-
Yih-Dar authored
fix Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-
Ritik Nandwal authored
* Add initial changes for TF wav2vec2 for sequence classification * Add suggested changes * Add serving and serving output methods * Add serving_output implementation and fix layer_weights * Add fixes * Fixed test cases * Fixing test and adding suggested changes
-
- 25 Apr, 2023 4 commits
-
-
Younes Belkada authored
fix pipeline issue
-
Lingepumpe authored
* Avoid invalid escape sequences, use raw strings * Integrate PR feedback
-
AleksanderWWW authored
* [neptune] fix checkpoint bug with relative out_dir * update imports * reformat with black * check neptune without imports * fix typing-related issue * run black on code * use os.path.sep instead of raw \ * simplify imports and remove type annotation * make ruff happy * apply review suggestions * replace run with with_id kwarg to run * update imports to avoid deprecation warnings for the latest client --------- Co-authored-by:kshitij12345 <kshitijkalambarkar@gmail.com>
-
Younes Belkada authored
* add sam doc * fixes * multiple fixes
-
- 24 Apr, 2023 9 commits
-
-
Joao Gante authored
* temperature controls speed
-
amyeroberts authored
* Update feature selection * Check compatibility with datasets version * Checkout from datasets main
-
othertea authored
-
Nicolas Patry authored
* Fixed the revert by making sure that even the regexp can cover all duplicates. * Code simplification using hash. * Fixing the `ident`. * Fixing ignoring patterned duplicate names. * Using `accelerate@find_tied_parameters` for from_pretrained This is more correct there, since it handles meta device seamlessly and we don't need to handle "non-duplicate" tensors (slices of each other). * Protecting accelerate. * Update src/transformers/modeling_utils.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> --------- Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Lucain authored
* Test hf_hub 0.14.0rc1 * fix mocked tests * package version --------- Co-authored-by:
Sylvain Gugger <Sylvain.gugger@gmail.com> Co-authored-by:
testbot <lucainp@hf.co>
-
hanrui1sensetime authored
fix transformers keys
-
Connor Boyle authored
* Raise error if `stride` is too high * Clarify use of `stride`
-
fxmarty authored
Add an attribute to disable custom kernels in deformable detr in order to make the model ONNX exportable (#22918) * add disable kernel option * add comment * fix copies * add disable_custom_kernels to config * Update src/transformers/models/deta/modeling_deta.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/deta/modeling_deta.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/deta/modeling_deta.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * style * fix --------- Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Joao Gante authored
-
- 23 Apr, 2023 1 commit
-
-
NielsRogge authored
Adds FocalNet by Microsoft to transformers --------- Co-authored-by:
Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local> Co-authored-by:
alaradirik <alaradirik@gmail.com>
-