- 23 Jan, 2023 1 commit
Joao Gante authored
- 26 Oct, 2022 1 commit
Patrick von Platen authored
* add first generation tutorial
* [Flax] Add subfolder functionality
* [Flax] Add subfolder functionality
* up
* finish
* delete file and re-add test
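
For context, `subfolder` is the `from_pretrained` argument that points at weights stored in a subdirectory of a Hub repo; a minimal sketch, assuming a hypothetical repository whose Flax checkpoint lives under `flax/` (the repo id and layout are placeholders, not taken from this commit):

```python
from transformers import FlaxBertModel

# Hypothetical repo layout: the Flax weights live under a "flax/" subfolder.
# `subfolder` tells from_pretrained where to look inside the repository.
model = FlaxBertModel.from_pretrained("your-username/your-flax-model", subfolder="flax")
```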
- 09 Sep, 2022 1 commit
Sanchit Gandhi authored
* [JAX] Replace all jax.tree_* calls with jax.tree_util.tree_*
* fix double tree_util
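
For reference, newer JAX releases deprecate the top-level `jax.tree_*` aliases in favour of `jax.tree_util.tree_*`; a minimal sketch of the substitution this commit performs across the codebase:

```python
import jax
import jax.numpy as jnp

params = {"dense": {"kernel": jnp.ones((2, 2)), "bias": jnp.zeros(2)}}

# Before: jax.tree_map(lambda x: x * 2, params)   (deprecated alias)
# After:  the same call routed through jax.tree_util
doubled = jax.tree_util.tree_map(lambda x: x * 2, params)
```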
- 12 Aug, 2022 1 commit
Arthur authored
* initial commit
* add small test
* add cross pt tf flag to test
* fix quality
* style
* update test with new repo
* fix failing test
* update
* fix wrong param ordering
* style
* update based on review
* update related to recent new caching mechanism
* quality
* Update based on review
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
* quality and style
* Update src/transformers/modeling_flax_utils.py
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
- 01 Aug, 2022 1 commit
Sylvain Gugger authored
* Rewrite push_to_hub to use upload_files
* Adapt the doc a bit
* Address review comments and clean doc
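
The user-facing call is unchanged by this rewrite; only the upload path underneath moved to direct file uploads. A hedged sketch (the repo id is a placeholder and pushing requires a logged-in Hub account):

```python
from transformers import FlaxBertModel

model = FlaxBertModel.from_pretrained("bert-base-uncased")

# Uploads the saved files directly to the Hub instead of going through a local
# git clone; "your-username/flax-bert-demo" is a placeholder repo id.
model.push_to_hub("your-username/flax-bert-demo")
```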
- 01 Jul, 2022 1 commit
Sanchit Gandhi authored
* [Flax] Add remat (gradient checkpointing)
* fix variable naming in test
* flip: checkpoint using a method
* fix naming
* fix class naming
* apply PVP's suggestions from code review
* make fix-copies
* fix big-bird, electra, roberta
* cookie-cutter
* fix flax big-bird
* move test to common
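
For background, the remat referred to here is Flax's rematerialisation (gradient checkpointing) transform; a minimal sketch of the underlying `flax.linen.remat` primitive, not the exact wiring used inside the Transformers models:

```python
import jax
import jax.numpy as jnp
import flax.linen as nn

class MLP(nn.Module):
    @nn.compact
    def __call__(self, x):
        # nn.remat recomputes this layer's activations in the backward pass
        # instead of storing them, trading extra compute for lower memory.
        RematDense = nn.remat(nn.Dense)
        x = nn.relu(RematDense(128)(x))
        return nn.Dense(1)(x)

model = MLP()
params = model.init(jax.random.PRNGKey(0), jnp.ones((4, 16)))
```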
- 22 Jun, 2022 1 commit
Arthur authored
- 21 Jun, 2022 2 commits
Yih-Dar authored
* rename to check_pt_flax_outputs
* update check_pt_flax_outputs
* use 5e-5 for BigBird PT/Flax test
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
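
The renamed check compares PyTorch and Flax outputs element-wise under a small tolerance; a rough, self-contained sketch of that kind of comparison (the model choice and the 5e-5 tolerance are illustrative, this is not the test code itself):

```python
import numpy as np
import torch
from transformers import BertTokenizer, BertModel, FlaxBertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
inputs = tokenizer("PyTorch and Flax should agree.", return_tensors="np")

pt_model = BertModel.from_pretrained("bert-base-uncased")
# from_pt=True converts the PyTorch weights, so both models share parameters.
fx_model = FlaxBertModel.from_pretrained("bert-base-uncased", from_pt=True)

with torch.no_grad():
    pt_out = pt_model(**{k: torch.from_numpy(np.asarray(v)) for k, v in inputs.items()})
fx_out = fx_model(**inputs)

# Element-wise check of the final hidden states under a small tolerance,
# in the spirit of check_pt_flax_outputs (5e-5 is the value mentioned above).
np.testing.assert_allclose(
    np.asarray(fx_out.last_hidden_state), pt_out.last_hidden_state.numpy(), atol=5e-5
)
```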
Lysandre Debut authored
* Prepare CI for v0.8.0
* pin hfh (revert before merge)
* Revert "pin hfh (revert before merge)"
  This reverts commit a0103140e1c77b810ffcb735192968bc03be3e1f.
* Test rc3
* Test latest rc
* Unpin to the RC
Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
- 19 Apr, 2022 1 commit
Suraj Patil authored
* begin do_init
* add params_shape_tree
* raise error if params are accessed when do_init is False
* don't allow do_init=False when keys are missing
* make shape tree a property
* assign self._params at the end
* add test for do_init
* add do_init arg to all flax models
* fix param setting
* disable do_init for composite models
* update test
* add do_init in FlaxBigBirdForMultipleChoice
* better names and errors
* improve test
* style
* add a warning when do_init=False
* remove extra if
* set params after _required_params
* add test for from_pretrained
* do_init => _do_init
* change warning to info
* fix typo
* add params in init_weights
* add params to gpt neo init
* add params to init_weights
* update do_init test
* Trigger CI
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* update template
* trigger CI
* style
* style
* fix template
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
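
For context, `_do_init=False` lets a Flax model be loaded without eagerly materialising random weights: `from_pretrained` then returns the module wrapper and the parameter dict separately. A hedged sketch (the model name is just an example):

```python
from transformers import FlaxBertModel

# With _do_init=False no random parameters are allocated up front;
# from_pretrained returns the model and the loaded params as a pair.
model, params = FlaxBertModel.from_pretrained("bert-base-uncased", _do_init=False)

# The params then have to be passed explicitly at call time, e.g.:
# outputs = model(input_ids, params=params)
```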
- 29 Mar, 2022 1 commit
Yih-Dar authored
* fix - set output_attentions to True
* Update tests/test_modeling_flax_common.py
* update for has_attentions
* overwrite check_outputs in FlaxBigBirdModelTest
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
- 18 Mar, 2022 1 commit
Yih-Dar authored
* Make test_equivalence_pt_to_flax more aggressive
* Make test_equivalence_flax_to_pt more aggressive
* don't use to_tuple
* clean-up
* fix missing test cases + testing on GPU
* fix conversion
* fix `ValueError: assignment destination is read-only`
* Add type checking
* commit to revert later
* Fix
* fix
* fix device
* better naming
* clean-up
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
- 09 Feb, 2022 1 commit
Suraj Patil authored
* fix test_model_outputs_equivalence
* fix tuple outputs for blenderbot
- 20 Dec, 2021 1 commit
Sylvain Gugger authored
* Add a main_input_name attribute to all models
* Fix tests
* Wtf Vs Code?
* Update src/transformers/models/imagegpt/modeling_imagegpt.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Style
* Fix copies
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
- 17 Dec, 2021 1 commit
Daniel Stancl authored
* Implement head_mask for Flax BERT and other models copied from BERT
* Remove `from jax._src.nn.functions import sigmoid`
  Remove `from jax._src.nn.functions import sigmoid` unintentionally added by IDE
* Remove no more valid copy statement
* Apply patil-suraj's suggestions from code review
* Apply suggestions from the code review
* Update Flax template
* Fix a typo
* Also update template for CausalLM modules
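
After this change the Flax BERT-family models accept a `head_mask` at call time, mirroring the PyTorch API; a hedged sketch, assuming the usual (num_layers, num_heads) mask shape:

```python
import jax.numpy as jnp
from transformers import BertTokenizer, FlaxBertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = FlaxBertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Masking attention heads", return_tensors="np")

# 1.0 keeps a head, 0.0 silences it; here we disable head 0 of layer 0.
head_mask = jnp.ones((model.config.num_hidden_layers, model.config.num_attention_heads))
head_mask = head_mask.at[0, 0].set(0.0)

outputs = model(**inputs, head_mask=head_mask)
```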
- 11 Nov, 2021 2 commits
Suraj Patil authored
* fix loading flax bf16 weights in pt
* fix clip test
* fix t5 test
* add logging statement
* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* switch back to native any
* fix check for bf16 weights
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
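
For context, Flax checkpoints can store weights in bfloat16, which the Flax-to-PyTorch conversion has to detect and upcast; a rough sketch of that idea on a toy parameter tree (not the actual code in modeling_flax_pytorch_utils.py):

```python
import jax
import jax.numpy as jnp
from flax.traverse_util import flatten_dict

# Toy parameter tree standing in for a loaded Flax checkpoint with one bf16 leaf.
params = {"encoder": {"kernel": jnp.ones((2, 2), dtype=jnp.bfloat16), "bias": jnp.zeros(2)}}

# Detect whether any leaf is stored in bfloat16 ...
has_bf16 = any(p.dtype == jnp.bfloat16 for p in flatten_dict(params).values())

# ... and upcast those leaves to float32 before handing the tree to PyTorch,
# since bf16 arrays are awkward to convert through NumPy.
if has_bf16:
    params = jax.tree_util.tree_map(
        lambda p: p.astype(jnp.float32) if p.dtype == jnp.bfloat16 else p, params
    )
```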
Suraj Patil authored
* fix inits
* fix embed dtype
* fix embed dtype
* add test to check default dtype
* quality
* add type conversion methods for flax models
* more robust casting
* cast sinusoidal positions
* update pegasus
* update albert
* update test
* make sure dtype is passed to every module
* style
* fix electra dense
* fix t5
* quality
* add more tests
* better name
* use the dtype for lm head computation
* fix albert
* style
* fix albert embed dtype
* more tests
* fix vision enc-dec
* cleanup
* fix embed dtype pegasus
* fix default param test
* doc
* update template
* fix final_logits_bias dtype
* Apply suggestions from code review
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* fix doc
* fix doc
* add detailed docstring for dtype parameter
* remove unnecessary import
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
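
The type-conversion methods added here (`to_bf16`, `to_fp16`, `to_fp32`) cast the parameter tree rather than the module itself; a hedged sketch of the intended usage (model name illustrative):

```python
from transformers import FlaxBertModel

model = FlaxBertModel.from_pretrained("bert-base-uncased")

# Cast the parameters to bfloat16 for cheaper inference, then back to float32;
# each call returns a new parameter tree, the module object is unchanged.
params_bf16 = model.to_bf16(model.params)
params_fp32 = model.to_fp32(params_bf16)
```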
- 02 Nov, 2021 1 commit
Sylvain Gugger authored
* Update Transformers to huggingface_hub >= 0.1.0
* Forgot to save...
* Style
* Fix test
- 21 Oct, 2021 1 commit
Li-Huai (Allan) Lin authored
* Fix
* Style
* Name
* Fix tests
* Style
* Remove embed sizes checking
* Disable some tests
* Fix
* Apply suggestion
- 12 Aug, 2021 1 commit
Patrick von Platen authored
* up
* up
* up
- 05 Aug, 2021 1 commit
Patrick von Platen authored
* finish PR
* add tests
* correct tests
* finish
* correct other flax tests
* better naming
* correct naming
* finish
* apply Sylvain's suggestions
- 04 Aug, 2021 1 commit
Patrick von Platen authored
* [Flax] Align device name in docs
* make style
* fix import error
- 13 Jul, 2021 1 commit
Sylvain Gugger authored
* Add option to load a pretrained model with mismatched shapes
* Fail at loading when mismatched shapes in Flax
* Fix tests
* Update src/transformers/modeling_flax_utils.py
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
* Address review comments
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
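
The option referred to here surfaces as the `ignore_mismatched_sizes` flag on `from_pretrained`; a hedged sketch of reusing a classifier checkpoint trained with a different label count (the repo id is a placeholder):

```python
from transformers import FlaxBertForSequenceClassification

# "your-username/bert-classifier-flax" stands in for any checkpoint whose head was
# trained with a different number of labels. Without ignore_mismatched_sizes=True
# this raises at load time; with it, the mismatched head weights are discarded and
# re-initialised while all other weights are kept.
model = FlaxBertForSequenceClassification.from_pretrained(
    "your-username/bert-classifier-flax", num_labels=7, ignore_mismatched_sizes=True
)
```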
- 23 Jun, 2021 1 commit
Sylvain Gugger authored
* Clean push to hub API
* Create working dir if it does not exist
* Different tweak
* New API + all models + test Flax
* Adds the Trainer clean up
* Update src/transformers/file_utils.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments
* (nit) output types
* No need to set clone_from when folder exists
* Update src/transformers/trainer.py
Co-authored-by: Julien Chaumond <julien@huggingface.co>
* Add generated_from_trainer tag
* Update to new version
* Fixes
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
- 21 Jun, 2021 2 commits
Patrick von Platen authored
* fix_torch_device_generate_test
* remove @
* fix flax save pretrained test
Suraj Patil authored
* boom boom
* remove flax clip example
* allow loading head model with base model weights
* add test
* fix imports
* disable save, load test for clip
* add test_save_load_to_base
- 14 Jun, 2021 3 commits
Vasudev Gupta authored
* add flax bert
* bert -> bigbird
* original_full ported
* add debugger
* init block sparse
* fix copies ; gelu_fast -> gelu_new
* block sparse port
* fix block sparse
* block sparse working
* all ckpts working
* fix-copies
* make quality
* init tests
* temporary fix for FlaxBigBirdForMultipleChoice
* skip test_attention_outputs
* fix
* gelu_fast -> gelu_new ; fix multiple choice model
* remove nsp
* fix sequence classifier
* fix
* make quality
* make fix-copies
* finish
* Delete debugger.ipynb
* Update src/transformers/models/big_bird/modeling_flax_big_bird.py
* make style
* finish
* bye bye jit flax tests
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Patrick von Platen authored
* fix_torch_device_generate_test
* remove @
* upload
Daniel Stancl authored
* Start working on FlaxBart
* Create modeling_flax_bart.py
* Write FlaxBartAttention
* Add FlaxBartEncoderLayer
* Add FlaxBartDecoderLayer and some typing
* Add helper function for FlaxBart
* shift_tokens_right
* _make_causal_mask
* _expand_mask
* Add PositionalEmbedding and fix init_std naming
* Add FlaxBartPretrainedModel
* Add FlaxBartEncoder
* Add FlaxBartEncoder
* Add FlaxBartEncoder among modules to be imported
* YET WE CANNOT INITIALIZE THAT!! :(
* Make BartEncoder working
  Change BartEncoder to instance of nn.Module so far
* Add FlaxBartDecoder
* Add FlaxBartModel
* TODO to make model run -> Prepare model inputs
* Resolve padding
* Add FlaxBartModel
* Add FlaxBartModel into importable modules
* Remove FlaxBartEncoder and FlaxBartDecoder from importable modules
* make style; not properly working
* make style; make quality not pass due to some import I left
* Remove TODO for padding_idx in nn.Embed so far
* Add FlaxBartForConditionalGeneration
* Incorporate Flax model output classes, i.e. return_dict
* Add other models and incorporate use_cache arg
* Add FlaxBartForSequenceClassification and FlaxBartForQuestionAnswering
* Incorporate use_cache arg from PyTorch implementation
* Add all necessary Flax output utils
* Add FlaxBartForCausalLM; not working yet
* Add minor improvements; still lacks some functionality
* Update docs, src and tests
* Add support of FlaxBart to docs/source
* Fix some bugs in FlaxBart source code
* Add some necessary tests for FlaxBart models - jit_compilation not passing
* Fix tests and add test_head_masking
* Fix tests for @jax.jit computation
* Add test_head_masking
* Migrate FlaxBart tests from jax.numpy to numpy
* Remove FlaxBartForCausalLM
* Clean repo
* fix bart model weight structure
* Fix FlaxBartForSequenceClassification
  Slicing is not possible to use below jit, therefore, selecting sentence representation from hidden_states must be changed.
* Allow FlaxBartForSequenceClassification for testing pt_flax equivalence
* Allow testing for FlaxBartForQA for pt_flax equivalence
* Add a comment to FlaxBartForSequenceClassification + change noise from 1e-3 to 1e-6
* remove past_key_values
* remove inputs_embeds and make input_ids required
* add position ids
* re-write attention layer
* fix dataclass
* fix pos embeds and attention output
* fix pos embeds
* expose encode method
* expose decode method
* move docstring to top
* add cache for causal attn layer
* remove head masking for now
* s2s greedy search first pass
* boom boom
* fix typos
* fix greedy generate for bart
* use encoder, decoder layers instead of num_hidden_layers
* handle encoder_outputs
* cleanup
* simplify decoding
* more clean-up
* typos
* Change header + add {decoder_,}position_ids into 2 models
* add BartConfig
* fix existing tests
* add encode, decode methods
* Fix shift_tokens_right for JIT compilation + clarify one condition
* fix decode
* encoder => encode
* simplify generate
* add tests for encode and decode
* style
* add tests for cache
* fix equivalence tests
* sample generate now works with seq2seq
* generation tests
* initialize dense layers
* docstring and cleanup
* quality
* remove get/set input_embeddings
* address Patrick's suggestions
* decode for every model, remove encoder_outputs from call
* update tests accordingly
* decode returns only decoder outputs and logits
* fix arguments
* doc encode, decode methods
* correct base_model_prefix
* fix test for seq classif model
* fix docs
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
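
The `encode`/`decode` methods exposed above split the seq2seq forward pass so the encoder runs once and its outputs are reused while decoding; a hedged sketch (checkpoint name illustrative):

```python
import jax.numpy as jnp
from transformers import BartTokenizer, FlaxBartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = FlaxBartForConditionalGeneration.from_pretrained("facebook/bart-base")

inputs = tokenizer("Flax BART exposes encode and decode.", return_tensors="np")

# Run the encoder once ...
encoder_outputs = model.encode(input_ids=inputs["input_ids"])

# ... then call the decoder with its own inputs plus the cached encoder outputs.
decoder_input_ids = jnp.array([[model.config.decoder_start_token_id]])
decoder_outputs = model.decode(decoder_input_ids, encoder_outputs)
print(decoder_outputs.logits.shape)
```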
- 01 Jun, 2021 1 commit
Suraj Patil authored
* add flax CLIP
* default input_shape
* add tests
* fix test
* fix name
* fix docs
* fix shapes
* attend at least 1 token
* flax conv to torch conv
* return floats
* fix equivalence tests
* fix import
* return attention_weights and update tests
* fix docstrings
* address Patrick's comments
* input_shape arg
* add tests for get_image_features and get_text_features methods
* fix tests
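
`get_text_features` and `get_image_features`, tested here, return the projected embeddings without computing the full image-text similarity; a hedged sketch using the text side only (checkpoint name as in the CLIP docs):

```python
from transformers import CLIPTokenizer, FlaxCLIPModel

model = FlaxCLIPModel.from_pretrained("openai/clip-vit-base-patch32")
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-base-patch32")

inputs = tokenizer(["a photo of a cat", "a photo of a dog"], padding=True, return_tensors="np")

# Projected text embeddings, one vector per input string; get_image_features
# does the same for pixel_values produced by the CLIP image processor.
text_features = model.get_text_features(**inputs)
print(text_features.shape)
```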
- 28 May, 2021 1 commit
Jayendra authored
* Added logic to return attention from flax-bert model and added test cases to check that
* Added new line at the end of file to test_modeling_flax_common.py
* fixing code style
* Fixing Roberta and Electra models too from copying bert
* Added temporary hack to not run test_attention_outputs for FlaxGPT2
* Returning attention weights from GPT2 and changed the tests accordingly.
* last fixes
* bump flax dependency
Co-authored-by: jayendra <jayendra@infocusp.in>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
- 26 May, 2021 1 commit
Patrick von Platen authored
* fix_torch_device_generate_test
* remove @
* change dataclasses to flax ones
* fix typo
* fix jitted tests
* fix bert & electra
- 18 May, 2021 1 commit
Suraj Patil authored
* flax gpt2
* combine masks
* handle shared embeds
* add causal LM sample
* style
* add tests
* style
* fix imports, docs, quality
* don't use cache
* add cache
* add cache 1st version
* make use cache work
* start adding test for generation
* finish generation loop compilation
* rewrite test
* finish
* update
* update
* apply Sylvain's suggestions
* update
* refactor
* fix typo
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
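
The caching work listed above is what makes autoregressive generation with the Flax GPT-2 port practical; a hedged sketch of greedy generation (checkpoint and lengths are illustrative):

```python
from transformers import GPT2Tokenizer, FlaxGPT2LMHeadModel

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = FlaxGPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("Flax GPT-2 can generate", return_tensors="np")

# generate() reuses the key/value cache added in this commit, so each new token
# attends over cached states instead of re-running the whole prefix.
output_ids = model.generate(
    inputs["input_ids"], max_length=20, do_sample=False, pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(output_ids.sequences[0], skip_special_tokens=True))
```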
- 04 May, 2021 1 commit
Patrick von Platen authored
* add flax roberta
* make style
* correct initialization
* modify model to save weights
* fix copied from
* fix copied from
* correct some more code
* add more roberta models
* Apply suggestions from code review
* merge from master
* finish
* finish docs
Co-authored-by: Patrick von Platen <patrick@huggingface.co>
- 29 Apr, 2021 1 commit
Patrick von Platen authored
* add attentions & hidden states
* add model outputs + docs
* finish docs
* finish tests
* finish impl
* del @
* finish
* finish
* correct test
* apply Sylvain's suggestions
* Update src/transformers/models/bert/modeling_flax_bert.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* simplify more
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
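
After this change the Flax models return structured outputs that can carry per-layer attentions and hidden states; a hedged sketch (model name illustrative):

```python
from transformers import BertTokenizer, FlaxBertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = FlaxBertModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Structured Flax outputs", return_tensors="np")
outputs = model(**inputs, output_attentions=True, output_hidden_states=True)

# One attention tensor per layer, and one hidden-state tensor per layer plus the embeddings.
print(len(outputs.attentions), outputs.attentions[0].shape)
print(len(outputs.hidden_states), outputs.hidden_states[0].shape)
```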
- 23 Apr, 2021 1 commit
Patrick von Platen authored
* improve flax
* refactor
* typos
* Update src/transformers/modeling_flax_utils.py
* Apply suggestions from code review
* Update src/transformers/modeling_flax_utils.py
* fix typo
* improve error tolerance
* typo
* correct nasty saving bug
* fix from pretrained
* correct tree map
* add note
* correct weight tying
- 31 Mar, 2021 1 commit
Patrick von Platen authored
* add first code structures
* add all bert models
* add to init and docs
* correct docs
* make style
- 30 Mar, 2021 1 commit
Patrick von Platen authored
* save intermediate
* finish first version
* delete some more
* improve import
* fix roberta
* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* small corrections
* apply all comments
* fix deterministic
* make fix-copies
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
- 18 Mar, 2021 1 commit
Patrick von Platen authored
* Create modeling_flax_eletra with code copied from modeling_flax_bert
* Add ElectraForMaskedLM and ElectraForPretraining
* Add modeling test for Flax electra and fix naming and arg in Flax Electra model
* Add documentation
* Fix code style
* Create modeling_flax_eletra with code copied from modeling_flax_bert
* Add ElectraForMaskedLM and ElectraForPretraining
* Add modeling test for Flax electra and fix naming and arg in Flax Electra model
* Add documentation
* Fix code style
* Fix code quality
* Adjust tol in assert_almost_equal due to very small difference between model output, ranging 0.0010 - 0.0016
* Remove redundant ElectraPooler
* save intermediate
* adapt
* correct bert flax design
* adapt roberta as well
* finish roberta flax
* finish
* apply suggestions
* apply suggestions
Co-authored-by: Chris Nguyen <anhtu2687@gmail.com>
- 16 Mar, 2021 1 commit
Patrick von Platen authored
* make flax tests pytorch independent
* fix typo
* finish
* improve circle ci
* fix return tensors
* correct flax test
* re-add sentencepiece
* last tokenizer fixes
* finish maybe now