- 29 Jun, 2021 2 commits
-
-
NielsRogge authored
* Rename target to labels in DetrFeatureExtractor * Update DetrFeatureExtractor tests accordingly * Improve docs of DetrFeatureExtractor * Improve docs * Make style
-
Stas Bekman authored
* [models] respect dtype of the model when instantiating it * cleanup * cleanup * rework to handle non-float dtype * fix * switch to fp32 tiny model * improve * use dtype.is_floating_point * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix the doc * recode to use explicit torch_dtype_auto_detect, torch_dtype args * docs and tweaks * docs and tweaks * docs and tweaks * merge 2 args, add docs * fix * fix * better doc * better doc Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
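The dtype-handling change above adds a `torch_dtype` argument to `from_pretrained`. A minimal sketch of how it can be used, assuming a standard checkpoint ("gpt2" here is only an example):

```python
import torch
from transformers import AutoModelForCausalLM

# Load the weights in half precision instead of the framework default (fp32)
model_fp16 = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype=torch.float16)

# Or let the library pick up the dtype the checkpoint was saved in
model_auto = AutoModelForCausalLM.from_pretrained("gpt2", torch_dtype="auto")

print(next(model_fp16.parameters()).dtype)  # torch.float16
```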
-
- 28 Jun, 2021 1 commit
-
-
Bhadresh Savani authored
* added context manager to datasets map * fixed style and spaces * fixed deprecation warning * changed desc
-
- 25 Jun, 2021 1 commit
-
-
Stas Bekman authored
-
- 24 Jun, 2021 1 commit
-
-
Lysandre Debut authored
* Fix torchscript tests * Better test * Remove bogus print
-
- 23 Jun, 2021 6 commits
-
-
Michael Benayoun authored
Co-authored-by: Michael Benayoun <michael@huggingface.co>
-
Lysandre authored
-
Sylvain Gugger authored
* Clean push to hub API * Create working dir if it does not exist * Different tweak * New API + all models + test Flax * Adds the Trainer clean up * Update src/transformers/file_utils.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Address review comments * (nit) output types * No need to set clone_from when folder exists * Update src/transformers/trainer.py Co-authored-by:
Julien Chaumond <julien@huggingface.co> * Add generated_from_trainer tag * Update to new version * Fixes Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Julien Chaumond <julien@huggingface.co> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr>
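A hedged sketch of the push-to-hub workflow this entry describes; the repo name is a placeholder and Hub authentication (e.g. `huggingface-cli login`) is assumed to be set up:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("bert-base-cased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

# ... fine-tune the model ...

# Upload the model and tokenizer files to a repository on the Hugging Face Hub
model.push_to_hub("my-finetuned-bert")
tokenizer.push_to_hub("my-finetuned-bert")
```

The entry also mentions Trainer integration; `trainer.push_to_hub()` appears to be the corresponding call on that side.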
-
Vasudev Gupta authored
* copy pytorch-t5 * init * boom boom * forward pass same * make generation work * add more tests * make test work * finish normal tests * make fix-copies * finish quality * correct slow example * correct slow test * version table * upload models * Update tests/test_modeling_flax_t5.py * correct incorrectly deleted line Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Patrick von Platen <patrick@huggingface.co>
-
Daniel Stancl authored
* Add output args to greedy search * Fix critical typo + make style quality * Handle generate_beam_search * Add dict_specific tests and fix the placement of encoder outputs * Add specific outputs * Update doc * Fix typo * Adjust handling encoder_outputs + Fix generating for T5 * Fix generate for RAG * Fix handling output_attentions when target_mapping is not None Take care of situations when target_mapping is provided as there is a 2-tuple of attentions Change from: if inputs["output_attentions"]: attentions = tuple(tf.transpose(t, perm=(2, 3, 0, 1)) for t in attentions) to: if inputs["output_attentions"]: if inputs["target_mapping"] is not None: # when target_mapping is provided, there is a 2-tuple of attentions attentions = tuple( tuple(tf.transpose(attn_stream, perm=(2, 3, 0, 1)) for attn_stream in t) for t in attentions ) else: attentions = tuple(tf.transpose(t, perm=(2, 3, 0, 1)) for t in attentions) * Rename kwargs to model_kwargs * make style quality * Move imports in test_modeling_tf_common.py Move ModelOutput-related imports in test_modeling_tf_common.py into the `is_tf_available():` statement. * Rewrite nested if-statements * Fix added tests
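A rough illustration of the richer `generate` outputs this entry adds on the TF side; the checkpoint and keyword arguments below are illustrative, not taken from the commit:

```python
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = TFAutoModelForSeq2SeqLM.from_pretrained("t5-small")

inputs = tokenizer("translate English to German: Hello", return_tensors="tf")
out = model.generate(
    inputs["input_ids"],
    return_dict_in_generate=True,  # return a ModelOutput instead of a bare tensor
    output_scores=True,
    output_attentions=True,
)
print(out.sequences.shape)  # generated token ids
print(len(out.scores))      # one score tensor per generated step
```
-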
Nicolas Patry authored
* Optimizing away the `fill-mask` pipeline. - Don't send anything to the tokenizer unless needed. Vocab check is much faster - Keep BC by sending data to the tokenizer when needed. Users handling warning messages will see performance benefits again - Make `targets` and `top_k` work together better: `top_k` cannot be higher than `len(targets)` but can still be smaller. - Actually simplify the `target_ids` in case of duplicates (it can happen because we're parsing raw strings) - Removed useless code to fail on empty strings. It works only if the empty string is in first position, moved to ignoring them instead. - Changed the related tests as only those tests would fail correctly (having an incorrect value in first position) * Make tests compatible with 2 different vocabs... (at the price of a warning). Co-authored-by: @EtaoinWu * ValueError working globally * Update src/transformers/pipelines/fill_mask.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * `tokenizer.vocab` -> `tokenizer.get_vocab()` for more compatibility + fallback. Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
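A hedged example of the `targets`/`top_k` interaction described above; the model and sentence are placeholders:

```python
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="distilroberta-base")

# Per the entry above, top_k is effectively capped at len(targets);
# duplicate target strings are collapsed with a warning.
predictions = fill_mask(
    "Paris is the <mask> of France.",
    targets=["capital", "heart"],
    top_k=2,
)
for p in predictions:
    print(p["token_str"], round(p["score"], 4))
```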
-
- 22 Jun, 2021 3 commits
-
-
Stas Bekman authored
* bug fixes and a rename * add extended DDP test
-
Stas Bekman authored
* [tests] multiple improvements * cleanup * style * todo to investigate * fix
-
Stas Bekman authored
* set log level from CLI * add log_level_replica + test + extended docs * cleanup * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * rename datasets objects to allow datasets module * improve the doc * style * doc improve Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
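A small sketch of the new logging knobs this entry adds; the values shown are illustrative, and the same options are exposed on the command line as `--log_level` and `--log_level_replica`:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="output",
    log_level="error",            # verbosity on the main process
    log_level_replica="warning",  # verbosity on replicas in distributed runs
)
```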
-
- 21 Jun, 2021 4 commits
-
-
Stas Bekman authored
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * fix flax save pretrained test
-
Suraj Patil authored
* boom boom * remove flax clip example * allow loading head model with base model weights * add test * fix imports * disable save, load test for clip * add test_save_load_to_base
-
Suraj Patil authored
* boom boom * remove flax clip example * fix from_save_pretrained
-
- 17 Jun, 2021 2 commits
-
-
Sylvain Gugger authored
* AutoTokenizer: infer the class from the tokenizer config if possible * Add tests * Update src/transformers/models/auto/tokenization_auto.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
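A local sketch of the behaviour this entry describes, assuming `save_pretrained` records the tokenizer class in `tokenizer_config.json` so that `AutoTokenizer` can resolve it without a model `config.json`:

```python
from transformers import AutoTokenizer, BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
tokenizer.save_pretrained("./my-tokenizer")  # writes tokenizer_config.json, no config.json

# AutoTokenizer infers the tokenizer class from the tokenizer config alone
reloaded = AutoTokenizer.from_pretrained("./my-tokenizer")
print(type(reloaded).__name__)  # BertTokenizerFast
```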
-
Lysandre Debut authored
-
- 16 Jun, 2021 2 commits
-
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * add hubert * add first test file * more docs * fix bugs * fix bug * finish * finish * finish docstring * fix * fix * finalize * add to ignored * finish * Apply suggestions from code review * correct naming * finish * fix auto config * finish * correct convert script * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Suraj Patil <surajp815@gmail.com> * apply suggestions lysandre & suraj Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Suraj Patil <surajp815@gmail.com>
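A rough usage sketch for the newly added Hubert model; the checkpoint name and the CTC head are assumptions not stated in the commit itself, and the silent audio is just a placeholder:

```python
import numpy as np
import torch
from transformers import HubertForCTC, Wav2Vec2Processor

processor = Wav2Vec2Processor.from_pretrained("facebook/hubert-large-ls960-ft")
model = HubertForCTC.from_pretrained("facebook/hubert-large-ls960-ft")

speech = np.zeros(16_000, dtype=np.float32)  # 1 second of (silent) 16 kHz audio
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids))
```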
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * push new logit processors * add processors * save first working version * save intermediate * finish * make style * make fix-copies * finish * Update tests/test_modeling_flax_bart.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Patrick von Platen <patrick@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Suraj Patil <surajp815@gmail.com>
-
- 15 Jun, 2021 2 commits
-
-
Stas Bekman authored
* ensure concurrent pytest workers use a unique port for torch.distributed.launch * reword
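The idea above, sketched as a tiny helper; the function name, environment-variable handling, and base port are illustrative rather than the exact helper added in the commit:

```python
import os

def get_unique_master_port(base_port: int = 29500) -> int:
    """Derive a per-worker port so concurrent pytest-xdist workers launching
    torch.distributed.launch don't race for the same --master_port."""
    worker_id = os.environ.get("PYTEST_XDIST_WORKER", "gw0")  # "gw0", "gw1", ...
    offset = int(worker_id.lstrip("gw") or 0)
    return base_port + offset
```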
-
Amog Kamsetty authored
* fix * fixes * add back to scheduled tests * formatting * Update integrations.py
-
- 14 Jun, 2021 8 commits
-
-
Stas Bekman authored
* consistent nn. and nn.functional: p3 templates * restore
-
Vasudev Gupta authored
* add flax bert * bert -> bigbird * original_full ported * add debugger * init block sparse * fix copies ; gelu_fast -> gelu_new * block sparse port * fix block sparse * block sparse working * all ckpts working * fix-copies * make quality * init tests * temporary fix for FlaxBigBirdForMultipleChoice * skip test_attention_outputs * fix * gelu_fast -> gelu_new ; fix multiple choice model * remove nsp * fix sequence classifier * fix * make quality * make fix-copies * finish * Delete debugger.ipynb * Update src/transformers/models/big_bird/modeling_flax_big_bird.py * make style * finish * bye bye jit flax tests Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * upload
-
Will Rice authored
* [WIP] Add TFWav2Vec2Model Work in progress for adding a tensorflow version of Wav2Vec2 * feedback changes * small fix * Test Feedback Round 1 * Add SpecAugment and CTC Loss * correct spec augment mask creation * docstring and correct copyright * correct bugs * remove bogus file * finish tests correction * del unnecessary layers * Update src/transformers/models/wav2vec2/modeling_tf_wav2vec2.py Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * make style * correct final bug * Feedback Changes Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
-
Stas Bekman authored
* implement AdafactorSchedule * typo * fix * Update src/transformers/optimization.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
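A usage sketch for the `AdafactorSchedule` proxy introduced above, following the documented relative-step Adafactor setup (the tiny model is a placeholder):

```python
import torch
from transformers.optimization import Adafactor, AdafactorSchedule

model = torch.nn.Linear(4, 2)  # placeholder model

# Adafactor computes its learning rate internally in relative-step mode ...
optimizer = Adafactor(
    model.parameters(),
    scale_parameter=True,
    relative_step=True,
    warmup_init=True,
    lr=None,
)
# ... so AdafactorSchedule acts as a proxy scheduler that lets the Trainer and
# logging query a learning-rate value without interfering with the optimizer.
lr_scheduler = AdafactorSchedule(optimizer)
```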
-
SaulLu authored
* feature for tokenizer without slow/legacy version * format * modify common test * add tests * add PreTrainedTokenizerFast to AutoTokenizer * format * change tokenizer common test in order to be able to run tests without a slow version * update tokenizer fast test in order to use `rust_tokenizer_class` attribute instead of `tokenizer_class` * add AutoTokenizer test * replace `if self.tokenizer_class is not None` with `if self.tokenizer_class is None` * remove obsolete change in comment * Update src/transformers/tokenization_utils_base.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * Update src/transformers/tokenization_utils_fast.py Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * change `get_main_tokenizer` into `get_tokenizers` * clarify `get_tokenizers` method * homogenize with `test_slow_tokenizer` and `test_rust_tokenizer` * add `test_rust_tokenizer = False` to tokenizers which don't define a fast version * `test_rust_tokenizer = False` for BertJapaneseTokenizer * `test_rust_tokenizer = False` for BertJapaneseCharacterTokenizationTest Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Daniel Stancl authored
* Start working on FlaxBart * Create modeling_flax_bart.py * Write FlaxBartAttention * Add FlaxBartEncoderLayer * Add FlaxBartDecoderLayer and some typing * Add helper function for FlaxBart * shift_tokens_right * _make_causal_mask * _expand_mask * Add PositionalEmbedding and fix init_std naming * Add FlaxBartPretrainedModel * Add FlaxBartEncoder * Add FlaxBartEncoder * Add FlaxBartEncoder among modules to be imported * YET WE CANNOT INITIALIZE THAT!! :( * Make BartEncoder working Change BartEncoder to instance of nn.Module so far * Add FlaxBartDecoder * Add FlaxBartModel * TODO to make model run -> Prepare model inputs * Resolve padding * Add FlaxBartModel * Add FlaxBartModel into importable modules * Remove FlaxBartEncoder and FlaxBartDecoder from importable modules * make style; not properly working * make style; make quality not pass due to some import I left * Remove TODO for padding_idx in nn.Embed so far * Add FlaxBartForConditionalGeneration * Incorporate Flax model output classes, i.e. return_dict * Add more models and incorporate use_cache arg * Add FlaxBartForSequenceClassification and FlaxBartForQuestionAnswering * Incorporate use_cache arg from PyTorch implementation * Add all necessary Flax output utils * Add FlaxBartForCausalLM; not working yet * Add minor improvements; still lacks some functionality * Update docs, src and tests * Add support of FlaxBart to docs/source * Fix some bugs in FlaxBart source code * Add some necessary tests for FlaxBart models - jit_compilation not passing * Fix tests and add test_head_masking * Fix tests for @jax.jit computation * Add test_head_masking * Migrate FlaxBart tests from jax.numpy to numpy * Remove FlaxBartForCausalLM * Clean repo * fix bart model weight structure * Fix FlaxBartForSequenceClassification Slicing is not possible to use below jit, therefore, selecting sentence representation from hidden_states must be changed. * Allow FlaxBartForSequenceClassification for testing pt_flax equivalence * Allow testing for FlaxBartForQA for pt_flax equivalence * Add a comment to FlaxBartForSequenceClassification + change noise from 1e-3 to 1e-6 * remove past_key_values * remove inputs_embeds and make input_ids required * add position ids * re-write attention layer * fix dataclass * fix pos embeds and attention output * fix pos embeds * expose encode method * expose decode method * move docstring to top * add cache for causal attn layer * remove head masking for now * s2s greedy search first pass * boom boom * fix typos * fix greedy generate for bart * use encoder, decoder layers instead of num_hidden_layers * handle encoder_outputs * cleanup * simplify decoding * more clean-up * typos * Change header + add {decoder_,}position_ids into 2 models * add BartConfig * fix existing tests * add encode, decode methods * Fix shift_tokens_right for JIT compilation + clarify one condition * fix decode * encoder => encode * simplify generate * add tests for encode and decode * style * add tests for cache * fix equivalence tests * sample generate now works with seq2seq * generation tests * initialize dense layers * docstring and cleanup * quality * remove get/set input_embeddings * address Patrick's suggestions * decode for every model, remove encoder_outputs from call * update tests accordingly * decode returns only decoder outputs and logits * fix arguments * doc encode, decode methods * correct base_model_prefix * fix test for seq classif model * fix docs Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Suraj Patil <surajp815@gmail.com>
-
Guido Novati authored
* Fix megatron_gpt2 attention block's causal mask. * compatibility with checkpoints created with recent versions of Megatron-LM * added integration test for the released Megatron-GPT2 model * code style changes * added option to megatron conversion script to read from config file Co-authored-by: Guido Novati <gnovati@nvidia.com>
-
- 11 Jun, 2021 1 commit
-
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * fix tests
-
- 10 Jun, 2021 3 commits
-
-
Jayendra authored
* adding vit for flax * added test for Flax-vit and some bug-fixes * overrode methods where variable changes were necessary for the flax_vit test * added FlaxViTForImageClassification for test * Update src/transformers/models/vit/modeling_flax_vit.py Co-authored-by:
Suraj Patil <surajp815@gmail.com> * made changes suggested in PR * Adding jax-vit models for autoimport * swapping num_channels and height,width dimension * fixing the docstring for torch-like inputs for VIT * add model to main init * add docs * doc, fix-copies * docstrings * small test fixes * fix docs * fix docstr * Apply suggestions from code review Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * style Co-authored-by:
jayendra <jayendra@infocusp.in> Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com>
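A rough usage sketch for the Flax ViT models added above; the checkpoint name and blank image are placeholders, and if only PyTorch weights are published for the checkpoint, `from_pt=True` would be needed in `from_pretrained`:

```python
from PIL import Image
from transformers import FlaxViTForImageClassification, ViTFeatureExtractor

feature_extractor = ViTFeatureExtractor.from_pretrained("google/vit-base-patch16-224")
model = FlaxViTForImageClassification.from_pretrained("google/vit-base-patch16-224")

image = Image.new("RGB", (224, 224))  # placeholder image
inputs = feature_extractor(images=image, return_tensors="np")

logits = model(**inputs).logits
predicted_class = int(logits.argmax(-1)[0])
print(model.config.id2label[predicted_class])
```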
-
Daniel Stancl authored
* Fix a condition in test_generate_with_head_masking * Fix usage of head_mask in bigbird_pegasus * Fix head masking for speech2text * Resolve copy mismatch + drop unwanted print statement * Fix the condition
-
Tobias Norlund authored
* Resize with kept aspect ratio * Fixed failed test * Overload center_crop and resize methods instead * resize should handle non-PIL images * update slow test * Tensor => tensor Co-authored-by: patil-suraj <surajp815@gmail.com>
-
- 09 Jun, 2021 4 commits
-
-
Patrick von Platen authored
* fix_torch_device_generate_test * remove @ * fix tests
-
Stas Bekman authored
-
Anton Lozhkov authored
* Working quantizer forward * Working quantizer forward * Clean up unused model parts, test reproducibility * Working quantizer forward * Clean up unused model parts, test reproducibility * Remove custom outputs from the shared ones * correct conversion * correct bug * add first pretrain script * save intermediate * static shapes * save intermediate * finish first pretrain script version * more refactor * remove wandb * refactor more * improve test * correct perplexity compute bug * finish model implementation * add to docs * finish docs * finish pretraining script * finish pretraining script * remove wandb * finish PR for merge * finish config * finish * make deepspeed work * Apply suggestions from code review Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * apply suggestions * fix flaky test Co-authored-by:
patrickvonplaten <patrick.v.platen@gmail.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Stas Bekman authored
* support more than 2 gpus * style
-