- 08 Feb, 2021 21 commits
-
-
Lysandre authored
-
Stas Bekman authored
* deepspeed bug fixes and tests * manual wrap?
-
Anthony MOI authored
-
noise-field authored
* Unify logging with f-strings
* Get limits from MLflow rather than hardcoding them
* Add a check for parameter length overflow; the constants are also marked as internal
* Don't stop the run in on_train_end: this caused bad behaviour when there is a separate validation step, as the validation then gets recorded as a separate run
* Fix style
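A minimal sketch of the parameter-length check described above, assuming MLflow exposes its limit as `mlflow.utils.validation.MAX_PARAM_VAL_LENGTH`; the helper name below is illustrative, not the callback's actual code:

```python
from mlflow.utils.validation import MAX_PARAM_VAL_LENGTH  # MLflow's own limit, not a hardcoded value


def drop_overflowing_params(params: dict) -> dict:
    """Keep only parameters whose string value fits within MLflow's length limit."""
    kept = {}
    for name, value in params.items():
        if len(str(value)) > MAX_PARAM_VAL_LENGTH:
            # Logging this value would make the MLflow client raise, so skip it with a warning.
            print(f"Skipping '{name}': value exceeds {MAX_PARAM_VAL_LENGTH} characters")
            continue
        kept[name] = value
    return kept
```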
-
Olivier authored
* replace -100 token ids with the tokenizer pad_id for compute_metrics * fixed typo for label_ids
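A minimal sketch of that replacement, assuming a seq2seq setup with a Hugging Face tokenizer; the tokenizer is passed explicitly for clarity and the placeholder metric is illustrative:

```python
import numpy as np


def compute_metrics(eval_pred, tokenizer):
    predictions, label_ids = eval_pred
    # Labels use -100 to mask padding positions in the loss; the tokenizer
    # cannot decode -100, so replace it with the real pad token id first.
    label_ids = np.where(label_ids != -100, label_ids, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(label_ids, skip_special_tokens=True)
    # Any text metric can be computed on the decoded strings; exact match is a placeholder.
    exact = sum(p.strip() == l.strip() for p, l in zip(decoded_preds, decoded_labels))
    return {"exact_match": exact / max(len(decoded_labels), 1)}
```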
-
Lysandre Debut authored
-
demSd authored
* Claiming this issue * Integration test for BertGeneration (Encoder and Decoder) * Fix code quality
-
Julien Plu authored
* Fix template * Fix template
-
Patrick von Platen authored
-
Julien Plu authored
* Refactor BERT
* Restore all the concerned models
* Remove print
* Update template
* Apply Sylvain's and Morgan's comments
* Fix cast
* Put the cast inside call
* Remove the conditional in the embeddings
* Fix Funnel
* Restore the previous dot product (attention_scores) computation
* Add ConvBERT and BART
* Make all the seq2seq models ONNX compliant
* Fix test
* Fix check copies
-
Julien Plu authored
* Temporarily disable tests that are too slow * Fix style * Fix template
-
Nicolas Patry authored
* Cleaning up `ConversationalPipeline` to support more than DialoGPT. Currently ConversationalPipeline is heavily biased towards DialoGPT, which is the default model for this pipeline. This PR proposes to move the DialoGPT-specific modifications back into tokenizer-specific behavior wherever possible, by creating a `_build_conversation_input_ids` function that takes a conversation as input and returns a list of ints corresponding to the tokens. The tokenizer is the natural place for this because each model probably has a different strategy to build input_ids from the full conversation, and it is the tokenizer's job to transform strings into tokens (and vice versa). If `_build_conversation_input_ids` is missing, the previous behavior is used, so nothing breaks so far (except for Blenderbot, where it is a fix). This PR also contains a fix for inputs that are too long: there used to be dead code trying to limit the size of the incoming input. The introduced fix is to truncate within `_build_conversation_input_ids` to `tokenizer.model_max_length`. This matches the intent of the removed dead code and is actually better, because it uses `model_max_length`, which is different from `max_length` (a default parameter of `generate`).
  - Removed the `history` logic from Conversation: it is no longer relevant now that tokenization logic has moved to the tokenizer, the tokenizer cannot save any cache, and the conversation cannot know what is relevant or not. It is also not usable from `blenderbot`, because the input_ids are not append-only (the EOS token is always at the end).
  - Added an `iter_texts` method on `Conversation`, because the code was littered with some form of this iteration over past inputs and generated responses.
* Removing torch mention in types.
* Adding type checking to `_build_conversation_input_ids`.
* Fixing import in strings.
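A hedged sketch of what such a tokenizer-level `_build_conversation_input_ids` can look like; the `(is_user, text)` pairs yielded by `iter_texts` and the exact truncation policy are assumptions for illustration, not the library's verbatim code:

```python
from typing import List


def _build_conversation_input_ids(self, conversation) -> List[int]:
    """Tokenizer-specific hook: turn a full Conversation into model input ids."""
    input_ids = []
    # iter_texts is assumed to yield (is_user, text) pairs over past user
    # inputs and generated responses, in chronological order.
    for _is_user, text in conversation.iter_texts():
        input_ids.extend(self.encode(text, add_special_tokens=False))
        input_ids.append(self.eos_token_id)
    # Truncate from the left so the most recent turns fit within model_max_length.
    if len(input_ids) > self.model_max_length:
        input_ids = input_ids[-self.model_max_length:]
    return input_ids
```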
-
Lysandre Debut authored
-
Patrick von Platen authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Lysandre Debut authored
* Correct cast to device * Comment back the slow test
-
sandip authored
-
Stas Bekman authored
-
Stas Bekman authored
-
- 05 Feb, 2021 7 commits
-
-
Stas Bekman authored
* make executable * make executable * same for the template * cleanup
-
Suraj Patil authored
* Add prepare_decoder_input_ids_from_labels in seq2seq models
* Support label smoothing and encoder/embedding freezing
* Fix freezing
* Use pad_token_id from config
* Remove embedding freezing and add a warning
* Prepare decoder_input_ids inside DataCollatorForSeq2Seq
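A rough sketch of how decoder inputs are typically derived from labels for these models; the helper name and the right-shift shown below are illustrative assumptions:

```python
import torch


def shift_labels_right(labels: torch.Tensor, pad_token_id: int, decoder_start_token_id: int) -> torch.Tensor:
    """Build decoder_input_ids by shifting the labels one position to the right."""
    decoder_input_ids = labels.new_zeros(labels.shape)
    decoder_input_ids[:, 1:] = labels[:, :-1].clone()
    decoder_input_ids[:, 0] = decoder_start_token_id
    # Labels use -100 for positions ignored by the loss; the model cannot take
    # -100 as an input id, so swap it for the pad token id.
    decoder_input_ids.masked_fill_(decoder_input_ids == -100, pad_token_id)
    return decoder_input_ids
```

Roughly, the data collator can then call the model's `prepare_decoder_input_ids_from_labels(labels=...)` when the model exposes it and attach the result as `decoder_input_ids`, so label smoothing works without the model building them from shifted labels internally.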
-
Patrick von Platen authored
* Bump minimum Jax requirement to 0.2.8 * Update table
-
Patrick von Platen authored
* add big bird * change teacher to mentor * add proposal template * adapt template * delete old template * correct some links * finish template * create big bird from template * add big bird * improve boxes * finish boxes * add pointers for BigBird * finish big bird * up * up * up * up * apply Lysandre's and Sylvain's suggestions * delete bogus file * correct markdown * try different style * try different style * finalize
-
Lysandre Debut authored
* Clarify that the QA pipeline output's start/end indices are character-based * Style
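A small usage sketch of the clarified behaviour, assuming the default question-answering pipeline; the example strings and offsets are illustrative:

```python
from transformers import pipeline

qa = pipeline("question-answering")
result = qa(
    question="Where does Wolfgang live?",
    context="My name is Wolfgang and I live in Berlin.",
)
# 'start' and 'end' are character offsets into the context string, not token indices.
print(result["answer"], result["start"], result["end"])
```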
-
Lysandre authored
-
Lysandre authored
-
- 04 Feb, 2021 12 commits
-
-
Sylvain Gugger authored
* Update doc for pre-release * Use stable as default * Use the right commit :facepalms:
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
-
Sylvain Gugger authored
* Authorize the latest version of tokenizers
* Update version table
* Fix conversion of spm tokenizers and fix some hub links
* Bump tokenizers version to 0.10.1rc1
* Add script to check tokenizers conversion with XNLI
* Add some more mask_token lstrip support
* Must modify mask_token in slow tokenizers too
* Keep using the old method for Pegasus
* Add missing import

Co-authored-by: Anthony MOI <m.anthony.moi@gmail.com>
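A hedged sketch of what `lstrip` support for the mask token typically means, using the `AddedToken` API from the tokenizers library; the checkpoint name is only an example:

```python
from tokenizers import AddedToken
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("xlm-roberta-base")
# Registering the mask token with lstrip=True lets "<mask>" absorb the space
# before it, so " <mask>" and "<mask>" tokenize consistently around the mask.
tokenizer.add_special_tokens(
    {"mask_token": AddedToken("<mask>", lstrip=True, rstrip=False)}
)
```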
-
Nicolas Patry authored
`encoder_no_repeat_ngram_size` from their config.
-
Stas Bekman authored
* trainer fixes * don't switch the model just for deepspeed and mp * correct the fix
-
Daniel Stancl authored
* Replace `attn_weights = attn_wegihts = tf.reshape(...)` with `attn_weights = tf.reshape(...)`, thus removing the unintentional "double" assignment.
-
Sylvain Gugger authored
-
Nicolas Patry authored
Adding a new `encoder_no_repeat_ngram_size` argument to `generate`. Blenderbot results seemed off compared to the original ParlAI script (`https://parl.ai/projects/recipes/`): notably, the model tends to repeat a lot of what was said during the conversation. The actual problem is that ParlAI's `no_repeat_ngram_size` applies to the `encoder_input_ids`, while HF's `no_repeat_ngram_size` applies to the previously generated ids (within the decoder). Blenderbot keeps the conversation history in the encoder part, which explains why HF's implementation had the repetitions. This fix focuses on Blenderbot (*not* the small variant) and adds tests for it because the two are quite different in configuration. This change includes:
  - Adding a new EncoderNoRepeatLogitProcessor.
  - Adding one new argument to `generate` (`encoder_no_repeat_ngram_size`).
  - Adding one new config parameter, `encoder_no_repeat_ngram_size`.
  - Adding two tests: a high-level one for the pipeline (with inputs that exhibited the repeat behavior) and a low-level one for EncoderNoRepeatLogitProcessor.
  - Factoring NoRepeatLogitProcessor so that the logic could be reused.
  Further work:
  - The Blenderbot conversational pipeline still does not behave correctly, as the way input is prepared within the pipeline is still incorrect (follow-up PR).
  - Blenderbot allows the bot to have personas by prepending "your persona: XXXX" to the input; this could be explored too in a follow-up PR. @patrickvonplaten @LysandreJik
* Update src/transformers/generation_logits_process.py
* Update src/transformers/generation_utils.py
* Update src/transformers/generation_utils.py
* Update src/transformers/configuration_utils.py
* Doc quality.
* Fixing test.
* Last fixes.
* Fixing to account for batch_size.
* Update src/transformers/configuration_utils.py
* Update src/transformers/generation_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
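A small usage sketch of the new argument, assuming a Blenderbot checkpoint; the checkpoint name and prompt are illustrative, not necessarily the ones used in the tests:

```python
from transformers import BlenderbotForConditionalGeneration, BlenderbotTokenizer

name = "facebook/blenderbot-400M-distill"  # example checkpoint
tokenizer = BlenderbotTokenizer.from_pretrained(name)
model = BlenderbotForConditionalGeneration.from_pretrained(name)

inputs = tokenizer("My friends are cool but they eat too many carbs.", return_tensors="pt")
# Ban any 3-gram already present in the *encoder* input (the conversation
# history) from being generated, instead of only banning repeats within the
# decoder's own output as no_repeat_ngram_size does.
reply_ids = model.generate(**inputs, encoder_no_repeat_ngram_size=3)
print(tokenizer.batch_decode(reply_ids, skip_special_tokens=True))
```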
-
Lysandre Debut authored
-
Daniel Hug authored
-