- 31 Mar, 2021 3 commits
-
-
Patrick von Platen authored
-
Patrick von Platen authored
-
Patrick von Platen authored
* add first code structures * add all bert models * add to init and docs * correct docs * make style
-
- 30 Mar, 2021 4 commits
-
-
Philipp Schmid authored
* improved branch usage * fixed grammar and comma
-
Suraj Patil authored
* fix checkpoint names * auto model * fix doc
-
Suraj Patil authored
* lets begin * boom boom * fix out proj in attn * fix attention * fix local attention * add tokenizer * fix imports * autotokenizer * fix checkpoint name * cleanup * more clean-up * more cleanup * output attentions * fix attn mask creation * fix imports * config doc * add tests * add slow tests * quality * add conversion script * copyright * typo * another bites the dust * fix attention tests * doc * add embed init in convert function * fix copies * remove tokenizer * enable caching * address review comments * improve config and create attn layer list internally * more consistent naming * init hf config from mesh-tf config json file * remove neo tokenizer from doc * handle attention_mask in local attn layer * attn_layers => attention_layers * add tokenizer_class in config * fix docstring * raise if len of attention_layers is not same as num_layers * remove tokenizer_class from config * more consistent naming * fix doc * fix checkpoint names * fp16 compat * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Vasudev Gupta authored
* init bigbird * model.__init__ working, conversion script ready, config updated * add conversion script * BigBirdEmbeddings working :) * slightly update conversion script * BigBirdAttention working :) ; some bug in layer.output.dense * add debugger-notebook * forward() working for BigBirdModel :) ; replaced gelu with gelu_fast * tf code adapted to torch till rand_attn in bigbird_block_sparse_attention ; till now everything working :) * BigBirdModel working in block-sparse attention mode :) * add BigBirdForPreTraining * small fix * add tokenizer for BigBirdModel * fix config & hence modeling * fix base prefix * init testing * init tokenizer test * pos_embed must be absolute, attn_type=original_full when add_cross_attn=True , nsp loss is optional in BigBirdForPreTraining, add assert statements * remove position_embedding_type arg * complete normal tests * add comments to block sparse attention * add attn_probs for sliding & global tokens * create fn for block sparse attn mask creation * add special tests * restore pos embed arg * minor fix * attn probs update * make big bird fully gpu friendly * fix tests * remove pruning * correct tokenzier & minor fixes * update conversion script , remove norm_type * tokenizer-inference test add * remove extra comments * add docs * save intermediate * finish trivia_qa conversion * small update to forward * correct qa and layer * better error message * BigBird QA ready * fix rebased * add triva-qa debugger notebook * qa setup * fixed till embeddings * some issue in q/k/v_layer * fix bug in conversion-script * fixed till self-attn * qa fixed except layer norm * add qa end2end test * fix gradient ckpting ; other qa test * speed-up big bird a bit * hub_id=google * clean up * make quality * speed up einsum with bmm * finish perf improvements for big bird * remove wav2vec2 tok * fix tokenizer * include docs * correct docs * add helper to auto pad block size * make style * remove fast tokenizer for now * fix some * add pad test * finish * fix some bugs * fix another bug * fix buffer tokens * fix comment and merge from master * add comments * make style * commit some suggestions Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix typos * fix some more suggestions * add another patch Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix copies * another path Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * update * update nit suggestions * make style Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 29 Mar, 2021 1 commit
-
-
Sylvain Gugger authored
* Instantiate model only once in pipeline * Remove documentation of deprecated method * Add FutureWarning * Update src/transformers/pipelines/base.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
- 26 Mar, 2021 2 commits
-
-
Sylvain Gugger authored
* Add ImageFeatureExtractionMixin * Add dummy vision objects * Add require_vision * Add tests * Fix test
-
Tomy Hsieh authored
* Rename NLP library to Datasets library * Update github template * Fix styling
-
- 25 Mar, 2021 2 commits
-
-
Amir Tahmasbi authored
* Added embeddings layer * Added layoutlm layers, main model, maskedlm and token classification classes * Added model classes to tf auto models * Added model to PT to TF conversion script * Added model to doc README * Added tests * Removed unused imports * Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py * Made tests pass! * Fixed typos in imports and docs * Fixed a typo in embeddings layer * Removed imports * Fixed formatting issues, imports, tests * Added layoutlm layers, main model, maskedlm and token classification classes * Added model classes to tf auto models * Added model to PT to TF conversion script * Removed unused imports * Added layoutlm model, test, and doc for sequence classification, and fix imports in __init__.py * Made tests pass! * Fixed typos in imports and docs * Removed imports * Fixed small formatting issues * Removed duplicates import from main __init__.py * Chnaged deafult arg to true for adding pooling layer to tf layoutlm * Fixed formatting issues * Style * Added copied from to classes copied from bert * Fixed doc strings examples to work with layoutlm inputs * Removed PyTorch reference in doc strings example * Added integration tests * Cleaned up initialization file * Updated model checkpoint identifiers * Fixed imports Co-authored-by:
Amir Tahmasbi <amir@ehsai.ca> Co-authored-by:
Lysandre <lysandre.debut@reseau.eseo.fr>
-
Philipp Schmid authored
-
- 24 Mar, 2021 1 commit
-
-
Eliza Szczechla authored
Co-authored-by:Eliza <eliza@habanero.tiger.com.pl>
-
- 23 Mar, 2021 1 commit
-
-
Philipp Schmid authored
* added finished documentation * changed version from 1.6 to 1.6.0 for distributed * updated versions * updated urls
-
- 22 Mar, 2021 1 commit
-
-
Patrick von Platen authored
* push * finish * finish * make fix copies * change name
-
- 21 Mar, 2021 1 commit
-
-
Eric Lam authored
* Add new community notebook - wav2vec2 with GPT * Update:community.md, new nb add * feat: notebook of wav2vec xlsr ctc decoding with gpt logit adjustment * Update: Wav2vec2 CTC decoding with gpt2 adjustment * Update docs/source/community.md Co-authored-by:Suraj Patil <surajp815@gmail.com>
-
- 18 Mar, 2021 1 commit
-
-
Sylvain Gugger authored
-
- 17 Mar, 2021 1 commit
-
-
Stas Bekman authored
* [doc] [testing] extend -k section This PR adds more examples on using `pytest -k` - I always forget that I want to use `-k A OR B` when I want several tests - I keep trying AND and it doesn't match any. * style
-
- 16 Mar, 2021 6 commits
-
-
Cheng Li authored
* pass hf optimizer and scheduler to deepspeed if not specified in ds config * pass hf optimizer and scheduler to deepspeed if not specified in ds config * update * make init_deepspeed support config dict * fix docstring formatting * clean up trainer's comments * add new tests * fix type * composit argparse doesn't work * style * add a new test, rename others * document new functionality * complete tests, add docs * style * correct level * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add new methods to the doc * must tell DS we are using a non-native optimizer * add protection against cpu_offload + HF optimizer combo * fix the cli overrides * sync docs + tests * restore AdamW * better docs * need new version * no longer needed * remove outdate information * refactor duplicated code Co-authored-by:
Stas Bekman <stas@stason.org> Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Lysandre authored
-
Lysandre authored
-
Lysandre authored
-
Suraj Patil authored
-
Lysandre Debut authored
-
- 15 Mar, 2021 1 commit
-
-
Th茅o Matussi猫re authored
* split seq2seq script, update docs * needless diff * fix readme * remove test diff * s/summarization/translation Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * cr * fix arguments & better mbart/t5 refs * copyright Co-authored-by:
Suraj Patil <surajp815@gmail.com> * reword readme Co-authored-by:
Suraj Patil <surajp815@gmail.com> * s/summarization/translation * short script names * fix tests * fix isort, include mbart doc * delete old script, update tests * automate source prefix * automate source prefix for translation * s/translation/trans Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * fix script name (short version) * typos Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * exact parameter Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com> * remove superfluous source_prefix calls in docs * rename scripts & warn for source prefix * black * flake8 Co-authored-by:
theo <theo@matussie.re> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Suraj Patil <surajp815@gmail.com> Co-authored-by:
Stas Bekman <stas00@users.noreply.github.com>
-
- 12 Mar, 2021 3 commits
-
-
Stas Bekman authored
-
Sylvain Gugger authored
* Add auto_wrap option in fairscale integration * Style
-
Nicolas Patry authored
* [WIP] Adding new parameter to `generate`: `max_time`. Generation by tokens number is sometimes a bit clunky because we don't know how many tokens are good enough or even how many tokens are in the payload (for pipelines users for instance). This leads to hard to understand behavior. This PR proposes a new argument `max_time` which is a float of seconds for the allowed time for `generate` to run on. Ideally combinations of `max_tokens=None`, `max_time=2` could be used to generate as many tokens as possible within time budget. NB: Another possible approach consists of passing a callback to `generate` putting the caller in charge of the actual decision of when to stop generating tokens. It opens the door to 'which args should we pass' to this callback. It's hard to imagine other use-cases for this early stopping behavior than time (that are not already covered by parameters of generate) * Revamp with StoppingCriteria * Removing deprecated mentions. * Forgot arguments to stopping criteria. * Readding max_length it's not just used as a stopping criteria. * Default value for `stopping_criteria`. * Address @patrickvonplaten comments. - More docstrings - Actual doc - Include in global namespace - Remove TF work. * Put back `max_length` (deprecation different PR). * Doc quality. * Fixing old behavior without `stopping_criteria` but with `max_length`. Making sure we don't break that in the future. * Adding more tests for possible inconsistencies between `max_length` and `stopping_criteria`. * Fixing the torch imports.
-
- 11 Mar, 2021 3 commits
-
-
WybeKoper authored
* Fixed broken link * fixed max length violation Co-authored-by:WybeKoper <WybeKoper@users.noreply.github.com>
-
Suraj Patil authored
-
Patrick von Platen authored
* add conversion script * add wav2vec2 xslr models * finish * Update docs/source/model_doc/xlsr_wav2vec2.rst Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 10 Mar, 2021 2 commits
-
-
Sylvain Gugger authored
-
Suraj Patil authored
* s2t * fix config * conversion script * fix import * add tokenizer * fix tok init * fix tokenizer * first version working * fix embeds * fix lm head * remove extra heads * fix convert script * handle encoder attn mask * style * better enc attn mask * override _prepare_attention_mask_for_generation * handle attn_maks in encoder and decoder * input_ids => input_features * enable use_cache * remove old code * expand embeddings if needed * remove logits bias * masked_lm_loss => loss * hack tokenizer to support feature processing * fix model_input_names * style * fix error message * doc * remove inputs_embeds * remove input_embeds * remove unnecessary docstring * quality * SpeechToText => Speech2Text * style * remove shared_embeds * subsample => conv * remove Speech2TextTransformerDecoderWrapper * update output_lengths formula * fix table * remove max_position_embeddings * update conversion scripts * add possibility to do upper case for now * add FeatureExtractor and Processor * add tests for extractor * require_torch_audio => require_torchaudio * add processor test * update import * remove classification head * attention mask is now 1D * update docstrings * attention mask should be of type long * handle attention mask from generate * alwyas return attention_mask * fix test * style * doc * Speech2TextTransformer => Speech2Text * Speech2TextTransformerConfig => Speech2TextConfig * remove dummy_inputs * nit * style * multilinguial tok * fix tokenizer * add tgt_lang setter * save lang_codes * fix tokenizer * add forced_bos_token_id to tokenizer * apply review suggestions * add torchaudio to extra deps * add speech deps to CI * fix dep * add libsndfile to ci * libsndfile1 * add speech to extras all * libsndfile1 -> libsndfile1 * libsndfile * libsndfile1-dev * apt update * add sudo to install * update deps table * install libsndfile1-dev on CI * tuple to list * init conv layer * add model tests * quality * add integration tests * skip_special_tokens * add speech_to_text_transformer in toctree * fix tokenizer * fix fp16 tests * add tokenizer tests * fix copyright * input_values => input_features * doc * add model in readme * doc * change checkpoint names * fix copyright * fix code example * add max_model_input_sizes in tokenizer * fix integration tests * add do_lower_case to tokenizer * remove clamp trick * fix "Add modeling imports here" * fix copyrights * fix tests * SpeechToTextTransformer => SpeechToText * fix naming * fix table formatting * fix typo * style * fix typos * remove speech dep from extras[testing] * fix copies * rename doc file, * put imports under is_torch_available * run feat extract tests when torch is available * dummy objects for processor and extractor * fix imports in tests * fix import in modeling test * fxi imports * fix torch import * fix imports again * fix positional embeddings * fix typo in import * adapt new extractor refactor * style * fix torchscript test * doc * doc * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * fix docs, copied from, style * fix docstring * handle imports * remove speech from all extra deps * remove s2t from seq2seq lm mapping * better names * skip training tests * add install instructions * List => Tuple * doc * fix conversion script * fix urls * add instruction for libsndfile * fix fp16 test Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 09 Mar, 2021 2 commits
-
-
Patrick von Platen authored
* save first version * finish refactor * finish refactor * correct naming * correct naming * shorter names * Update src/transformers/feature_extraction_common_utils.py Co-authored-by:
Lysandre Debut <lysandre@huggingface.co> * change name * finish Co-authored-by:
Lysandre Debut <lysandre@huggingface.co>
-
Stas Bekman authored
* How to solve: Title level inconsistent * list chars
-
- 08 Mar, 2021 2 commits
-
-
Ratthachat (Jung) authored
* Create modeling_tf_dpr.py * Add TFDPR * Add back TFPegasus, TFMarian, TFMBart, TFBlenderBot last commit accidentally deleted these 4 lines, so I recover them back * Add TFDPR * Add TFDPR * clean up some comments, add TF input-style doc string * Add TFDPR * Make return_dict=False as default * Fix return_dict bug (in .from_pretrained) * Add get_input_embeddings() * Create test_modeling_tf_dpr.py The current version is already passed all 27 tests! Please see the test run at : https://colab.research.google.com/drive/1czS_m9zy5k-iSJbzA_DP1k1xAAC_sdkf?usp=sharing * fix quality * delete init weights * run fix copies * fix repo consis * del config_class, load_tf_weights They shoud be 'pytorch only' * add config_class back after removing it, test failed ... so totally only removing "use_tf_weights = None" on Lysandre suggestion * newline after .. note:: * import tf, np (Necessary for ModelIntegrationTest) * slow_test from_pretrained with from_pt=True At the moment we don't have TF weights (since we don't have official official TF model) Previously, I did not run slow test, so I missed this bug * Add simple TFDPRModelIntegrationTest Note that this is just a test that TF and Pytorch gives approx. the same output. However, I could not test with the official DPR repo's output yet * upload correct tf model * remove position_ids as missing keys * create modeling_tf_rag * add tests for tf * add tf tests * revert wrong pt commit * further refactor * further refactor * refactor * Update modeling_tf_rag.py - input_processing - fix prepare_input_for_generation (mostly fix generate bug) - bring back from_pretrained hack in order to test generate * delete colab pieces of code * Show case of greedy "generate" Temporarily change from beam_search test to greedy_search test to show case that TF and PT do get equivalent output. * cosmetic update * correct typos * update * push some progress * make easy check * fix rag save from pretrained * Update src/transformers/modeling_tf_utils.py * remove commented out lines * delete unnecessary lines * add simple test case for nq_checkpoint Add nq_checkpoint test to show that current version without hack still fails * temporarily put ugly hack back again * Add TFRagSequenceForGeneration!! * __init__.py , import TFRagSequenceForGeneration * Add TFRagSequence tests! * rag init.py - add TFRagSequenceForGeneration * fix from_pretrained * fix prepare_inputs_for_generation * Beam search for RagToken! * minor clean up * add tf.cast in TFRagModel * More tf.cast * Add all remaining tests (still have issues) * delete all T5 related * make style * fix load weight prefix * fix bart * fix return_dict for tf_rag make all tests pass .. Hooray * fix some tests * fix code quality * fix qualtiy check * finish tests tf rag * add tf rag to docs * remove TFT5 from docstring Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * remove TFT5 from docstring Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * Delete outdated comments Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> * improve doc strings * add generative model classes * fix adjust token logic * refactor generate for TFRag * using shape_list, not _get_shape Co-authored-by:
Julien Plu <plu.julien@gmail.com> * axis=[1]->axis=1 * delete NEED_HELP comment * improve readability Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve readability Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * improve readability Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Indicating model is in a developing state in docstrings As suggested by Julien * small last changes * apply sylvains suggestions * finish tf rag Co-authored-by:
Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by:
patrickvonplaten <patrick@huggingface.co> Co-authored-by:
Julien Plu <plu.julien@gmail.com> Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
Sylvain Gugger authored
* Fix version control with anchors * Simplify
-
- 06 Mar, 2021 2 commits
-
-
Suraj Patil authored
* m2m_100 * no layernorm_embedding * sinusoidal positional embeddings * update pos embeddings * add default config values * tokenizer * add conversion script * fix config * fix pos embed * remove _float_tensor * update tokenizer * update lang codes * handle lang codes * fix pos embeds * fix spm key * put embedding weights on device * remove qa and seq classification heads * fix convert script * lang codes pn one line * fix embeds * fix tokenizer * fix tokenizer * add fast tokenizer * style * M2M100MT => M2M100 * fix copyright, style * tokenizer converter * vocab file * remove fast tokenizer * fix embeds * fix tokenizer * fix tests * add tokenizer tests * add integration test * quality * fix model name * fix test * doc * doc * fix doc * add copied from statements * fix tokenizer tests * apply review suggestions * fix urls * fix shift_tokens_right * apply review suggestions * fix * fix doc * add lang code to id * remove unused function * update checkpoint names * fix copy * fix tokenizer * fix checkpoint names * fix merge issue * style
-
Stas Bekman authored
* offline mode start * add specific values * fix fallback * add test * better values check and range * test that actually works * document the offline mode * Apply suggestions from code review Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * more strict check * cleaner test * pt-only test * style Co-authored-by:
Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
-
- 05 Mar, 2021 1 commit
-
-
lewtun authored
-