1. 15 Jun, 2021 1 commit
  2. 14 Jun, 2021 4 commits
    • Stas Bekman's avatar
      04028317
    • Vasudev Gupta's avatar
      Flax Big Bird (#11967) · d9c0d08f
      Vasudev Gupta authored
      
      
      * add flax bert
      
      * bert -> bigbird
      
      * original_full ported
      
      * add debugger
      
      * init block sparse
      
      * fix copies ; gelu_fast -> gelu_new
      
      * block sparse port
      
      * fix block sparse
      
      * block sparse working
      
      * all ckpts working
      
      * fix-copies
      
      * make quality
      
      * init tests
      
      * temporary fix for FlaxBigBirdForMultipleChoice
      
      * skip test_attention_outputs
      
      * fix
      
      * gelu_fast -> gelu_new ; fix multiple choice model
      
      * remove nsp
      
      * fix sequence classifier
      
      * fix
      
      * make quality
      
      * make fix-copies
      
      * finish
      
      * Delete debugger.ipynb
      
      * Update src/transformers/models/big_bird/modeling_flax_big_bird.py
      
      * make style
      
      * finish
      
      * bye bye jit flax tests
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      d9c0d08f
    • Will Rice's avatar
      Adding TFWav2Vec2Model (#11617) · d438eee0
      Will Rice authored
      
      
      * [WIP] Add TFWav2Vec2Model
      
      Work in progress for adding a tensorflow version of Wav2Vec2
      
      * feedback changes
      
      * small fix
      
      * Test Feedback Round 1
      
      * Add SpecAugment and CTC Loss
      
      * correct spec augment mask creation
      
      * docstring and correct copyright
      
      * correct bugs
      
      * remove bogus file
      
      * finish tests correction
      
      * del unnecessary layers
      
      * Update src/transformers/models/wav2vec2/modeling_tf_wav2vec2.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * make style
      
      * correct final bug
      
      * Feedback Changes
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      d438eee0
    • Daniel Stancl's avatar
      FlaxBart (#11537) · 4a51b1dd
      Daniel Stancl authored
      
      
      * Start working on FlaxBart
      
      * Create modeling_flax_bart.py
      
      * Write FlaxBartAttention
      
      * Add FlaxBartEncoderLayer
      
      * Add FlaxBartDecoderLayer and some typing
      
      * Add helepr function for FlaxBart
      
      * shift_tokens_right
      
      * _make_causal_mask
      
      * _expand_mask
      
      * Add PositionalEmbedding and fix init_std naming
      
      * Add FlaxBartPretrainedModel
      
      * Add FlaxBartEncoder
      
      * Add FlaxBartEncoder
      
      * Add FlaxBartEncoder among modules to be imported
      
      * YET WE CANNOT INITIALIZE THAT!! :(
      
      * Make BartEncoder working
      
      Change BartEncoder to instance of nn.Module so far
      
      * Add FlaxBartDecoder
      
      * Add FlaxBartModel
      
      * TODO to make model run -> Prepapre model inputs
      
      * Resolve padding
      
      * Add FlaxBartModel
      
      * Add FlaxBartModel into importable modules
      
      * Remove FlaxBartEncoder and FlaxBartDecoder from importable modules
      
      * make style; not properly working
      
      * make style; make quality not pass due to some import I left
      
      * Remove TODO for padding_idx in nn.Embed so far
      
      * Add FlaxBartForConditionalGeneration
      
      * Incorporate Flax model output classes, i.e. return_dict
      
      * Add another models and incorporate use_cache arg
      
      * Add FlaxBartForSequenceClassification and FlaxBartForQuestionAnswering
      
      * Incorporate use_cache arg from PyTorch implementation
      
      * Add all necessary Flax output utils
      
      * Add FlaxBartForCausalLM; not working yet'
      
      * Add minor improvements; still lacks some functionality
      
      * Update docs, src and tests
      
      * Add support of FlaxBart to docs/source
      
      * Fix some bugs in FlaxBart souce code
      
      * Add some neccessary tests for FlaxBart models - jit_compilation not passing
      
      * Fix tests and add test_head_masking
      
      * Fix tests for @jax.jit computation
      
      * Add test_head_masking
      
      * Migrate FlaxBart tests from jax.numpy to numpy
      
      * Remove FlaxBartForCausalLM
      
      * Clean repo
      
      * fix bart model weight structure
      
      * Fix FlaxBartForSequenceClassification
      
      Slicing is not possible to use below jit, therefore, selecting sentence
      representation from hidden_states must be changed.
      
      * Allow FlaxBartForSequenceClassification for testing pt_flax equivalence
      
      * Allow testing for FlaxBartForQA for pt_flax equivalence
      
      * Add a comment to FlaxBartForSequenceClassification + change noise from 1e-3 to 1e-6
      
      * remove past_key_values
      
      * remove inputs_mebeds and make input_ids required
      
      * add position ids
      
      * re-write attention layer
      
      * fix dataclass
      
      * fix pos embeds and attention output
      
      * fix pos embeds
      
      * expose encode method
      
      * expose decode method
      
      * move docstring to top
      
      * add cache for causal attn layer
      
      * remove head masking for now
      
      * s2s greedy search first pass
      
      * boom boom
      
      * fix typos
      
      * fix greedy generate for bart
      
      * use encoder, decoder layers instead of num_hidden_layers
      
      * handle encoder_outputs
      
      * cleanup
      
      * simplify decoding
      
      * more clean-up
      
      * typos
      
      * Change header + add {decoder_,}position_ids into 2 models
      
      * add BartConfig
      
      * fix existing tests
      
      * add encode, decode methods
      
      * Fix shift_tokens_right for JIT compilation + clarify one condition
      
      * fix decode
      
      * encoder => encode
      
      * simplify generate
      
      * add tests for encode and decode
      
      * style
      
      * add tests for cache
      
      * fix equivalence tests
      
      * sample generate now works with seq2seq
      
      * generation tests
      
      * initialize dense layers
      
      * docstring and cleanup
      
      * quality
      
      * remove get/set input_embeddings
      
      * address Patricks suggestions
      
      * decode for every model, remove encoder_outputs from call
      
      * update tests accordingly
      
      * decode returns only decoder outputs and logits
      
      * fix arguments
      
      * doc encode, decode methods
      
      * correct base_model_prefix
      
      * fix test for seq classif model
      
      * fix docs
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarSuraj Patil <surajp815@gmail.com>
      4a51b1dd
  3. 10 Jun, 2021 1 commit
  4. 09 Jun, 2021 2 commits
    • Anton Lozhkov's avatar
      Wav2Vec2 Pretraining (#11306) · d472bd7b
      Anton Lozhkov authored
      
      
      * Working quantizer forward
      
      * Working quantizer forward
      
      * Clean up unused model parts, test reproducibility
      
      * Working quantizer forward
      
      * Clean up unused model parts, test reproducibility
      
      * Remove custom outputs from the shared ones
      
      * correct conversion
      
      * correct bug
      
      * add first pretrain script
      
      * save intermediate
      
      * static shapes
      
      * save intermediate
      
      * finish first pretrain script version
      
      * more refactor
      
      * remove wanddb
      
      * refactor more
      
      * improve test
      
      * correct perplexity compute bug
      
      * finish model implementation
      
      * add to docs
      
      * finish docs
      
      * finish pretraining script
      
      * finish pretraining script
      
      * remove wandb
      
      * finish PR for merge
      
      * finish config
      
      * finish
      
      * make deepspeed work
      
      * Apply suggestions from code review
      Co-authored-by: default avatarLysandre Debut <lysandre@huggingface.co>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * apply suggestions
      
      * fix flaky test
      Co-authored-by: default avatarpatrickvonplaten <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarLysandre Debut <lysandre@huggingface.co>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      d472bd7b
    • NielsRogge's avatar
      Add DETR (#11653) · d3eacbb8
      NielsRogge authored
      
      
      * Squash all commits of modeling_detr_v7 branch into one
      
      * Improve docs
      
      * Fix tests
      
      * Style
      
      * Improve docs some more and fix most tests
      
      * Fix slow tests of ViT, DeiT and DETR
      
      * Improve replacement of batch norm
      
      * Restructure timm backbone forward
      
      * Make DetrForSegmentation support any timm backbone
      
      * Fix name of output
      
      * Address most comments by @LysandreJik
      
      * Give better names for variables
      
      * Conditional imports + timm in setup.py
      
      * Address additional comments by @sgugger
      
      * Make style, add require_timm and require_vision to tests茅
      
      * Remove train_backbone attribute of DetrConfig, add methods to freeze/unfreeze backbone
      
      * Add png files to fixtures
      
      * Fix type hint
      
      * Add timm to workflows
      
      * Add `BatchNorm2d` to the weight initialization
      
      * Fix retain_grad test
      
      * Replace model checkpoints by Facebook namespace
      
      * Fix name of checkpoint in test
      
      * Add user-friendly message when scipy is not available
      
      * Address most comments by @patrickvonplaten
      
      * Remove return_intermediate_layers attribute of DetrConfig and simplify Joiner
      
      * Better initialization
      
      * Scipy is necessary to get sklearn metrics
      
      * Rename TimmBackbone to DetrTimmConvEncoder and rename DetrJoiner to DetrConvModel
      
      * Make style
      
      * Improve docs and add 2 community notebooks
      Co-authored-by: default avatarLysandre <lysandre.debut@reseau.eseo.fr>
      d3eacbb8
  5. 08 Jun, 2021 2 commits
  6. 04 Jun, 2021 1 commit
  7. 02 Jun, 2021 3 commits
  8. 01 Jun, 2021 5 commits
  9. 28 May, 2021 1 commit
  10. 27 May, 2021 1 commit
  11. 26 May, 2021 1 commit
    • Patrick von Platen's avatar
      Flax Generate (#11777) · 996a315e
      Patrick von Platen authored
      
      
      * fix_torch_device_generate_test
      
      * remove @
      
      * add
      
      * indexing
      
      * correct a couple of tests
      
      * fix tests
      
      * add logits processor
      
      * finish top_k, top_p, temp
      
      * add docs
      
      * correct flax prng key default
      
      * improve generate
      
      * add generation docs
      
      * add docs
      
      * make style
      
      * revert model outputs change
      
      * make style
      
      * correct typo
      
      * fix tests
      
      * fix slow test
      
      * add raise
      
      * finish generation
      Co-authored-by: default avatarPatrick von Platen <patrick@huggingface.co>
      996a315e
  12. 24 May, 2021 1 commit
  13. 20 May, 2021 1 commit
  14. 18 May, 2021 3 commits
  15. 17 May, 2021 1 commit
  16. 13 May, 2021 1 commit
  17. 12 May, 2021 4 commits
    • NielsRogge's avatar
      Vit deit fixes (#11309) · fa84540e
      NielsRogge authored
      
      
      * Improve docs of DeiT and ViT, add community notebook
      
      * Add gitignore for test_samples
      
      * Add notebook with Trainer
      Co-authored-by: default avatarLysandre Debut <lysandre@huggingface.co>
      fa84540e
    • Lysandre's avatar
      Docs for v4.7.0.dev0 · d77eb0cf
      Lysandre authored
      d77eb0cf
    • Suraj Patil's avatar
      Fix clip docs (#11694) · f063c56d
      Suraj Patil authored
      * fix doc url
      
      * fix example
      f063c56d
    • Suraj Patil's avatar
      CLIP (#11445) · 8719afa1
      Suraj Patil authored
      
      
      * begin second draft
      
      * fix import, style
      
      * add loss
      
      * fix embeds, logits_scale, and projection
      
      * fix imports
      
      * add conversion script
      
      * add feature_extractor and processor
      
      * style
      
      * add tests for tokenizer, extractor and processor
      
      * add vision model tests
      
      * add weight init
      
      * add more tests
      
      * fix save_load  test
      
      * model output, dosstrings, causal mask
      
      * config doc
      
      * add clip model tests
      
      * return dict
      
      * bigin integration test
      
      * add integration tests
      
      * fix-copies
      
      * fix init
      
      * Clip => CLIP
      
      * fix module name
      
      * docs
      
      * fix doc
      
      * output_dim => projection_dim
      
      * fix checkpoint names
      
      * remoe fast tokenizer file
      
      * fix conversion script
      
      * fix tests, quality
      
      * put causal mask on device
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * fix attribute test
      
      * style
      
      * address sylvains comments
      
      * style
      
      * fix docstrings
      
      * add qucik_gelu in activations, docstrings
      
      * clean-up attention test
      
      * fix act fun
      
      * fix config
      
      * fix torchscript tests
      
      * even batch_size
      
      * remove comment
      
      * fix ouput tu_tuple
      
      * fix save load tests
      
      * fix add tokens test
      
      * add fast tokenizer
      
      * update copyright
      
      * new processor API
      
      * fix docs
      
      * docstrings
      
      * docs
      
      * fix doc
      
      * fix doc
      
      * fix tokenizer
      
      * fix import in doc example
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * check types of config
      
      * valhalla => openai
      
      * load image using url
      
      * fix test
      
      * typo
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      8719afa1
  18. 10 May, 2021 2 commits
    • Vasudev Gupta's avatar
      Update community.md (#11654) · 575c9791
      Vasudev Gupta authored
      575c9791
    • Tanmay Laud's avatar
      Big Bird Fast Tokenizer implementation (#11075) · f7f87295
      Tanmay Laud authored
      
      
      * Added Big Bird Fast Tokenizer initial file
      
      * style fixes
      
      * flake fixes
      
      * Added big bird fast tokenizer to init files
      
      * Added big bird fast to Auto tokenization
      
      * fix styles
      
      * minor quality fixes
      
      * Added initial test code
      
      * Fix SpmConverter when precompiled_charsmap doesn't exist
      
      * fixed post processor
      
      * minor style fix
      
      * minor fix input names
      
      * Actually fix identity normalization
      
      * style
      
      * Added token type ids to fast tokenizer
      
      * style
      
      * flake fix
      
      * fix copies
      Co-authored-by: default avatarAnthony MOI <m.anthony.moi@gmail.com>
      f7f87295
  19. 07 May, 2021 2 commits
  20. 04 May, 2021 3 commits