1. 20 Feb, 2023 1 commit
  2. 10 Feb, 2023 1 commit
    • Add X-MOD (#20939) · b0d539cc
      Jannis Vamvas authored
      
      
      * Add X-MOD to Readme
      
      * Add documentation for X-MOD
      
      * Implement X-MOD
      
      * Fix formatting of X-MOD docs
      
      * Change signature of X-MOD forward methods to use lang_ids (see the usage sketch at the end of this entry)
      
      * Minor changes
      
      * Rebase with main and run make fix-copies
      
      * Make suggested changes to docstrings
      
      * Improve code readability
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
      
      * Fix code style
      
      * Conversion script: Remove asserts and type annotations
      
      * Remove _TOKENIZER_FOR_DOC
      
      * XMOD -> Xmod
      
      * Update copyright note
      
      * Fix doctests
      
      * Fix docstring
      
      * Add integration test for FillMaskPipeline
      
      * Revert "Add integration test for FillMaskPipeline"
      
      This reverts commit 4381eb3b1d0f5d85785f89caba83928e6efa6d1f.
      
      * Add end-to-end integration test for mask fill
      
      * make style
      
      * Rebase with main and make fix-copies
      
      ---------
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
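      The lang_ids signature change above is the distinctive part of X-MOD: each sample in a batch selects a language-specific adapter through an index into the config's language list. A minimal usage sketch, assuming the released facebook/xmod-base checkpoint and that lang_ids indexes config.languages (hedged, not copied from the PR):

          import torch
          from transformers import AutoTokenizer, XmodModel

          tokenizer = AutoTokenizer.from_pretrained("facebook/xmod-base")
          model = XmodModel.from_pretrained("facebook/xmod-base")

          inputs = tokenizer("Hello world", return_tensors="pt")
          # one language id per sample in the batch; en_XX picks the English adapter
          lang_ids = torch.LongTensor([model.config.languages.index("en_XX")])
          outputs = model(**inputs, lang_ids=lang_ids)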
  3. 23 Dec, 2022 1 commit
  4. 19 Dec, 2022 1 commit
  5. 12 Dec, 2022 1 commit
    • Add gpt-sw3 model to transformers (#20209) · 5f94855d
      Ariel Ekgren authored
      
      
      * Add templates for gpt-sw3
      
      * Add templates for gpt-sw3
      
      * Added sentencepiece tokenizer
      
      * intermediate commit with many changes
      
      * fixed conflicts
      
      * Init commit for tokenization port
      
      * Tokenization progress
      
      * Remove fast tokenizer
      
      * Clean up and rename spm.model -> spiece.model
      
      * Remove TF -> PT conversion script template, Clean up Megatron -> PT script
      
      * Optimize encode & decode performance
      
      * added new attention
      
      * added new attention
      
      * attention for gpt-sw3 working
      
      * attention good
      
      * Cache is now working
      
      * fixed attention mask so that it works with causal attention
      
      * fixed baddbmm bug for cpu and caching
      
      * updated config with correct parameters
      
      * Refactor and leave optimizations as separate functions to avoid breaking expected functionality
      
      * Fix special tokens mapping for both tokenizers
      
      * cleaning up of code and comments
      
      * HF compatible attention outputs
      
      * Tokenizer now passing tests, add documentation
      
      * Update documentation
      
      * reverted back to base implementation after checking that it is identical to pretrained model
      
      * updated gpt-sw3 config
      
      * updated conversion script
      
      * aligned parameters with gpt-sw3 config
      
      * changed default scale_attn_by_inverse_layer_idx to true
      
      * removed flag from conversion script
      
      * added temporary model path
      
      * reverted back to functioning convert script
      
      * small changes to default config
      
      * updated tests for gpt-sw3
      
      * make style, make quality, minor cleanup
      
      * Change local paths to testing online repository
      
      * Change name: GptSw3 -> GPTSw3
      
      * Remove GPTSw3TokenizerFast references
      
      * Use official model repository and add more model sizes
      
      * Added reference to 6.7b model
      
      * Add GPTSw3DoubleHeadsModel to IGNORE_NON_AUTO_CONFIGURED, like GPT2DoubleHeadsModel
      
      * Remove pointers to non-existing TFGPTSw3
      
      * Add GPTSw3 to docs/_toctree.yml
      
      * Remove TF artifacts from GPTSw3 in __init__ files
      
      * Update READMEs with 'make fix-copies'
      
      * Add 20b model to archive list
      
      * Add documentation for GPT-Sw3
      
      * Fix typo in documentation for GPT-Sw3
      
      * Do 'make fix-copies' again after having updated docs
      
      * Fix some typos in docs
      
      * Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/gpt_sw3/__init__.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/gpt_sw3/__init__.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update tests/models/gpt_sw3/test_tokenization_gpt_sw3.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Resolve comments from PR feedback
      
      * Resolve more comments from PR feedback, also set use_cache=True in convert script
      
      * Add '# Copied from' comments for GPTSw3 modeling
      
      * Set 'is_parallelizable = False'
      
      * Remove '# Copied from' where code was modified and add 'with x->y' when appropriate
      
      * Remove parallelize in mdx
      
      * make style, make quality
      
      * Update GPTSw3Config default values and corresponding documentation
      
      * Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gpt_sw3/__init__.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Clean up and protect GPTSw3Tokenizer imports with is_sentencepiece_available (pattern sketched at the end of this entry)
      
      * Make style, make quality
      
      * Add dummy object for GPTSw3Tokenizer via 'make fix-copies'
      
      * make fix-copies
      
      * Remove GPTSw3 modeling classes
      
      * make style, make quality
      
      * Add GPTSw3 auto-mappings for other GPT2 heads
      
      * Update docs/source/en/model_doc/gpt-sw3.mdx
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Remove old TODO-comment
      
      * Add example usage to GPTSw3Tokenizer docstring
      
      * make style, make quality
      
      * Add implementation details and example usage to gpt-sw3.mdx
      Co-authored-by: JoeyOhman <joeyoh@kth.se>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
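      As context for the import-guard commit above: transformers gates sentencepiece-backed tokenizers behind an availability check so the package imports cleanly without the optional dependency. A simplified sketch of the pattern for a package __init__.py (the real library routes this through its lazy-import machinery; the dummy class here is illustrative):

          from transformers.utils import is_sentencepiece_available

          if is_sentencepiece_available():
              from .tokenization_gpt_sw3 import GPTSw3Tokenizer
          else:
              class GPTSw3Tokenizer:  # dummy stand-in raising a helpful error
                  def __init__(self, *args, **kwargs):
                      raise ImportError("GPTSw3Tokenizer requires the sentencepiece library.")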
  6. 09 Dec, 2022 1 commit
  7. 05 Dec, 2022 1 commit
  8. 30 Nov, 2022 1 commit
    • Add Chinese-CLIP implementation (#20368) · 72176402
      Yang An authored
      
      
      * init chinese-clip model from clip
      
      * init model tests and docs
      
      * implement chinese-clip into hf
      
      * implement chinese-clip into hf
      
      * implement chinese-clip into hf
      
      * implement chinese-clip into hf
      
      * implement chinese-clip into hf
      
      * update usecase example in model implementation
      
      * fix codestyle
      
      * fix model_type typo in readme
      
      * add placeholder in doc
      
      * add placeholder in doc
      
      * update the init script
      
      * update usecase
      
      * fix codestyle
      
      * update testcase
      
      * update testcase
      
      * update testcase
      
      * update testcase
      
      * update testcase
      
      * update testcase
      
      * update testcase
      
      * update testcase
      
      * update testcase
      
      * update testcase
      
      * update testcase
      
      * update testcase
      
      * forward the convert_rgb
      
      * update testcase
      
      * update testcase
      
      * update testcase
      
      * merge the recent update from clip about model_input_name property
      
      * update the doc
      
      * update the doc
      
      * update the doc
      
      * update the doc
      
      * remove unused imports
      
      * reformat code style
      
      * update the doc
      
      * fix isort style
      
      * bypass a weird failing unit test that is unrelated to my PR
      
      * update the doc
      
      * implement independent vision config class
      
      * implement independent vision model class
      
      * fix refactor bug
      
      * fix refactor bug
      
      * fix refactor bug
      
      * make style
      
      * fix refactor bug
      
      * make style
      
      * fix refactor bug
      
      * fix refactor bug
      
      * make style
      
      * fix refactor bug
      
      * fix refactor bug
      
      * doc-build restyle
      
      * implement independent text config class
      
      * implement independent text model class
      
      * implement independent text model class
      
      * make style
      
      * make fix-copies
      
      * fix refactor bug
      
      * fix refactor bug
      
      * fix refactor bug
      
      * fix refactor bug
      
      * fix refactor bug
      
      * fix refactor bug
      
      * fix refactor bug
      
      * fix refactor bug
      
      * fix refactor bug
      
      * fix refactor bug
      
      * make style
      
      * update doc
      
      * black and isort
      
      * update doc
      
      * Update src/transformers/models/chinese_clip/configuration_chinese_clip.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/auto/tokenization_auto.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * modify the model type from chinese-clip to chinese_clip
      
      * format the example comment of ChineseCLIPVisionConfig
      
      * correct the copyright comment
      
      * fix the tokenizer specification
      
      * add copied from for loss function
      
      * remove unused class
      
      * update CHINESE_CLIP_TEXT_INPUTS_DOCSTRING
      
      * update CHINESE_CLIP_INPUTS_DOCSTRING
      
      * update doc
      
      * update doc
      
      * update code comment in config
      
      * update copied from statement
      
      * make style
      
      * rename the doc file
      
      * add copied statement
      
      * remove unused attention_mask, causal_attention_mask in ChineseCLIPVisionEncoder
      
      * remove ChineseCLIPTextPreTrainedModel
      
      * fix bug
      
      * fix bug
      
      * fix bug
      
      * update doc
      
      * make style
      
      * Update src/transformers/models/chinese_clip/configuration_chinese_clip.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/chinese_clip/configuration_chinese_clip.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * update ChineseCLIPImageProcessor in image_processing_auto
      
      * fix config_class of chinesecliptextmodel
      
      * fix the test case
      
      * update the docs
      
      * remove the copied from comment for ChineseCLIPTextModel, since it has diverged from BertModel with a custom config_class
      
      * update the testcase
      
      * final fix
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
  9. 21 Nov, 2022 1 commit
    • add MobileNetV1 model (#17799) · d21c97cc
      Matthijs Hollemans authored
      * add model files etc for MobileNetV2
      
      rename files for MobileNetV1
      
      initial implementation of MobileNetV1
      
      fix conversion script
      
      cleanup
      
      write docs
      
      tweaks
      
      fix conversion script
      
      extract hidden states
      
      fix test cases
      
      make fixup
      
      fixup it all
      
      remove main from doc link
      
      fixes
      
      fix tests
      
      fix up
      
      use google org
      
      fix weird assert
      
      * fixup
      
      * use google organization for checkpoints
  10. 14 Nov, 2022 1 commit
    • add MobileNetV2 model (#17845) · f711d683
      Matthijs Hollemans authored
      * add model files etc for MobileNetV2
      
      * rename files for MobileNetV1
      
      * initial implementation of MobileNetV1
      
      * fix conversion script
      
      * cleanup
      
      * write docs
      
      * tweaks
      
      * fix conversion script
      
      * extract hidden states
      
      * fix test cases
      
      * make fixup
      
      * fixup it all
      
      * rename V1 to V2
      
      * fix checkpoints
      
      * fixup
      
      * implement first block + weight conversion
      
      * add remaining layers
      
      * add output stride and dilation
      
      * fixup
      
      * add tests
      
      * add deeplabv3+ head
      
      * a bit of fixup
      
      * finish deeplab conversion
      
      * add link to doc
      
      * fix issue with JIT trace
      
      in_height and in_width would be Tensor objects during JIT trace, which caused Core ML conversion to fail on the remainder op. Making them ints turns the result of the padding calculation into a constant value (see the sketch at the end of this entry).
      
      * cleanup
      
      * fix order of models
      
      * fix rebase error
      
      * remove main from doc link
      
      * add image processor
      
      * remove old feature extractor
      
      * fix converter + other issues
      
      * fixup
      
      * fix unit test
      
      * add to onnx tests (but these appear broken now)
      
      * add post_process_semantic_segmentation
      
      * use google org
      
      * remove unused imports
      
      * move args
      
      * replace weird assert
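      To make the JIT-trace fix above concrete: under torch.jit.trace, values pulled from a tensor's shape can themselves be traced Tensors, so the remainder in the "same"-padding computation becomes a dynamic graph op that the Core ML converter rejected; casting to int folds it into a constant. A minimal sketch of the idea (function name and defaults are illustrative, not the merged MobileNetV2 code):

          import torch
          import torch.nn.functional as F

          def apply_tf_padding(features: torch.Tensor, stride: int = 2, kernel: int = 3) -> torch.Tensor:
              # int(...) yields plain Python ints at trace time, so the remainder
              # below is evaluated once and baked into the graph as a constant
              in_height = int(features.shape[-2])
              in_width = int(features.shape[-1])

              rem_h = in_height % stride
              rem_w = in_width % stride
              pad_along_height = max(kernel - (rem_h if rem_h else stride), 0)
              pad_along_width = max(kernel - (rem_w if rem_w else stride), 0)

              pad_left = pad_along_width // 2
              pad_right = pad_along_width - pad_left
              pad_top = pad_along_height // 2
              pad_bottom = pad_along_height - pad_top
              return F.pad(features, (pad_left, pad_right, pad_top, pad_bottom))

          traced = torch.jit.trace(apply_tf_padding, torch.randn(1, 3, 224, 224))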
  11. 07 Nov, 2022 1 commit
  12. 01 Nov, 2022 1 commit
    • Added onnx config whisper (#19525) · c796b6de
      Mohit Sharma authored
      * Added onnx config whisper (see the config sketch at the end of this entry)
      
      * added whisper support onnx
      
      * add audio input data
      
      * added whisper support onnx
      
      * fixed the seqlength value
      
      * Updated the whisper onnx config
      
      * restore files to old version
      
      * removed attention mask from inputs
      
      * Updated get_dummy_input_onnxruntime docstring
      
      * Updated relative imports and token generation
      
      * update docstring
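      For context on the entry above: an ONNX config in transformers mainly declares the model's dynamic input axes, and Whisper consumes log-mel audio features rather than token ids. A rough sketch under those assumptions (class name and axis labels are illustrative, not necessarily the merged code):

          from collections import OrderedDict
          from typing import Mapping

          from transformers.onnx import OnnxConfig

          class AudioEncoderOnnxConfig(OnnxConfig):  # hypothetical name
              @property
              def inputs(self) -> Mapping[str, Mapping[int, str]]:
                  # log-mel spectrogram: (batch, feature_size, encoder_sequence)
                  return OrderedDict(
                      [("input_features", {0: "batch", 1: "feature_size", 2: "encoder_sequence"})]
                  )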
  13. 28 Oct, 2022 1 commit
  14. 18 Oct, 2022 1 commit
    • Add table transformer [v2] (#19614) · dd523da5
      NielsRogge authored
      * First draft
      
      * Add conversion script
      
      * Make conversion work
      
      * Upload checkpoints
      
      * Add final fixes
      
      * Revert changes of conditional and deformable detr
      
      * Fix toctree, add and remove copied from
      
      * Use model type
      
      * Improve docs
      
      * Improve code example
      
      * Update copies
      
      * Add copied from
      
      * Don't update conditional detr
      
      * Don't update deformable detr
  15. 10 Oct, 2022 1 commit
  16. 07 Oct, 2022 1 commit
  17. 03 Oct, 2022 1 commit
  18. 22 Sep, 2022 1 commit
  19. 09 Sep, 2022 1 commit
  20. 31 Aug, 2022 1 commit
  21. 30 Aug, 2022 2 commits
  22. 25 Aug, 2022 1 commit
    • Add ONNX support for Longformer (#17176) · 3223d493
      Patrick Deutschmann authored
      * Implement ONNX support for Longformer
      
      Fix repo consistency check complaints
      
      Fix value mismatches
      
      Add pooler output for default model
      
      Increase validation atol to accommodate multiple-choice error
      
      Fix copies
      
      Fix chunking for longer sequence lengths
      
      Add future comment
      
      * Fix issue in mask_invalid_locations
      
      * Remove torch imports in configuration_longformer
      
      * Change config access to fix LED
      
      * Push opset version to support tril (see the export sketch at the end of this entry)
      
      * Work in review comments (mostly style)
      
      * Add Longformer to ONNX tests
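      The opset bump above is easy to demonstrate: ONNX only defines the Trilu operator from opset 14, so exporting anything that calls torch.tril needs at least that version. A minimal example (model and file name are illustrative):

          import torch

          class TrilModel(torch.nn.Module):
              def forward(self, x):
                  # torch.tril maps to the ONNX Trilu op, introduced in opset 14
                  return torch.tril(x)

          torch.onnx.export(
              TrilModel(),
              (torch.randn(4, 4),),
              "tril.onnx",
              opset_version=14,  # opset < 14 fails on Trilu
          )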
  23. 10 Aug, 2022 1 commit
  24. 09 Aug, 2022 1 commit
    • Add mt5 onnx config (#18394) · 8cb5ecd9
      Thomas Chaigneau authored
      * update features
      
      * MT5OnnxConfig added with updated with tests and docs
      
      * fix imports
      
      * fix onnx_config_cls for mt5
      
      Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>
  25. 18 Jul, 2022 1 commit
  26. 06 Jul, 2022 1 commit
  27. 01 Jul, 2022 1 commit
  28. 30 Jun, 2022 1 commit
  29. 29 Jun, 2022 2 commits
  30. 28 Jun, 2022 1 commit
  31. 24 Jun, 2022 1 commit
    • Add CodeGen model (#17443) · d6b6fb99
      rooa authored
      
      
      * Add CodeGen model
      
      * Add missing key and switch order of super()
      
      * Fix torch.ones init with uint8 instead of bool
      
      * Address comments: copy statements and doc
      
      * update tests
      
      * remove old model parallel
      
      * fix batch gen tests
      
      * fix batch gen test
      
      * update test_gpt2_sample_max_time
      
      * fix codegen test and revert gpt2 test change
      
      * Fix incorrect tie_word_embedding value, typo, URL
      
      * Fix model order in README and styling
      
      * Reorder model list alphabetically
      
      * Set tie_word_embedding to False by default
      
      * Apply suggestions from code review
      
      * Better attn mask name & remove attn masked_bias
      
      * add tokenizer for codegen
      
      * quality
      
      * doc tokenizer
      
      * fix-copies
      
      * add CodeGenTokenizer in converter
      
      * make truncation optional
      
      * add test for truncation
      
      * add copyright
      
      * fix-copies
      
      * fix fast tokenizer decode
      
      * Update src/transformers/models/codegen/tokenization_codegen.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * increase vocab_size in tests
      Co-authored-by: patil-suraj <surajp815@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  32. 21 Jun, 2022 1 commit
  33. 13 Jun, 2022 1 commit
    • Add `LongT5` model (#16792) · a72f1c9f
      Daniel Stancl authored
      
      
      * Initial commit
      
      * Make some fixes
      
      * Make PT model full forward pass
      
      * Drop TF & Flax implementation, fix copies etc
      
      * Add Flax model and update some corresponding stuff
      
      * Drop some TF things
      
      * Update config and flax local attn
      
      * Add encoder_attention_type to config
      
      * .
      
      * Update docs
      
      * Do some cleansing
      
      * Fix some issues -> make style; add some docs
      
      * Fix position_bias + mask addition + Update tests
      
      * Fix repo consistency
      
      * Fix model consistency by removing flax operation over attn_mask
      
      * [WIP] Add PT TGlobal LongT5
      
      * .
      
      * [WIP] Add flax tglobal model
      
      * [WIP] Update flax model to use the right attention type in the encoder
      
      * Fix flax tglobal model forward pass
      
      * Make use of global_relative_attention_bias
      
      * Add test suites for TGlobal model
      
      * Fix minor bugs, clean code
      
      * Fix pt-flax equivalence though not convinced with correctness
      
      * Fix LocalAttn implementation to match the original impl. + update READMEs
      
      * Few updates
      
      * Update: [Flax] improve large model init and loading #16148
      
      * Add ckpt conversion script according to #16853 + handle torch device placement
      
      * Minor updates to conversion script.
      
      * Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM
      
      * gpu support + dtype fix
      
      * Apply some suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * * Remove (de)parallelize stuff
      * Edit shape comments
      * Update README.md
      * make fix-copies
      
      * Remove caching logic for local & tglobal attention
      
      * Apply another batch of suggestions from code review
      
      * Add missing checkpoints
      * Format converting scripts
      * Drop (de)parallelize links from longT5 mdx
      
      * Fix converting script + revert config file change
      
      * Revert "Remove caching logic for local & tglobal attention"
      
      This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46.
      
      * Stash caching logic in Flax model
      
      * Make side relative bias used always
      
      * Drop caching logic in PT model
      
      * Return side bias as it was
      
      * Drop all remaining model parallel logic
      
      * Remove clamp statements
      
      * Move test files to the proper place
      
      * Update docs with new version of hf-doc-builder
      
      * Fix test imports
      
      * Make some minor improvements
      
      * Add missing checkpoints to docs
      * Make TGlobal model compatible with torch.onnx.export
      * Replace some np.ndarray with jnp.ndarray
      
      * Fix TGlobal for ONNX conversion + update docs
      
      * fix _make_global_fixed_block_ids and masked neg value
      
      * update flax model
      
      * style and quality
      
      * fix imports
      
      * remove load_tf_weights_in_longt5 from init and fix copies
      
      * add slow test for TGlobal model
      
      * typo fix
      
      * Drop obsolete is_parallelizable and one warning
      
      * Update __init__ files to fix repo-consistency
      
      * fix pipeline test
      
      * Fix some device placements
      
      * [wip]: Update tests -- need to generate summaries to update expected_summary
      
      * Fix quality
      
      * Update LongT5 model card
      
      * Update (slow) summarization tests
      
      * make style
      
      * rename checkpoints
      
      * finish
      
      * fix flax tests
      Co-authored-by: phungvanduy <pvduy23@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: patil-suraj <surajp815@gmail.com>
  34. 09 Jun, 2022 2 commits
  35. 03 Jun, 2022 1 commit
  36. 02 Jun, 2022 1 commit
  37. 01 Jun, 2022 1 commit