1. 27 Jul, 2022 1 commit
  2. 22 Jul, 2022 1 commit
    • Add OWL-ViT model for zero-shot object detection (#17938) · 12d66b47
      Alara Dirik authored
      * add owlvit model skeleton
      
      * add class and box predictor heads
      
      * convert modified flax clip to pytorch
      
      * fix box and class predictors
      
      * add OwlViTImageTextEmbedder
      
      * convert class and box head checkpoints
      
      * convert image text embedder checkpoints
      
      * add object detection head
      
      * fix bugs
      
      * update conversion script
      
      * update conversion script
      
      * fix q,v,k,out weight conversion
      
      * add owlvit object detection output
      
      * fix bug in image embedder
      
      * fix bugs in text embedder
      
      * fix positional embeddings
      
      * fix bug in inference mode vision pooling
      
      * update docs, init tokenizer and processor files
      
      * support batch processing
      
      * add OwlViTProcessor
      
      * remove merge conflicts
      
      * readd owlvit imports
      
      * fix bug in OwlViTProcessor imports
      
      * fix bugs in processor
      
      * update docs
      
      * fix bugs in processor
      
      * update owlvit docs
      
      * add OwlViTFeatureExtractor
      
      * style changes, add postprocess method to feature extractor
      
      * add feature extractor and processor tests
      
      * add object detection tests
      
      * update conversion script
      
      * update config paths
      
      * update config paths
      
      * fix configuration paths and bugs
      
      * fix bugs in OwlViT tests
      
      * add import checks to processor
      
      * fix docs and minor issues
      
      * fix docs and minor issues
      
      * fix bugs and issues
      
      * fix bugs and issues
      
      * fix bugs and issues
      
      * fix bugs and issues
      
      * update docs and examples
      
      * fix bugs and issues
      
      * update conversion script, fix positional embeddings
      
      * process 2D input ids, update tests
      
      * fix style and quality issues
      
      * update docs
      
      * update docs and imports
      
      * update OWL-ViT index.md
      
      * fix bug in OwlViT feature ext tests
      
      * fix code examples, return_dict by default
      
      * return_dict by default
      
      * minor fixes, add tests to processor
      
      * small fixes
      
      * add output_attentions arg to main model
      
      * fix bugs
      
      * remove output_hidden_states arg from main model
      
      * update self.config variables
      
      * add option to return last_hidden_states
      
      * fix bug in config variables
      
      * fix copied from statements
      
      * fix small issues and bugs
      
      * fix bugs
      
      * fix bugs, support greyscale images
      
      * run fixup
      
      * update repo name
      
      * merge OwlViTImageTextEmbedder with obj detection head
      
      * fix merge conflict
      
      * fix merge conflict
      
      * make fixup
      
      * fix bugs
      
      * fix bugs
      
      * add additional processor test
  3. 19 Jul, 2022 2 commits
  4. 18 Jul, 2022 1 commit
  5. 29 Jun, 2022 2 commits
  6. 28 Jun, 2022 1 commit
  7. 24 Jun, 2022 1 commit
    • Add CodeGen model (#17443) · d6b6fb99
      rooa authored
      
      
      * Add CodeGen model
      
      * Add missing key and switch order of super()
      
      * Fix torch.ones init with uint8 instead of bool
      
      * Address comments: copy statements and doc
      
      * update tests
      
      * remove old model parallel
      
      * fix batch gen tests
      
      * fix batch gen test
      
      * update test_gpt2_sample_max_time
      
      * fix codegen test and revert gpt2 test change
      
      * Fix incorrect tie_word_embedding value, typo, URL
      
      * Fix model order in README and styling
      
      * Reorder model list alphabetically
      
      * Set tie_word_embedding to False by default
      
      * Apply suggestions from code review
      
      * Better attn mask name & remove attn masked_bias
      
      * add tokenizer for codegen
      
      * quality
      
      * doc tokenizer
      
      * fix-copies
      
      * add CodeGenTokenizer in converter
      
      * make truncation optional
      
      * add test for truncation
      
      * add copyright
      
      * fix-copies
      
      * fix fast tokenizer decode
      
      * Update src/transformers/models/codegen/tokenization_codegen.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * increase vocab_size in tests
      Co-authored-by: patil-suraj <surajp815@gmail.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
  8. 23 Jun, 2022 1 commit
  9. 21 Jun, 2022 1 commit
  10. 16 Jun, 2022 1 commit
  11. 15 Jun, 2022 1 commit
  12. 13 Jun, 2022 1 commit
    • Add `LongT5` model (#16792) · a72f1c9f
      Daniel Stancl authored
      
      
      * Initial commit
      
      * Make some fixes
      
      * Make PT model full forward pass
      
      * Drop TF & Flax implementation, fix copies etc
      
      * Add Flax model and update some corresponding stuff
      
      * Drop some TF things
      
      * Update config and flax local attn
      
      * Add encoder_attention_type to config
      
      * .
      
      * Update docs
      
      * Do some cleansing
      
      * Fix some issues -> make style; add some docs
      
      * Fix position_bias + mask addition + Update tests
      
      * Fix repo consistency
      
      * Fix model consistency by removing flax operation over attn_mask
      
      * [WIP] Add PT TGlobal LongT5
      
      * .
      
      * [WIP] Add flax tglobal model
      
      * [WIP] Update flax model to use the right attention type in the encoder
      
      * Fix flax tglobal model forward pass
      
      * Make the use of global_relative_attention_bias
      
      * Add test suites for TGlobal model
      
      * Fix minor bugs, clean code
      
      * Fix pt-flax equivalence though not convinced with correctness
      
      * Fix LocalAttn implementation to match the original impl. + update READMEs
      
      * Few updates
      
      * Update: [Flax] improve large model init and loading #16148
      
      * Add ckpt conversion script according to #16853 + handle torch device placement
      
      * Minor updates to conversion script.
      
      * Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM
      
      * gpu support + dtype fix
      
      * Apply some suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * * Remove (de)parallelize stuff
      * Edit shape comments
      * Update README.md
      * make fix-copies
      
      * Remove caching logic for local & tglobal attention
      
      * Apply another batch of suggestions from code review
      
      * Add missing checkpoints
      * Format converting scripts
      * Drop (de)parallelize links from longT5 mdx
      
      * Fix converting script + revert config file change
      
      * Revert "Remove caching logic for local & tglobal attention"
      
      This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46.
      
      * Stash caching logic in Flax model
      
      * Make side relative bias used always
      
      * Drop caching logic in PT model
      
      * Return side bias as it was
      
      * Drop all remaining model parallel logic
      
      * Remove clamp statements
      
      * Move test files to the proper place
      
      * Update docs with new version of hf-doc-builder
      
      * Fix test imports
      
      * Make some minor improvements
      
      * Add missing checkpoints to docs
      * Make TGlobal model compatible with torch.onnx.export
      * Replace some np.ndarray with jnp.ndarray
      
      * Fix TGlobal for ONNX conversion + update docs
      
      * fix _make_global_fixed_block_ids and masked neg value
      
      * update flax model
      
      * style and quality
      
      * fix imports
      
      * remove load_tf_weights_in_longt5 from init and fix copies
      
      * add slow test for TGlobal model
      
      * typo fix
      
      * Drop obsolete is_parallelizable and one warning
      
      * Update __init__ files to fix repo-consistency
      
      * fix pipeline test
      
      * Fix some device placements
      
      * [wip]: Update tests -- need to generate summaries to update expected_summary
      
      * Fix quality
      
      * Update LongT5 model card
      
      * Update (slow) summarization tests
      
      * make style
      
      * rename checkpoints
      
      * finish
      
      * fix flax tests
      Co-authored-by: phungvanduy <pvduy23@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: patil-suraj <surajp815@gmail.com>
  13. 09 Jun, 2022 1 commit
  14. 07 Jun, 2022 1 commit
    • M-CTC-T Model (#16402) · 119e3c0f
      Chan Woo Kim authored
      
      
      * added cbs to notebooks, made copy-paste error fix in generation_utils
      
      * initial push for mctc model
      
      * mctc feature extractor done
      
      * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.
      
      * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.
      
      * passing attention, now struggling to figure out how attention masks make sense here
      
      * works when excluding attention masks. ask later how one would integrate attention masks here
      
      * bizarre configuration error (model prefix comes first in config dict json and messes up the order)
      
      * all passing but bizarre config dict ordering issue when to_dict
      
      * passing all major tests
      
      * feature extraction, processor, tokenizer added & tests passing
      
      * style & consistency & other logistical fixes
      
      * copy paste fix
      
      * model after feature extraction working
      
      * committing final feature extraction results; need to fix normalization
      
      * feature extraction passing tests; probably should add tests on the specific flashlight-copied functions?
      
      * delete print ; format code a bit
      
      * fixing tests
      
      * passing major tests
      
      * fixing styles
      
      * completed tokenization test with real example; not sure if these values are entirely correct.
      
      * last test fixes from local
      
      * reverting accidentally included custom setup configs
      
      * remove load tf weights; fix config error
      
      * testing couldnt import featureextractor
      
      * fix docs
      
      * fix docs
      
      * resolving comments
      
      * style fixes
      
      * style fixes
      
      * Update to MCTCConv1dSubSampler
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * relposemb fixes
      
      * conv1d name issue; expecting config fail with parentheses
      
      * fix config issue
      
      * fix config issue
      
      * fix config issue
      
      * change everything to MCTCT
      
      * fixing naming change errors
      
      * archive list
      
      * copyrights and docs
      
      * copyrights and docs
      
      * copyrights and docs
      
      * merge resolution
      
      * move tests, fix to changed optionaldependency structure
      
      * test directories changed
      
      * fixing tests
      
      * how to avoid tf tests?
      
      * how to avoid tf tests?
      
      * tests passing locally
      
      * allow mctctprocessor imported any env
      
      * allow mctctprocessor imported any env
      
      * fixed second round of feedback, need to fix docs
      
      * doc changes not being applied
      
      * all fixed
      
      * style fix
      
      * feedback fixes
      
      * fix copies and feature extraction style fix
      
      * Update tests/models/visual_bert/test_modeling_visual_bert.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * copy paste huggingface:main visual bert
      
      * added eof newline to visual bert; all tests are passing otherwise
      
      * fix slow tests by adding attention mask
      
      * change model id to speechbrain
      
      * make fix-copies
      
      * fix readme unwanted deletes
      
      * fixing readmes, make fix-copies
      
      * consistent M-CTC-T naming
      
      * Update src/transformers/models/mctct/__init__.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * all fixed but variable naming
      
      * adjust double quotes
      
      * fixed variable names
      
      * copyright and mr quilter
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * correct slow tests
      
      * make fix-copies
      
      * Update src/transformers/models/mctct/configuration_mctct.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/mctct/configuration_mctct.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * m-ctc-t not mctct
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
  15. 02 Jun, 2022 1 commit
  16. 01 Jun, 2022 1 commit
  17. 24 May, 2022 2 commits
    • [WIP] Adding GPT-NeoX-20B (#16659) · 71e60272
      Jason Phang authored
      
      
      * initial
      
      * first try
      
      * working 20B
      
      * 20B tokenizers
      
      * Docs
      
      * Import fixes for missing classes
      
      * Update docs, fixup
      
      * black formatting
      
      * isort
      
      * flake
      
      * dummy objects
      
      * documentation
      
      * Documentation yml
      
      * more docs
      
      * tweaks for tests
      
      * tokenization auto
      
      * fix neox tests
      
      * test
      
      * test
      
      * einsum
      
      * address PR feedback
      
      * Documentation
      
      * Update README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gpt_neox/__init__.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/gpt_neox/configuration_gpt_neox.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Remove undefined LaTeX syntax
      
      * Update to full url to avoid confusion about if that's supposed to refer to the Hub
      
      * fix auto
      
      * move tests
      
      * documentation fix
      
      * more doc fixes
      
      * test refactor
      
      * fix import
      
      * fix import
      
      * fix import
      
      * fix import
      
      * fix import
      
      * style fixes
      
      * More modeling fixes
      Co-authored-by: Jason Phang <zp489@gr057.hpc.nyu.edu>
      Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
    • Add LayoutLMv3 (#17060) · 31ee80d5
      NielsRogge authored
      
      
      * Make forward pass work
      
      * More improvements
      
      * Remove unused imports
      
      * Remove timm dependency
      
      * Improve loss calculation of token classifier
      
      * Fix most tests
      
      * Add docs
      
      * Add model integration test
      
      * Make all tests pass
      
      * Add LayoutLMv3FeatureExtractor
      
      * Improve integration test + make fixup
      
      * Add example script
      
      * Fix style
      
      * Add LayoutLMv3Processor
      
      * Fix style
      
      * Add option to add visual labels
      
      * Make more tokenizer tests pass
      
      * Fix more tests
      
      * Make more tests pass
      
      * Fix bug and improve docs
      
      * Fix import of processors
      
      * Improve docstrings
      
      * Fix toctree and improve docs
      
      * Fix auto tokenizer
      
      * Move tests to model folder
      
      * Move tests to model folder
      
      * change default behavior add_prefix_space
      
      * add prefix space for fast
      
      * add_prefix_space set to True for Fast
      
      * no space before `unique_no_split` token
      
      * add test to highlight special treatment of added tokens
      
      * fix `test_batch_encode_dynamic_overflowing` by building a long enough example
      
      * fix `test_full_tokenizer` with add_prefix_token
      
      * Fix tokenizer integration test
      
      * Make the code more readable
      
      * Add tests for LayoutLMv3Processor
      
      * Fix style
      
      * Add model to README and update init
      
      * Apply suggestions from code review
      
      * Replace asserts by value errors
      
      * Add suggestion by @ducviet00
      
      * Add model to doc tests
      
      * Simplify script
      
      * Improve README
      
      * a step ahead to fix
      
      * Update pair_input_test
      
      * Make all tokenizer tests pass - phew
      
      * Make style
      
      * Add LayoutLMv3 to CI job
      
      * Fix auto mapping
      
      * Fix CI job name
      
      * Make all processor tests pass
      
      * Make tests of LayoutLMv2 and LayoutXLM consistent
      
      * Add copied from statements to fast tokenizer
      
      * Add copied from statements to slow tokenizer
      
      * Remove add_visual_labels attribute
      
      * Fix tests
      
      * Add link to notebooks
      
      * Improve docs of LayoutLMv3Processor
      
      * Fix reference to section
      Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
  18. 23 May, 2022 1 commit
  19. 18 May, 2022 1 commit
  20. 17 May, 2022 1 commit
  21. 12 May, 2022 1 commit
  22. 11 May, 2022 1 commit
    • [feat] Add FLAVA model (#16654) · a10f6183
      Amanpreet Singh authored
      * [WIP] Add FLAVA model
      
      This PR aims to add the [FLAVA](https://arxiv.org/abs/2112.04482) model to the transformers repo.
      
      Following checklist delineates the list of things to be done for this PR
      to be complete:
      
      [x] Flava init
      [x] Flava base models
      [x] Flava layers
      [x] Flava Configs
      [x] Flava encoders
      [x] Flava pretraining models
      [ ] Flava classification/retrieval models (To be added in a separate PR)
      [x] Documentation updates 
      [x] Imports updates 
      [x] Argstring updates
      [x] Flava pretrained checkpoints 
      [x] Flava tests
      [x] Flava processors 
      [x] Sanity check
      [x] Lint
  23. 02 May, 2022 1 commit
    • Add YOLOS (#16848) · 1ac69874
      NielsRogge authored
      
      
      * First draft
      
      * Add YolosForObjectDetection
      
      * Make forward pass work
      
      * Add mid position embeddings
      
      * Add interpolation of position encodings
      
      * Add expected values
      
      * Add YOLOS to tests
      
      * Add integration test
      
      * Support tiny model as well
      
      * Support all models in conversion script
      
      * Remove mid_pe_size attribute
      
      * Make more tests pass
      
      * Add model to README and fix config
      
      * Add copied from statements
      
      * Rename base_model_prefix to vit
      
      * Add missing YOLOS_PRETRAINED_CONFIG_ARCHIVE_MAP
      
      * Apply suggestions from code review
      
      * Apply more suggestions from code review
      
      * Convert remaining checkpoints
      
      * Improve docstrings
      
      * Add YolosFeatureExtractor
      
      * Add feature extractor to docs
      
      * Add corresponding tests
      
      * Fix style
      
      * Fix docs
      
      * Apply suggestion from code review
      
      * Fix bad rebase
      
      * Fix some more bad rebase
      
      * Fix missing character
      
      * Improve docs and variable names
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
  24. 28 Apr, 2022 1 commit
  25. 08 Apr, 2022 1 commit
    • Add TAPEX (#16473) · 4ef0abb7
      NielsRogge authored
      
      
      * Add TapexTokenizer
      
      * Improve docstrings and provide option to provide answer
      
      * Remove option for pretokenized inputs
      
      * Add TAPEX to README
      
      * Fix copies
      
      * Remove option for pretokenized inputs
      
      * Initial commit: add tapex fine-tuning examples on both table-based question answering and table-based fact verification.
      
      * - Draft a README file for running the script and introducing some background.
      - Remove unused code lines in tabfact script.
      - Disable the default `pad_to_max_length` option which is memory-consuming.
      
      * * Support `as_target_tokenizer` function for TapexTokenizer.
      * Fix the do_lower_case behaviour of TapexTokenizer.
      * Add unit tests for target scenarios and cased/uncased scenarios for both source and target.
      
      * * Replace the label BartTokenizer with TapexTokenizer's as_target_tokenizer function.
      * Fix typos in tapex example README.
      
      * * fix the evaluation script - remove the property `task_name`
      
      * * Make the label space more clear for tabfact tasks
      
      * * Using a new fine-tuning script for tapex-base on tabfact.
      
      * * Remove the lowercase code outside the tokenizer - we use the tokenizer to control whether do_lower_case
      * Guarantee the hyper-parameter can be run without out-of-memory on 16GB card and report the new reproduced number on wikisql
      
      * * Remove the default tokenizer_name option.
      * Provide evaluation command.
      
      * * Support for WikiTableQuestion dataset.
      
      * Fix a typo in README.
      
      * * Fix the datasets's key name in WikiTableQuestions
      
      * Run make fixup and move test to folder
      
      * Fix quality
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply some more suggestions from code review
      
      * Improve docstrings
      
      * Overwrite failing test
      
      * Improve comment in example scripts
      
      * Fix rebase
      
      * Add TAPEX to Auto mapping
      
      * Add TAPEX to auto config mappings
      
      * Put TAPEX higher than BART in auto mapping
      
      * Add TAPEX to doc tests
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
      Co-authored-by: SivilTaram <qianlxc@outlook.com>
      Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
  26. 07 Apr, 2022 2 commits
  27. 28 Mar, 2022 1 commit
    • Add DPT (#15991) · 979b039c
      NielsRogge authored
      
      
      * First draft
      
      * More improvements
      
      * Add fusion blocks
      
      * Make conversion script work for dpt_large
      
      * Make conversion script work
      
      * Improve implementation
      
      * Improve conversion script
      
      * Add DPTForSemanticSegmentation
      
      * Make conversion work for semantic segmentation
      
      * Add tests
      
      * Remove print statements
      
      * First draft
      
      * Redesign neck
      
      * Improve tests
      
      * Improve implementation some more
      
      * Make neck output list of tensors
      
      * Improve neck and feature extractor
      
      * Fix integration tests
      
      * Make more tests pass
      
      * Make all tests pass
      
      * Add missing config archive map
      
      * Add in_index attribute to make heads accept list of tensors
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Apply some more suggestions
      
      * Add copied from statements
      
      * Remove assert
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * Remove DPTInterpolate in favor of nn.Upsample
      
      * Add comments
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * Add proposed design
      
      * Update design
      
      * Add DPTReassembleLayer
      
      * Add DPTFeatureFusionStage
      
      * Apply more suggestions from code review
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * Fix rebase
      
      * Update in_index and out_indices
      
      * Fix conversion script
      
      * Fix code quality
      
      * Add model to toctree and use DepthEstimatorOutput
      
      * Fix rebase
      
      * Fix code examples
      
      * Improve code
      
      * Fix copied from statements
      
      * Apply suggestions from code review
      
      * Remove compute_loss method
      
      * Apply suggestions from code review
      
      * Fix documentation tests file
      
      * Remove test.py file
      
      * Improve doc example
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Niels Rogge <nielsrogge@nielss-mbp.home>
  28. 24 Mar, 2022 1 commit
  29. 23 Mar, 2022 2 commits
    • Decision transformer gym (#15845) · aff9bc40
      Edward Beeching authored
      
      
      * Created the Decision Transformer Model
      
      * updating tests, copy to other machine
      
      * Added last hidden size to Decision Transformer modelling outputs
      
      * Removed copy of original DT file
      
      * made a temporary change to gpt2 to have it conform with the Decision Transformer version
      
      * Updated tests
      
      * Ignoring a file used to test the DT model
      
      * added comments to config file
      
      * added comments and argument descriptions to decision transformer file
      
      * Updated doc
      
      * Ran "make style"
      
      * Remove old model imports
      
      * Removed unused imports, cleaned up init file
      
      * Update docs/source/model_doc/decision_transformer.mdx
      
      added my username
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Reverted changes made to gpt2
      
      * Removed datasets submodule
      
      * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states
      
      * Added support for return of hidden states, attentions and return dict of gpt2 model.
      
      * Updated tests to include many of the ModelTesterMixin tests. 
      
      The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes
      
      * Added missing line to the end of gpt2 file
      
      * Added an integration test for the Decision Transformer
      
      Test performs an autoregressive evaluation for two time steps
      
      * Set done and info to _ to fix failing test
      
      * Updated integration test to be deterministic and check expected outputs
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Removed unnecessary config options
      
      * Cleaned up commented code and old comments.
      
      * Cleaned up commented code.
      
      * Changed DecisionTransformer to Decision Transformer
      
      * Added Decision Transformer to the main README file
      
      * Added copy of GPT2 called DecisionTransformerGPT2Model
      
      * isorted imports
      
      * isorted imports
      
      * Added model to non-English README files
      
      * Ran make fix-copies and corrected some cases.
      
      * Updated index file to include Decision Transformer
      
      * Added gpt2 model as copy inside the Decision Transformer model file
      
      * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS
      
      * Deleted redundant checkpoint files (I don't know how these got committed)
      
      * Removed testing files. (These should have never been committed)
      
      * Removed accidentally committed files
      
      * Moved the Decision Transformer test to its own directory
      
      * Add type hints for Pegasus (#16324)
      
      * Funnel type hints (#16323)
      
      * add pt funnel type hints
      
      * add tf funnel type hints
      
      * Add type hints for ProphetNet PyTorch (#16272)
      
      * [GLPN] Improve docs (#16331)
      
      * Add link to notebook
      
      * Add link
      
      * Fix bug
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      
      * Added type hints for Pytorch Marian calls (#16200)
      
      * Added type hinting for forward functions in pytorch marian
      
      * typo correction
      
      * Removed type hints on functions from BART per Suraj Patil request
      
      * fix import pb
      
      * fix typo
      
      * corrected tuple call
      
      * ran black
      
      * after fix-copies
      Some optional tags on primitives were removed, past_key_values in MarianForCausalLM changed from Tuple of Tuple to List
      
      * Fixing copies to roformer and pegasus
      Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
      Co-authored-by: matt <rocketknight1@gmail.com>
      
      * Moved DecisionTransformerOutput to modeling_decision_transformer
      
      * Moved the example usage to research project and cleaned comments
      
      * Made tests ignore the copy of gpt2 in Decision Transformer
      
      * Added module output to modelling decision transformer
      
      * removed copied gpt2 model from list of transformers models
      
      * Updated tests and created __init__ file for new test location
      
      * Update README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Removed unneeded summary type from config file
      
      * Fixed copies
      
      * Updated pretrained config map to refer to hopper-medium checkpoint
      
      * done (#16340)
      
      * Added Decision transformer to model docs
      
      * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Add type annotations for Rembert/Splinter and copies (#16338)
      
      * undo black autoformat
      
      * minor fix to rembert forward with default
      
      * make fix-copies, make quality
      
      * Adding types to template model
      
      * Removing List from the template types
      
      * Remove `Optional` from a couple of types that don't accept `None`
      Co-authored-by: matt <rocketknight1@gmail.com>
      
      * [Bug template] Shift responsibilities for long-range (#16344)
      
      * Fix code repetition in serialization guide (#16346)
      
      * Adopt framework-specific blocks for content (#16342)
      
      *  refactor code samples with framework-specific blocks
      
      *  update training.mdx
      
      * 🖍 apply feedback
      
      * Updates the default branch from master to main (#16326)
      
      * Updates the default branch from master to main
      
      * Links from `master` to `main`
      
      * Typo
      
      * Update examples/flax/README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Updated model with custom docstring example
      
      * Created the Decision Transformer Model
      
      * updating tests, copy to other machine
      
      * Added last hidden size to Decision Transformer modelling outputs
      
      * Removed copy of original DT file
      
      * made a temporary change to gpt2 to have it conform with the Decision Transformer version
      
      * Updated tests
      
      * Ignoring a file used to test the DT model
      
      * added comments to config file
      
      * added comments and argument descriptions to decision transformer file
      
      * Updated doc
      
      * Ran "make style"
      
      * Remove old model imports
      
      * Removed unused imports, cleaned up init file
      
      * Update docs/source/model_doc/decision_transformer.mdx
      
      added my username
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Reverted changes made to gpt2
      
      * Removed datasets submodule
      
      * Update the modeling outputs to include gpt2 attentions, hidden states and last hidden states
      
      * Added support for return of hidden states, attentions and return dict of gpt2 model.
      
      * Updated tests to include many of the ModelTesterMixin tests. 
      
      The following tests are skipped: test_generate_without_input_ids, test_pruning, test_resize_embeddings, test_head_masking, test_attention_outputs, test_hidden_states_output, test_inputs_embeds, test_model_common_attributes
      
      * Added missing line to the end of gpt2 file
      
      * Added an integration test for the Decision Transformer
      
      The test performs an autoregressive evaluation for two time steps
      
      * Set done and info to _ to fix failing test
      
      * Updated integration test to be deterministic and check expected outputs
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Removed unnecessary config options
      
      * Cleaned up commented code and old comments.
      
      * Cleaned up commented code.
      
      * Changed DecisionTransformer to Decision Transformer
      
      * Added Decision Transformer to the main README file
      
      * Added copy of GPT2 called DecisionTransformerGPT2Model
      
      * isorted imports
      
      * isorted imports
      
      * Added model to non-English README files
      
      * Ran make fix-copies and corrected some cases.
      
      * Updated index file to include Decision Transformer
      
      * Added gpt2 model as copy inside the Decision Transformer model file
      
      * Added the unit test file to the list of TEST_FILES_WITH_NO_COMMON_TESTS
      
      * Deleted redundant checkpoint files (I don't know how these got committed)
      
      * Removed testing files. (These should have never been committed)
      
      * Removed accidentally committed files
      
      * Moved the Decision Transformer test to its own directory
      
      * Moved DecisionTransformerOutput to modeling_decision_transformer
      
      * Moved the example usage to research project and cleaned comments
      
      * Made tests ignore the copy of gpt2 in Decision Transformer
      
      * Added module output to modelling decision transformer
      
      * removed copied gpt2 model from list of transformers models
      
      * Updated tests and created __init__ file for new test location
      
      * Update README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Removed unneeded summary type from config file
      
      * Fixed copies
      
      * Updated pretrained config map to refer to hopper-medium checkpoint
      
      * Added Decision transformer to model docs
      
      * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/decision_transformer/modeling_decision_transformer.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/decision_transformer/configuration_decision_transformer.py
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Updated model with custom docstring example
      
      * Updated copies, config auto, and readme files.
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: Dan Tegzes <48134725+Tegzes@users.noreply.github.com>
      Co-authored-by: Adam Montgomerie <adam@avanssion.com>
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      Co-authored-by: Clémentine Fourrier <22726840+clefourrier@users.noreply.github.com>
      Co-authored-by: Clementine Fourrier <cfourrie@inria.fr>
      Co-authored-by: matt <rocketknight1@gmail.com>
      Co-authored-by: Francesco Saverio Zuppichini <francesco.zuppichini@gmail.com>
      Co-authored-by: Jacob Dineen <54680234+jacobdineen@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
      Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
      aff9bc40
    • Lysandre Debut's avatar
      Updates the default branch from master to main (#16326) · eca77f47
      Lysandre Debut authored
      
      
      * Updates the default branch from master to main
      
      * Links from `master` to `main`
      
      * Typo
      
      * Update examples/flax/README.md
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      eca77f47
  30. 22 Mar, 2022 1 commit
    • NielsRogge's avatar
      Add GLPN (#16199) · 0c55d47c
      NielsRogge authored
      
      
      * First draft
      
      * Fix logits calculation
      
      * Improve tests
      
      * Add copied from statements
      
      * Fix base_model_prefix
      
      * Improve implementation, upload new models
      
      * Update design
      
      * Fix integration test
      
      * Add model to README and toctree
      
      * Add document image
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Add decoder_hidden_size attribute
      
      * Update design of decoder
      
      * Add DepthEstimatorOutput class
      
      * Rename in_index to head_in_index and add feature extractor tests
      
      * Apply suggestions from code review
      
      * Apply suggestions from code review
      
      * Update pretrained model name and add to doc tests
      
      * Remove test.py script
      
      * Update copied from statements and clean up
      Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
      0c55d47c
  31. 15 Mar, 2022 1 commit
  32. 14 Mar, 2022 1 commit
  33. 10 Mar, 2022 2 commits
  34. 02 Mar, 2022 1 commit
    • Francesco Saverio Zuppichini's avatar
      Maskformer (#15682) · d83d22f5
      Francesco Saverio Zuppichini authored
      
      
      * maskformer
      
      * conflicts
      
      * conflicts
      
      * minor fixes
      
      * feature extractor test fix
      
      refactor MaskFormerLoss following conversation
      
      MaskFormer related types should not trigger a module time import error
      
      missed one
      
      removed all the types that are not used
      
      update config mapping
      
      minor updates in the doc
      
      resolved conversation that doesn't need a discussion
      
      minor changes
      
      resolved conversations
      
      fixed DetrDecoder
      
      * minor changes
      
      minor changes
      
      fixed mdx file
      
      test feature_extractor return types
      
      functional losses -> classes
      
      removed the return type test for the feature extractor
      
      minor changes + style + quality
      
      * conflicts?
      
      * rebase master
      
      * readme
      
      * added missing files
      
      * deleted poolformer tests that were in the wrong place
      
      * CI
      
      * minor changes
      
      * Apply suggestions from code review
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * resolved conversations
      
      * minor changes
      
      * conversations
      
      [Unispeech] Fix slow tests (#15818)
      
      * remove soundfile old way of loading audio
      
      * Adapt slow test
      
      [Barthez Tokenizer] Fix saving (#15815)
      
      [TFXLNet] Correct tf xlnet generate (#15822)
      
      * [TFXLNet] Correct tf xlnet
      
      * adapt test comment
      
      Fix the push run (#15807)
      
      Fix semantic segmentation pipeline test (#15826)
      
      Fix dummy_inputs() to dummy_inputs in symbolic_trace doc (#15776)
      
      Add model specific output classes to PoolFormer model docs (#15746)
      
      * Added model specific output classes to poolformer docs
      
      * Fixed Segformer typo in Poolformer docs
      
      Adding the option to return_timestamps on pure CTC ASR models. (#15792)
      
      * Adding the option to return_timestamps on pure CTC ASR models.
      
      * Remove `math.prod` which was introduced in Python 3.8
      
      * int are not floats.
      
      * Reworking the PR to support "char" vs "word" output.
      
      * Fixup!
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/pipelines/automatic_speech_recognition.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Quality.
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      HFTracer.trace should use/return self.graph to be compatible with torch.fx.Tracer (#15824)
      
      Fix tf.concatenate + test past_key_values for TF models (#15774)
      
      * fix wrong method name tf.concatenate
      
      * add tests related to causal LM / decoder
      
      * make style and quality
      
      * clean-up
      
      * Fix TFBertModel's extended_attention_mask when past_key_values is provided
      
      * Fix tests
      
      * fix copies
      
      * More tf.int8 -> tf.int32 in TF test template
      
      * clean-up
      
      * Update TF test template
      
      * revert the previous commit + update the TF test template
      
      * Fix TF template extended_attention_mask when past_key_values is provided
      
      * Fix some styles manually
      
      * clean-up
      
      * Fix ValueError: too many values to unpack in the test
      
      * Fix more: too many values to unpack in the test
      
      * Add a comment for extended_attention_mask when there is past_key_values
      
      * Fix TFElectra extended_attention_mask when past_key_values is provided
      
      * Add tests to other TF models
      
      * Fix for TF Electra test: add prepare_config_and_inputs_for_decoder
      
      * Fix not passing training arg to lm_head in TFRobertaForCausalLM
      
      * Fix tests (with past) for TF Roberta
      
      * add testing for past_key_values for TFElectra model
      Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
      
      [examples/summarization and translation] fix readme (#15833)
      
      Add ONNX Runtime quantization for text classification notebook (#15817)
      
      Re-enable doctests for the quicktour (#15828)
      
      * Re-enable doctests for the quicktour
      
      * Re-enable doctests for task_summary (#15830)
      
      * Remove &
      
      Framework split model report (#15825)
      
      Add TFConvNextModel (#15750)
      
      * feat: initial implementation of convnext in tensorflow.
      
      * fix: sample code for the classification model.
      
      * chore: added checked for  from the classification model.
      
      * chore: set bias initializer in the classification head.
      
      * chore: updated license terms.
      
      * chore: removed unused imports
      
      * feat: enabled  argument during using drop_path.
      
      * chore: replaced tf.identity with layers.Activation(linear).
      
      * chore: edited default checkpoint.
      
      * fix: minor bugs in the initializations.
      
      * partial-fix: tf model errors for loading pretrained pt weights.
      
      * partial-fix: call method updated
      
      * partial-fix: cross loading of weights (4x3 variables to be matched)
      
      * chore: removed unneeded comment.
      
      * removed playground.py
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * fix: renaming TFConvNextStage conv and layer norm layers
      
      * chore: added initializers and other minor additions.
      
      * chore: added initializers and other minor additions.
      
      * add: tests for convnext.
      
      * fix: integration tester class.
      
      * fix: issues mentioned in pr feedback (round 1).
      
      * fix: how output_hidden_states arg is propagated inside the network.
      
      * feat: handling of  arg for pure cnn models.
      
      * chore: added a note on equal contribution in model docs.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * feat: encapsulation for the convnext trunk.
      
      * Fix variable naming; Test-related corrections; Run make fixup
      
      * chore: added Joao as a contributor to convnext.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * chore: corrected copyright year and added comment on NHWC.
      
      * chore: fixed the black version and ran formatting.
      
      * chore: ran make style.
      
      * chore: removed from_pt argument from test, ran make style.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * fix: tests in the convnext subclass, ran make style.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * rebasing
      
      * rebasing and removing playground.py.
      
      * chore: moved convnext test to the correct location
      
      * fix: locations for the test file of convnext.
      
      * fix: convnext tests.
      
      * chore: applied  sgugger's suggestion for dealing w/ output_attentions.
      
      * chore: added comments.
      
      * chore: applied updated quality environment style.
      
      * chore: applied formatting with quality environment.
      
      * chore: revert to the previous tests/test_modeling_common.py.
      
      * chore: revert to the original test_modeling_common.py
      
      * chore: revert to previous states for test_modeling_tf_common.py and modeling_tf_utils.py
      
      * fix: tests for convnext.
      
      * chore: removed output_attentions argument from convnext config.
      
      * chore: revert to the earlier tf utils.
      
      * fix: output shapes of the hidden states
      
      * chore: removed unnecessary comment
      
      * chore: reverting to the right test_modeling_tf_common.py.
      
      * Styling nits
      Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
      Co-authored-by: Joao Gante <joao@huggingface.co>
      Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
      
      * minor changes
      
      * doc fix in feature extractor
      
      * doc
      
      * typos
      
      * removed detr logic from config
      
      * removed detr logic from config
      
      * removed num_labels
      
      * small fix in the config
      
      * auxilary -> auxiliary
      
      * make style
      
      * some test is failing
      
      * fix a weird char in config preventing doc-builder
      
      * retry to fix the doc-builder issue
      
      * make style
      
      * new try to fix the doc builder
      
      * CI
      
      * change weights to facebook
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: ariG23498 <aritra.born2fly@gmail.com>
      Co-authored-by: Joao Gante <joao@huggingface.co>
      Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>
      d83d22f5