1. 04 Mar, 2024 1 commit
    • Add UDOP (#22940) · 836921fd
      NielsRogge authored
      
      
      * First draft
      
      * More improvements
      
      * More improvements
      
      * More fixes
      
      * Fix copies
      
      * More improvements
      
      * More fixes
      
      * More improvements
      
      * Convert checkpoint
      
      * More improvements, set up tests
      
      * Fix more tests
      
      * Add UdopModel
      
      * More improvements
      
      * Fix equivalence test
      
      * More fixes
      
      * Redesign model
      
      * Extend conversion script
      
      * Use real inputs for conversion script
      
      * Add image processor
      
      * Improve conversion script
      
      * Add UdopTokenizer
      
      * Add fast tokenizer
      
      * Add converter
      
      * Update README's
      
      * Add processor
      
      * Add fully fledged tokenizer
      
      * Add fast tokenizer
      
      * Use processor in conversion script
      
      * Add tokenizer tests
      
      * Fix one more test
      
      * Fix more tests
      
      * Fix tokenizer tests
      
      * Enable fast tokenizer tests
      
      * Fix more tests
      
      * Fix additional_special_tokens of fast tokenizer
      
      * Fix tokenizer tests
      
      * Fix more tests
      
      * Fix equivalence test
      
      * Rename image to pixel_values
      
      * Rename seg_data to bbox
      
      * More renamings
      
      * Remove vis_special_token
      
      * More improvements
      
      * Add docs
      
      * Fix copied from
      
      * Update slow tokenizer
      
      * Update fast tokenizer design
      
      * Make text input optional
      
      * Add first draft of processor tests
      
      * Fix more processor tests
      
      * Fix decoder_start_token_id
      
      * Fix test_initialization
      
      * Add integration test
      
      * More improvements
      
      * Improve processor, add test
      
      * Add more copied from
      
      * Add more copied from
      
      * Add more copied from
      
      * Add more copied from
      
      * Remove print statement
      
      * Update README and auto mapping
      
      * Delete files
      
      * Delete another file
      
      * Remove code
      
      * Fix test
      
      * Fix docs
      
      * Remove asserts
      
      * Add doc tests
      
      * Include UDOP in exotic model tests
      
      * Add expected tesseract decodings
      
      * Add sentencepiece
      
      * Use same design as T5
      
      * Add UdopEncoderModel
      
      * Add UdopEncoderModel to tests
      
      * More fixes
      
      * Fix fast tokenizer
      
      * Fix one more test
      
      * Remove parallelisable attribute
      
      * Fix copies
      
      * Remove legacy file
      
      * Copy from T5Tokenizer
      
      * Fix rebase
      
      * More fixes, copy from T5
      
      * More fixes
      
      * Fix init
      
      * Use ArthurZ/udop for tests
      
      * Make all model tests pass
      
      * Remove UdopForConditionalGeneration from auto mapping
      
      * Fix more tests
      
      * fixups
      
      * more fixups
      
      * fix the tokenizers
      
      * remove un-necessary changes
      
      * nits
      
      * nits
      
      * replace truncate_sequences_boxes with truncate_sequences for fix-copies
      
      * nit current path
      
      * add a test for input ids
      
      * ids that we should get taken from c9f7a32f57440d90ff79890270d376a1cc0acb68
      
      * nits converting
      
      * nits
      
      * apply ruff
      
      * nits
      
      * nits
      
      * style
      
      * fix slow order of addition
      
      * fix udop fast range as well
      
      * fixup
      
      * nits
      
      * Add docstrings
      
      * Fix gradient checkpointing
      
      * Update code examples
      
      * Skip tests
      
      * Update integration test
      
      * Address comment
      
      * Make fixup
      
      * Remove extra ids from tokenizer
      
      * Skip test
      
      * Apply suggestions from code review
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update year
      
      * Address comment
      
      * Address more comments
      
      * Address comments
      
      * Add copied from
      
      * Update CI
      
      * Rename script
      
      * Update model id
      
      * Add AddedToken, skip tests
      
      * Update CI
      
      * Fix doc tests
      
      * Do not use Tesseract for the doc tests
      
      * Remove kwargs
      
      * Add original inputs
      
      * Update casting
      
      * Fix doc test
      
      * Update question
      
      * Update question
      
      * Use LayoutLMv3ImageProcessor
      
      * Update organization
      
      * Improve docs
      
      * Update forward signature
      
      * Make images optional
      
      * Remove deprecated device argument
      
      * Add comment, add add_prefix_space
      
      * More improvements
      
      * Remove kwargs
      
      ---------
      Co-authored-by: ArthurZucker <arthur.zucker@gmail.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
  2. 28 Feb, 2024 1 commit
  3. 26 Feb, 2024 2 commits
  4. 21 Feb, 2024 2 commits
  5. 16 Feb, 2024 1 commit
  6. 14 Feb, 2024 1 commit
    • Add `StableLM` (#28810) · de6029a0
      Jonathan Tow authored
      * Add `StableLM`
      
      * fix(model): re-create from `huggingface-cli add-new-model-like persimmon`
      
      * fix: re-add changes to address comments
      
      * fix(readme): add links to paper
      
      * fix(tokenization_auto): remove `GPTNeoXTokenizerFastFast` ref
      
      * fix(tests): re-add `@slow` decorator to integration tests
      
      * fix(tests): import slow...
      
      * fix(readme_hd): remove whitespace edit
      
      * fix(tokenizer): auto tokenizer tuple
      
      * skip doctests for `modeling_stablelm`
  7. 12 Feb, 2024 1 commit
  8. 09 Feb, 2024 1 commit
  9. 06 Feb, 2024 1 commit
    • [Docs] Add missing language options and fix broken links (#28852) · 1c31b7aa
      Klaus Hipp authored
      * Add missing entries to the language selector
      
      * Add links to the Colab and AWS Studio notebooks for ONNX
      
      * Use anchor links in CONTRIBUTING.md
      
      * Fix broken hyperlinks due to spaces
      
      * Fix links to OpenAI research articles
      
      * Remove confusing footnote symbols from author names, as they are also considered invalid markup
  10. 29 Jan, 2024 1 commit
  11. 25 Jan, 2024 1 commit
    • Add Depth Anything (#28654) · 963db81a
      NielsRogge authored
      * First draft
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * Add docs
      
      * Remove file
      
      * Add copied from
      
      * Address comments
      
      * Address comments
      
      * Address comments
      
      * Fix style
      
      * Update docs
      
      * Convert all checkpoints, add integration test
      
      * Rename checkpoints
      
      * Add pretrained backbone attributes
      
      * Fix default config
      
      * Address comment
      
      * Add figure to docs
      
      * Fix bug thanks to @xenova
      
      * Update conversion script
      
      * Fix integration test
  12. 19 Jan, 2024 1 commit
  13. 18 Jan, 2024 1 commit
    • Add new meta w2v2-conformer BERT-like model (#28165) · d2cdefb9
      Yoach Lacombe authored
      
      
      * first commit
      
      * correct default value non causal
      
      * update config and modeling code
      
      * update converting checkpoint
      
      * clean modeling and fix tests
      
      * make style
      
      * add new config parameters to docstring
      
      * fix copied from statements
      
      * Apply suggestions from code review
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * make position_embeddings_type docstrings clearer
      
      * clean converting script
      
      * remove function not used
      
      * clean modeling file
      
      * apply suggestion for test file + add convert script to not_doctested
      
      * modify tests according to review - cleaner logic and more tests
      
      * Apply nit suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * add checker of valid position embeddings type
      
      * instantiate new layer norm layer with the right eps
      
      * fix freeze_feature_encoder since it can be None in some cases
      
      * add test same output in convert script
      
      * restore wav2vec2conformer and add new model
      
      * create processor and FE + clean
      
      * add new model code
      
      * fix convert script and set default config parameters
      
      * correct model id paths
      
      * make style
      
      * make fix-copies and cleaning files
      
      * fix copied from statements
      
      * complete .md and fix copies
      
      * clean convert script argument defaults
      
      * fix config parameters docstrings
      
      * fix config docstring
      
      * add copied from and enrich FE tests
      
      * fix copied from and repo-consistency
      
      * add autotokenizer
      
      * make test input length shorter and change docstring code
      
      * fix docstrings and copied from
      
      * add add_adapter to ASR training example
      
      * make testing of adapters more robust
      
      * adapt to multi adapter layers
      
      * refactor input_values->input_features and remove w2v2-bert feature extractor
      
      * remove pretraining model
      
      * remove deprecated features and useless lines
      
      * add copied from and ignore statements to modeling tests
      
      * remove pretraining model #2
      
      * change import in convert script
      
      * change default in convert script
      
      * update readme and remove useless line
      
      * Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * refactor BERT to Bert for consistency
      
      * remove useless ignore copy statement
      
      * add persistent to buffer in rotary
      
      * add eps in LayerNorm init and remove copied from
      
      * add adapter activation parameters and add copied from statements
      
      * Fix copied statements and add unittest.skip reasons
      
      * add copied statement in test_processor
      
      * refactor processor
      
      * make style
      
      * replace numpy random by torch rand
      
      * remove expected output CTC
      
      * improve converting script with processor class
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * remove gumbel class
      
      * remove tests related to previously deleted class
      
      * Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * correct typos
      
      * remove unused parameters
      
      * update processor to take both text and audio
      
      * update checkpoints
      
      * update expected output and add ctc expected output
      
      * add label_attention_mask
      
      * replace pt with np in processor tests
      
      * fix typo
      
      * revert to behaviour with labels_attention_mask
      
      ---------
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
  14. 17 Jan, 2024 1 commit
    • Add qwen2 (#28436) · d6ffe74d
      Junyang Lin authored
      
      
      * add config, modeling, and tokenization
      
      * add auto and init
      
      * update readme
      
      * update readme
      
      * update team name
      
      * fixup
      
      * fixup
      
      * update config
      
      * update code style
      
      * update for fixup
      
      * update for fixup
      
      * update for fixup
      
      * update for testing
      
      * update for testing
      
      * fix bug for config and tokenization
      
      * fix bug for bos token
      
      * not doctest
      
      * debug tokenizer
      
      * not doctest
      
      * debug tokenization
      
      * debug init for tokenizer
      
      * fix style
      
      * update init
      
      * delete if in token auto
      
      * add tokenizer doc
      
      * add tokenizer in init
      
      * Update dummy_tokenizers_objects.py
      
      * update
      
      * update
      
      * debug
      
      * Update tokenization_qwen2.py
      
      * debug
      
      * Update convert_slow_tokenizer.py
      
      * add copies
      
      * add copied from and make style
      
      * update files map
      
      * update test
      
      * fix style
      
      * fix merge reading and update tests
      
      * fix tests
      
      * fix tests
      
      * fix style
      
      * debug a variable in readme
      
      * Update src/transformers/models/qwen2/configuration_qwen2.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * update test and copied from
      
      * fix style
      
      * update qwen2 tokenization  and tests
      
      * Update tokenization_qwen2.py
      
      * delete the copied from after property
      
      * fix style
      
      * update tests
      
      * update tests
      
      * add copied from
      
      * fix bugs
      
      * update doc
      
      * add warning for sliding window attention
      
      * update qwen2 tokenization
      
      * fix style
      
      * Update src/transformers/models/qwen2/modeling_qwen2.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fix tokenizer fast
      
      ---------
      Co-authored-by: Ren Xuancheng <jklj077@users.noreply.github.com>
      Co-authored-by: renxuancheng.rxc <renxuancheng.rxc@alibaba-inc.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
  15. 11 Jan, 2024 1 commit
  16. 10 Jan, 2024 1 commit
  17. 08 Jan, 2024 1 commit
    • Add SigLIP (#26522) · 3b742ea8
      NielsRogge authored
      
      
      * Add first draft
      
      * Use appropriate gelu function
      
      * More improvements
      
      * More improvements
      
      * More improvements
      
      * Convert checkpoint
      
      * More improvements
      
      * Improve docs, remove print statements
      
      * More improvements
      
      * Add link
      
      * remove unused masking function
      
      * begin tokenizer
      
      * do_lower_case
      
      * debug
      
      * set split_special_tokens=True
      
      * Remove script
      
      * Fix style
      
      * Fix rebase
      
      * Use same design as CLIP
      
      * Add fast tokenizer
      
      * Add SiglipTokenizer to init, remove extra_ids
      
      * Improve conversion script
      
      * Use smaller inputs in conversion script
      
      * Update conversion script
      
      * More improvements
      
      * Add processor to conversion script
      
      * Add tests
      
      * Remove print statements
      
      * Add tokenizer tests
      
      * Fix more tests
      
      * More improvements related to weight initialization
      
      * More improvements
      
      * Make more tests pass
      
      * More improvements
      
      * More improvements
      
      * Add copied from
      
      * Add canonicalize_text
      
      * Enable fast tokenizer tests
      
      * More improvements
      
      * Fix most slow tokenizer tests
      
      * Address comments
      
      * Fix style
      
      * Remove script
      
      * Address some comments
      
      * Add copied from to tests
      
      * Add more copied from
      
      * Add more copied from
      
      * Add more copied from
      
      * Remove is_flax_available
      
      * More updates
      
      * Address comment
      
      * Remove SiglipTokenizerFast for now
      
      * Add caching
      
      * Remove umt5 test
      
      * Add canonicalize_text inside _tokenize, thanks Arthur
      
      * Fix image processor tests
      
      * Skip tests which are not applicable
      
      * Skip test_initialization
      
      * More improvements
      
      * Compare pixel values
      
      * Fix doc tests, add integration test
      
      * Add do_normalize
      
      * Remove causal mask and leverage ignore copy
      
      * Fix attention_mask
      
      * Fix remaining tests
      
      * Fix dummies
      
      * Rename temperature and bias
      
      * Address comments
      
      * Add copied from to tokenizer tests
      
      * Add SiglipVisionModel to auto mapping
      
      * Add copied from to image processor tests
      
      * Improve doc
      
      * Remove SiglipVisionModel from index
      
      * Address comments
      
      * Improve docs
      
      * Simplify config
      
      * Add first draft
      
      * Make it like mistral
      
      * More improvements
      
      * Fix attention_mask
      
      * Fix output_attentions
      
      * Add note in docs
      
      * Convert multilingual model
      
      * Convert large checkpoint
      
      * Convert more checkpoints
      
      * Add pipeline support, correct image_mean and image_std
      
      * Use padding=max_length by default
      
      * Make processor like llava
      
      * Add code snippet
      
      * Convert more checkpoints
      
      * Set keep_punctuation_string=None as in OpenCLIP
      
      * Set normalized=False for special tokens
      
      * Fix doc test
      
      * Update integration test
      
      * Add figure
      
      * Update organization
      
      * Happy new year
      
      * Use AutoModel everywhere
      
      ---------
      Co-authored-by: patil-suraj <surajp815@gmail.com>
  18. 04 Jan, 2024 1 commit
  19. 03 Jan, 2024 1 commit
    • Add FastSpeech2Conformer (#23439) · d83ff5ee
      Connor Henderson authored
      * start - docs, SpeechT5 copy and rename
      
      * add relevant code from FastSpeech2 draft, have tests pass
      
      * make it an actual conformer, demo ex.
      
      * matching inference with original repo, includes debug code
      
      * refactor nn.Sequentials, start more desc. var names
      
      * more renaming
      
      * more renaming
      
      * vocoder scratchwork
      
      * matching vocoder outputs
      
      * hifigan vocoder conversion script
      
      * convert model script, rename some config vars
      
      * replace postnet with speecht5's implementation
      
      * passing common tests, file cleanup
      
      * expand testing, add output hidden states and attention
      
      * tokenizer + passing tokenizer tests
      
      * variety of updates and tests
      
      * g2p_en pckg setup
      
      * import structure edits
      
      * docstrings and cleanup
      
      * repo consistency
      
      * deps
      
      * small cleanup
      
      * forward signature param order
      
      * address comments except for masks and labels
      
      * address comments on attention_mask and labels
      
      * address second round of comments
      
      * remove old unneeded line
      
      * address comments part 1
      
      * address comments pt 2
      
      * rename auto mapping
      
      * fixes for failing tests
      
      * address comments part 3 (bart-like, train loss)
      
      * make style
      
      * pass config where possible
      
      * add forward method + tests to WithHifiGan model
      
      * make style
      
      * address arg passing and generate_speech comments
      
      * address Arthur comments
      
      * address Arthur comments pt2
      
      * lint  changes
      
      * Sanchit comment
      
      * add g2p-en to doctest deps
      
      * move up self.encoder
      
      * onnx compatible tensor method
      
      * fix is symbolic
      
      * fix paper url
      
      * move models to espnet org
      
      * make style
      
      * make fix-copies
      
      * update docstring
      
      * Arthur comments
      
      * update docstring w/ new updates
      
      * add model architecture images
      
      * header size
      
      * md wording update
      
      * make style
  20. 13 Dec, 2023 2 commits
    • Dev version · 3ed3e319
      Lysandre authored
    • Adds VIP-llava to transformers (#27932) · c7f076a0
      Younes Belkada authored
      * v1
      
      * add-new-model-like
      
      * revert
      
      * fix forward and conversion script
      
      * revert
      
      * fix copies
      
      * fixup
      
      * fix
      
      * Update docs/source/en/index.md
      
      * Apply suggestions from code review
      
      * push
      
      * fix
      
      * fixes here and there
      
      * up
      
      * fixup and fix tests
      
      * Apply suggestions from code review
      
      * add docs
      
      * fixup
      
      * fixes
      
      * docstring
      
      * add docstring
      
      * fixup
      
      * docstring
      
      * fixup
      
      * nit
      
      * docs
      
      * more copies
      
      * fix copies
      
      * nit
      
      * update test
  21. 11 Dec, 2023 2 commits
    • [`Add Mixtral`] Adds support for the Mixtral MoE (#27942) · accccdd0
      Arthur authored
      
      
      * up
      
      * up
      
      * test
      
      * logits ok
      
      * up
      
      * up
      
      * few fixes
      
      * conversion script
      
      * up
      
      * nits
      
      * nits
      
      * update
      
      * nuke
      
      * more updates
      
      * nites
      
      * fix many issues
      
      * nit
      
      * scatter
      
      * nit
      
      * nuke megablocks
      
      * nits
      
      * fix conversion script
      
      * nit
      
      * remove
      
      * nits
      
      * nit
      
      * update
      
      * oupsssss
      
      * change
      
      * nits device
      
      * nits
      
      * fixup
      
      * update
      
      * merge
      
      * add copied from
      
      * fix the copy mentions
      
      * update tests
      
      * more fixes
      
      * nits
      
      * conversion script
      
      * add parts of the readme
      
      * Update tests/models/mixtral/test_modeling_mixtral.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * new test + conversion script
      
      * Apply suggestions from code review
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Apply suggestions from code review
      
      * fix
      
      * fix copies
      
      * fix copies
      
      * ooops
      
      * fix config
      
      * Apply suggestions from code review
      
      * fix nits
      
      * nit
      
      * add copies
      
      * add batched tests
      
      * docs
      
      * fix flash attention
      
      * let's add more verbose
      
      * add correct outputs
      
      * support router outputs
      
      * ignore copies where needed
      
      * fix
      
      * cat list if list is given for now
      
      * nits
      
      * Update docs/source/en/model_doc/mixtral.md
      
      * finish router refactoring
      
      * fix forward
      
      * fix expected values
      
      * nits
      
      * fixup
      
      * fix
      
      * fix bug
      
      * fix
      
      * fix dtype mismatch
      
      * fix
      
      * grrr grrr I support item assignment
      
      * fix CI
      
      * docs
      
      * fixup
      
      * remove some copied from
      
      * fix weird diff
      
      * skip doctest fast on the config and modeling
      
      * mark that it supports flash attention in the doc
      
      * update
      
      * Update src/transformers/models/mixtral/modeling_mixtral.py
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      
      * Update docs/source/en/model_doc/mixtral.md
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      
      * revert router logits config issue
      
      * update doc accordingly
      
      * Update src/transformers/models/mixtral/convert_mixtral_weights_to_hf.py
      
      * nits
      
      * use torch testing assert close
      
      * fixup
      
      * doc nits
      
      ---------
      Co-authored-by: younesbelkada <younesbelkada@gmail.com>
      Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <hi@lysand.re>
    • [LLaVa] Some improvements (#27895) · 7ea21f1f
      NielsRogge authored
      * More improvements
      
      * Improve variable names
      
      * Update READMEs, improve docs
  22. 07 Dec, 2023 1 commit
    • [`Llava`] Add Llava to transformers (#27662) · 44b5506d
      Younes Belkada authored
      * add model like
      
      * logits match
      
      * minor fixes
      
      * fixes
      
      * up
      
      * up
      
      * add todo
      
      * llava processor
      
      * keep the processor simple
      
      * add conversion script
      
      * fixup
      
      * fix copies
      
      * up
      
      * add to index
      
      * fix config + logits
      
      * fix
      
      * refactor
      
      * more refactor
      
      * more refactor
      
      * fix copies
      
      * add authors
      
      * v1 tests
      
      * add `LlavaProcessor` in init
      
      * remove unneeded import
      
      * up
      
      * up
      
      * docs
      
      * up
      
      * fix CI
      
      * fix CI
      
      * add attention  mask in test
      
      * make fixup
      
      * remove the vision model
      
      * that's the dirty way to do it
      
      * nits
      
      * nits
      
      * updates
      
      * add more tests
      
      * add input tests
      
      * fixup
      
      * more styling
      
      * nits
      
      * updates and cleanup
      
      * fixup the generation expected results
      
      * fix the testing script
      
      * some cleanup and simplification which does not work yet but almost there!
      
      * make correct dispatch operations
      
      * vectorize works for batch of images and text
      
      * last todos
      
      * nits
      
      * update test and modeling code
      
      * remove useless function for now
      
      * fix few issues
      
      * fix generation
      
      * some nits
      
      * add bakllava
      
      * nits
      
      * remove duplicated code
      
      * finish merge
      
      * cleanup
      
      * missed this line
      
      * fill the todos
      
      * add left padding offset
      
      * add left and right padding logic
      
      * bool to properly index
      
      * make sure
      
      * more cleanups
      
      * batch is fixed 😉
      
      
      
      * add correct device for tensor creation
      
      * fix some dtype mismatch
      
      * ruff
      
      * update conversion script
      
      * Update src/transformers/__init__.py
      
      * fa 2 support + fix conversion script
      
      * more
      
      * correct reshaping
      
      * fix test dict
      
      * fix copies by ignoring
      
      * fix nit
      
      * skip clip vision model
      
      * fixup
      
      * fixup
      
      * LlavaForVisionText2Text -> LlavaForCausalLM
      
      * update
      
      * fix
      
      * raise correct errors
      
      * fix
      
      * docs
      
      * nuke for now
      
      * nits here and there
      
      * fixup
      
      * fix remaining tests
      
      * update LlavaForConditionalGeneration instead of CausalLM
      
      * fixups
      
      * pipeline support
      
      * slow and pipeline tests
      
      * supports batch
      
      * nits
      
      * cleanup
      
      * fix first integration tests
      
      * add pad token where needed
      
      * correct tests
      
      * fixups
      
      * update pipeline test
      
      * fix quality
      
      * nits
      
      * revert unneeded change
      
      * nit
      
      * use BatchFeature
      
      * from ...feature_extraction_utils import BatchFeature
      
      * nits
      
      * nits
      
      * properly update
      
      * more f*** nits
      
      * fix copies
      
      * comment
      
      * keep slow test slow
      
      * Update src/transformers/models/llava/processing_llava.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * add pipeline example
      
      * add pixel values in docstring
      
      * update pr doctest
      
      * fix
      
      * fix slow tests
      
      * remove hack
      
      * fixup
      
      * small note
      
      * forward contrib credits from PR25789
      
      * forward contrib credits from original implementation and work
      
      * add arthur
      
      * Update src/transformers/models/llava/processing_llava.py
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      
      * update docstring
      
      * nit
      
      * move to not doctested because of timeout issues
      
      * fixup
      
      * add description
      
      * more
      
      * fix-copies
      
      * fix docs
      
      * add beam search
      
      * add more comments
      
      * add typehints on processor
      
      * add speedup plot
      
      * update slow tests and docs
      
      * push test
      
      * push batched test
      
      * fix batched generation with different number of images
      
      * remove benchmark due to a bug
      
      * fix test
      
      * fix copies
      
      * add gcolab demo
      
      ---------
      Co-authored-by: Arthur Zucker <arthur.zucker@gmail.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: shauray8 <shauray8@users.noreply.github.com>
      Co-authored-by: haotian-liu <haotian-liu@users.noreply.github.com>
      Co-authored-by: Lysandre Debut <hi@lysand.re>
      44b5506d
  23. 05 Dec, 2023 1 commit
    • [Time series] Add PatchTSMixer (#26247) · b242d0f2
      Arindam Jati authored
      
      
      * patchtsmixer initial commit
      
      * x,y->context_values,target_values, unittest added
      
      * cleanup code
      
      * minor
      
      * return hidden states
      
      * model tests, partial integration tests
      
      * ettm notebook temporary
      
      * minor
      
      * config mask bug fix, tests updated
      
      * final ETT notebooks
      
      * add selfattn
      
      * init
      
      * added docstrings
      
      * PatchTSMixerForPretraining -> PatchTSMixerForMaskPretraining
      
      * functionality tests added
      
      * add start and input docstrings
      
      * docstring edits
      
      * testcase edits
      
      * minor changes
      
      * docstring error fixed
      
      * ran make fixup
      
      * finalize integration tests and docs
      
      * minor
      
      * cleaned gitignore
      
      * added dataclass decorator, ran black formatter
      
      * ran ruff
      
      * formatting
      
      * add slow decorator
      
      * renamed in_Channel to input_size and default to 1
      
      * shorten dataclass names
      
      * use smaller model for testing
      
      * moved the 3 heads to the modeling file
      
      * use scalers instead of revin
      
      * support forecast_channel_indices
      
      * fix regression scaling
      
      * undo reg. scaling
      
      * removed unneeded classes
      
      * forgot missing
      
      * add more layers
      
      * add copied positional_encoding
      
      * use patchmask from patchtst
      
      * removed dependency on layers directory
      
      * formatting
      
      * set seed
      
      * removed unused imports
      
      * fixed forward signature test
      
      * adding distributional head for PatchTSMixerForecasting
      
      * add generate to forecast
      
      * testcases for generate
      
      * add generate and distributional head for regression
      
      * raise Exception for negative values for negative binomial distribution
      
      * formatting changes
      
      * remove copied from patchtst and add TODO for test passing
      
      * make copies
      
      * doc edits
      
      * minor changes
      
      * format issues
      
      * minor changes
      
      * minor changes
      
      * format docstring
      
      * change some class names to PatchTSMixer + class name
      
      Transpose to PatchTSMixerTranspose
      GatedAttention to PatchTSMixerGatedAttention
      
      * change NormLayer to PatchTSMixerNormLayer
      
      * change MLP to PatchTSMixerMLP
      
      * change PatchMixer to PatchMixerBlock, FeatureMixer to FeatureMixerBlock
      
      * change ChannelFeatureMixer to ChannelFeatureMixerBlock
      
      * change PatchMasking to PatchTSMixerMasking
      
      * change Patchify to PatchTSMixerPatchify
      
      * list to `list`
      
      * fix docstrings
      
      * formatting
      
      * change bs to batch_size, edit forecast_masking
      
      * edit random_masking
      
      * change variable name and update docstring in PatchTSMixerMasking
      
      * change variable name and update docstring in InjectScalerStatistics4D
      
      * update forward call in PatchTSMixerTranspose
      
      * change variable name and update docstring in PatchTSMixerNormLayer
      
      * change variable name and update docstring in PatchTSMixerMLP
      
      * change variable name and update docstring in ChannelFeatureMixerBlock
      
      * formatting
      
      * formatting issues
      
      * docstring issue
      
      * fixed observed_mask type in docstrings
      
      * use FloatTensor type
      
      * formatting
      
      * fix rescaling issue in forecasting, fixed integration tests
      
      * add docstring from decorator
      
      * fix docstring
      
      * Update README.md
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/patchtsmixer/configuration_patchtsmixer.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/patchtsmixer/configuration_patchtsmixer.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * PatchTSMixerChannelFeatureMixerBlock
      
      * formatting
      
      * ForPretraining
      
      * use num_labels instead of n_classes
      
      * remove commented out code
      
      * docstring fixed
      
      * nn.functional used instead of one letter F
      
      * x_tmp renamed
      
      * one letter variable x removed from forward calls
      
      * one letter variable y removed
      
      * remove commented code
      
      * rename patch_size, in_channels, PatchTSMixerBackbone
      
      * add config to heads
      
      * add config to heads tests
      
      * code refactoring to use config instead of passing individual params
      
      * docstring fixes part 1
      
      * docstring fixes part 2
      
      * removed logger.debug
      
      * context_values -> past_values
      
      * formatting changes
      
      * pe -> positional_encoding
      
      * removed unused target variable
      
      * self.mode logic fixed
      
      * formatting change
      
      * edit docstring and var name
      
      * change n_targets to num_targets
      
      * rename input_size to num_input_channels
      
      * add head names with prefix PatchTSMixer
      
      * edit docstring in PatchTSMixerForRegression
      
      * fix var name change in testcases
      
      * add PatchTSMixerAttention
      
      * return dict for all exposed classes, test cases added
      
      * format
      
      * move loss function to forward call
      
      * make style
      
      * adding return dict/tuple
      
      * make repo-consistency
      
      * remove flatten mode
      
      * code refactoring
      
      * rename data
      
      * remove PatchTSMixer and keep only PatchTSMixerEncoder
      
      * docstring fixes
      
      * removed unused code
      
      * format
      
      * format
      
      * remove contiguous and formatting changes
      
      * remove model description from config
      
      * replace asserts with ValueError
      
      * remove nn.Sequential from PatchTSMixerNormLayer
      
      * replace if-else with map
      
      * remove all nn.Sequential
      
      * format
      
      * formatting
      
      * fix gradient_checkpointing error after merge, and formatting
      
      * make fix-copies
      
      * remove comments
      
      * reshape
      
      * doesnt support gradient checkpointing
      
      * correct Patchify
      
      * masking updates
      
      * batchnorm copy from
      
      * format checks
      
      * scaler edits
      
      * remove comments
      
      * format changes
      
      * remove self.config
      
      * correct class PatchTSMixerMLP(nn.Module):
      
      * make fix
      
      * doc updates
      
      * fix-copies
      
      * scaler class correction
      
      * doc edits
      
      * scaler edits
      
      * update readme with links
      
      * injectstatistics add
      
      * fix-copies
      
      * add norm_eps option to LayerNorm
      
      * format changes
      
      * fix copies
      
      * correct make copies
      
      * use parametrize
      
      * fix doc string
      
      * add docs to toctree
      
      * make style
      
      * doc segmenting
      
      * docstring edit
      
      * change forecast to prediction
      
      * edit doc
      
      * doc edits
      
      * remove PatchTSMixerTranspose
      
      * add PatchTSMixerPositionalEncoding and init position_enc
      
      * remove positional_encoding
      
      * edit forecast_masking, remove forecast_mask_ratios
      
      * fix broken code
      
      * var rename target_values -> future_values
      
      * num_features -> d_model
      
      * fix broken code after master merge
      
      * repo consistency
      
      * use positional embedding
      
      * prediction_logits -> prediction_outputs, make fix-copies
      
      * uncommented @slow
      
      * minor changes
      
      * loss first in tuple
      
      * tuple and dict same ordering
      
      * style edits
      
      * minor changes
      
      * dict/tuple consistent enablement
      
      * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * fix formatting
      
      * formatting
      
      * usage tip
      
      * test on cpu only
      
      * add sample usage
      
      * change PatchTSMixerForClassification to PatchTSMixerForTimeSeriesClassification
      
      * push changes
      
      * fix copies
      
      * std scaling set to default True case
      
      * minor changes
      
      * style changes
      
      ---------
      Co-authored-by: Arindam Jati <arindam.jati@ibm.com>
      Co-authored-by: vijaye12 <vijaye12@in.ibm.com>
      Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
      Co-authored-by: nnguyen <nnguyen@us.ibm.com>
      Co-authored-by: vijaye12 <vijaykr.e@gmail.com>
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: Nam Nguyen <namctin@gmail.com>
      Co-authored-by: Wesley Gifford <79663411+wgifford@users.noreply.github.com>
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      b242d0f2
  24. 01 Dec, 2023 1 commit
  25. 30 Nov, 2023 1 commit
    • Add SeamlessM4T v2 (#27779) · 29f1aee3
      Yoach Lacombe authored
      
      
      * add working conversion script
      
      * first non-working version of modeling code
      
      * update modeling code (working)
      
      * make style
      
      * make fix-copies
      
      * add config docstrings
      
      * add config to ignore docstrings formatting due to unconventional markdown
      
      * fix copies
      
      * fix generation num_return_sequences
      
      * enrich docs
      
      * add and fix tests beside integration tests
      
      * update integration tests
      
      * update repo id
      
      * add tie weights and make style
      
      * correct naming in .md
      
      * fix imports and so on
      
      * correct docstrings
      
      * fix fp16 speech forward
      
      * fix speechencoder attention
      
      * make style
      
      * fix copied from
      
      * rename SeamlessM4Tv2-v2 to SeamlessM4Tv2
      
      * Apply suggestions on configuration
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * remove useless public models
      
      * fix private models + better naming for T2U models
      
      * clean speech encoder relative position embeddings
      
      * refactor chunk attention
      
      * add docstrings to chunk attention method
      
      * improve naming and docstrings
      
      * rename some attention variables + add temperature sampling in T2U model
      
      * rename DOCSTRINGS variable names
      
      * make style + remove 2 useless config parameters
      
      * enrich model card
      
      * remove any attention_head reference + fix temperature in T2U
      
      * new fmt and make style
      
      * Apply suggestions from code review
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * rename spkr_id->speaker_id and change docstrings of get_char_input_ids
      
      * simplify v2attention
      
      * make style
      
      * Update seamless_m4t_v2.md
      
      * update code and tests with last update
      
      * update repo ids
      
      * fill article name, abstract and authors
      
      * update not_doctested and slow_doc tests
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      29f1aee3
  26. 29 Nov, 2023 1 commit
    • [Time series] Add patchtst (#27581) · af8acc47
      Kashif Rasul authored
      
      
      * add distribution head to forecasting
      
      * formatting
      
      * Add generate function for forecasting
      
      * Add generate function to prediction task
      
      * formatting
      
      * use argsort
      
      * add past_observed_mask ordering
      
      * fix arguments
      
      * docs
      
      * add back test_model_outputs_equivalence test
      
      * formatting
      
      * cleanup
      
      * formatting
      
      * use ACT2CLS
      
      * formatting
      
      * fix add_start_docstrings decorator
      
      * add distribution head and generate function to regression task
      
      Add distribution head and generate function to regression task. Also add PatchTSTForForecastingOutput and PatchTSTForRegressionOutput.
      
      * add distribution head and generate function to regression task
      
      Add distribution head and generate function to regression task. Also add PatchTSTForForecastingOutput and PatchTSTForRegressionOutput.
      
      * fix typos
      
      * add forecast_masking
      
      * fixed tests
      
      * use set_seed
      
      * fix doc test
      
      * formatting
      
      * Update docs/source/en/model_doc/patchtst.md
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * better var names
      
      * rename PatchTSTTranspose
      
      * fix argument names and docs string
      
      * remove compute_num_patches and unused class
      
      * remove assert
      
      * renamed to PatchTSTMasking
      
      * use num_labels for classification
      
      * use num_labels
      
      * use default num_labels from super class
      
      * move model_type after docstring
      
      * renamed PatchTSTForMaskPretraining
      
      * bs -> batch_size
      
      * more review fixes
      
      * use hidden_state
      
      * rename encoder layer and block class
      
      * remove commented seed_number
      
      * edit docstring
      
      * Add docstring
      
      * formatting
      
      * use past_observed_mask
      
      * doc suggestion
      
      * make fix-copies
      
      * use Args:
      
      * add docstring
      
      * add docstring
      
      * change some variable names and add PatchTST before some class names
      
      * formatting
      
      * fix argument types
      
      * fix tests
      
      * change x variable to patch_input
      
      * format
      
      * formatting
      
      * fix-copies
      
      * Update tests/models/patchtst/test_modeling_patchtst.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * move loss to forward
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * formatting
      
      * fix a bug when pre_norm is set to True
      
      * output_hidden_states is set to False as default
      
      * set pre_norm=True as default
      
      * format docstring
      
      * format
      
      * output_hidden_states is None by default
      
      * add missing docs
      
      * better var names
      
      * docstring: remove default to False in output_hidden_states
      
      * change labels name to target_values in regression task
      
      * format
      
      * fix tests
      
      * change to forecast_mask_ratios and random_mask_ratio
      
      * change mask names
      
      * change future_values to target_values param in the prediction class
      
      * remove nn.Sequential and make PatchTSTBatchNorm class
      
      * black
      
      * fix argument name for prediction
      
      * add output_attentions option
      
      * add output_attentions to PatchTSTEncoder
      
      * formatting
      
      * Add attention output option to all classes
      
      * Remove PatchTSTEncoderBlock
      
      * create PatchTSTEmbedding class
      
      * use config in PatchTSTPatchify
      
      * Use config in PatchTSTMasking class
      
      * add channel_attn_weights
      
      * Add PatchTSTScaler class
      
      * add output_attentions arg to test function
      
      * format
      
      * Update doc with image patchtst.md
      
      * fix-copies
      
      * rename Forecast <-> Prediction
      
      * change name of a few parameters to match with PatchTSMixer.
      
      * Remove *ForForecasting class to match with other time series models.
      
      * make style
      
      * Remove PatchTSTForForecasting in the test
      
      * remove PatchTSTForForecastingOutput class
      
      * change test_forecast_head to test_prediction_head
      
      * style
      
      * fix docs
      
      * fix tests
      
      * change num_labels to num_targets
      
      * Remove PatchTSTTranspose
      
      * remove arguments in PatchTSTMeanScaler
      
      * remove arguments in PatchTSTStdScaler
      
      * add config as an argument to all the scaler classes
      
      * reformat
      
      * Add norm_eps for batchnorm and layernorm
      
      * reformat.
      
      * reformat
      
      * edit docstring
      
      * update docstring
      
      * change variable name pooling to pooling_type
      
      * fix output_hidden_states as tuple
      
      * fix bug when calling PatchTSTBatchNorm
      
      * change stride to patch_stride
      
      * create PatchTSTPositionalEncoding class and restructure the PatchTSTEncoder
      
      * formatting
      
      * initialize scalers with configs
      
      * edit output_hidden_states
      
      * style
      
      * fix forecast_mask_patches doc string
      
      * doc improvements
      
      * move summary to the start
      
      * typo
      
      * fix docstring
      
      * turn off masking when using prediction, regression, classification
      
      * return scaled output
      
      * adjust output when using distribution head
      
      * remove _num_patches function in the config
      
      * get config.num_patches from patchifier init
      
      * add output_attentions docstring, remove tuple in output_hidden_states
      
      * change SamplePatchTSTPredictionOutput and SamplePatchTSTRegressionOutput to SamplePatchTSTOutput
      
      * remove print("model_class: ", model_class)
      
      * change encoder_attention_heads to num_attention_heads
      
      * change norm to norm_layer
      
      * change encoder_layers to num_hidden_layers
      
      * change shared_embedding to share_embedding, shared_projection to share_projection
      
      * add output_attentions
      
      * more robust check of norm_type
      
      * change dropout_path to path_dropout
      
      * edit docstring
      
      * remove positional_encoding function and add _init_pe in PatchTSTPositionalEncoding
      
      * edit shape of cls_token and initialize it
      
      * add a check on the num_input_channels.
      
      * edit head_dim in the Prediction class to allow the use of cls_token
      
      * remove some positional_encoding_type options, remove learn_pe arg, initialize pe
      
      * change Exception to ValueError
      
      * format
      
      * norm_type is "batchnorm"
      
      * make style
      
      * change cls_token shape
      
      * Change forecast_mask_patches to num_mask_patches. Remove forecast_mask_ratios.
      
      * Bring PatchTSTClassificationHead on top of PatchTSTForClassification
      
      * change encoder_ffn_dim to ffn_dim and edit the docstring.
      
      * update variable names to match with the config
      
      * add generation tests
      
      * change num_mask_patches to num_forecast_mask_patches
      
      * Add examples explaining the use of these models
      
      * make style
      
      * Revert "Revert "[time series] Add PatchTST (#25927)" (#27486)"
      
      This reverts commit 78f6ed6c.
      
      * make style
      
      * fix default std scaler's minimum_scale
      
      * fix docstring
      
      * close code blocks
      
      * Update docs/source/en/model_doc/patchtst.md
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update tests/models/patchtst/test_modeling_patchtst.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/patchtst/configuration_patchtst.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * fix tests
      
      * add add_start_docstrings
      
      * move examples to the forward's docstrings
      
      * update prepare_batch
      
      * update test
      
      * fix test_prediction_head
      
      * fix generation test
      
      * use seed to create generator
      
      * add output_hidden_states and config.num_patches
      
      * add loc and scale args in PatchTSTForPredictionOutput
      
      * edit outputs if not return_dict
      
      * use self.share_embedding to check instead checking type.
      
      * remove seed
      
      * make style
      
      * seed is an optional int
      
      * fix test
      
      * generator device
      
      * Fix assertTrue test
      
      * swap order of items in outputs when return_dict=False.
      
      * add mask_type and random_mask_ratio to unittest
      
      * Update modeling_patchtst.py
      
      * add add_start_docstrings for regression model
      
      * make style
      
      * update model path
      
      * Edit the ValueError comment in forecast_masking
      
      * update examples
      
      * make style
      
      * fix commented code
      
      * update examples: remove config from from_pretrained call
      
      * Edit example outputs
      
      * Set default target_values to None
      
      * remove config setting in regression example
      
      * Update configuration_patchtst.py
      
      * Update configuration_patchtst.py
      
      * remove config from examples
      
      * change default d_model and ffn_dim
      
      * norm_eps default
      
      * set has_attentions to True and define self.seq_length = self.num_patche
      
      * update docstring
      
      * change variable mask_input to do_mask_input
      
      * fix blank space.
      
      * change logger.debug to logger.warning.
      
      * remove unused PATCHTST_INPUTS_DOCSTRING
      
      * remove all_generative_model_classes
      
      * set test_missing_keys=True
      
      * remove undefined params in the docstring.
      
      ---------
      Co-authored-by: nnguyen <nnguyen@us.ibm.com>
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Nam Nguyen <namctin@gmail.com>
      Co-authored-by: Wesley Gifford <79663411+wgifford@users.noreply.github.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      af8acc47
  27. 28 Nov, 2023 1 commit
  28. 22 Nov, 2023 1 commit
    • Add UnivNet Vocoder Model for Tortoise TTS Diffusers Integration (#24799) · 7f6a804d
      dg845 authored
      * initial commit
      
      * Add initial testing files and modify __init__ files to add UnivNet imports.
      
      * Fix some bugs
      
      * Add checkpoint conversion script and add references to transformers pre-trained model.
      
      * Add UnivNet entries for auto.
      
      * Add initial docs for UnivNet.
      
      * Handle input and output shapes in UnivNetGan.forward and add initial docstrings.
      
      * Write tests and make them pass.
      
      * Write docs.
      
      * Add UnivNet doc to _toctree.yml and improve docs.
      
      * fix typo
      
      * make fixup
      
      * make fix-copies
      
      * Add upsample_rates parameter to config and improve config documentation.
      
      * make fixup
      
      * make fix-copies
      
      * Remove unused upsample_rates config parameter.
      
      * apply suggestions from review
      
      * make style
      
      * Verify and add reason for skipped tests inherited from ModelTesterMixin.
      
      * Add initial UnivNetGan integration tests
      
      * make style
      
      * Remove noise_length input to UnivNetGan and improve integration tests.
      
      * Fix bug and make style
      
      * Make UnivNet integration tests pass
      
      * Add initial code for UnivNetFeatureExtractor.
      
      * make style
      
      * Add initial tests for UnivNetFeatureExtractor.
      
      * make style
      
      * Properly initialize weights for UnivNetGan
      
      * Get feature extractor fast tests passing
      
      * make style
      
      * Get feature extractor integration tests passing
      
      * Get UnivNet integration tests passing
      
      * make style
      
      * Add UnivNetGan usage example
      
      * make style and use feature extractor from hub in integration tests
      
      * Update tips in docs
      
      * apply suggestions from review
      
      * make style
      
      * Calculate padding directly instead of using get_padding methods.
      
      * Update UnivNetFeatureExtractor.to_dict to be UnivNet-specific.
      
      * Update feature extractor to support using model(**inputs) and add the ability to generate noise and pad the end of the spectrogram in __call__.
      
      * Perform padding before generating noise to ensure the shapes are correct.
      
      * Rename UnivNetGan.forward's noise_waveform argument to noise_sequence.
      
      * make style
      
      * Add tests to test generating noise and padding the end for UnivNetFeatureExtractor.__call__.
      
      * Add tests for checking batched vs unbatched inputs for UnivNet feature extractor and model.
      
      * Add expected mean and stddev checks to the integration tests and make them pass.
      
      * make style
      
      * Make it possible to use model(**inputs), where inputs is the output of the feature extractor.
      
      * fix typo in UnivNetGanConfig example
      
      * Calculate spectrogram_zero from other config values.
      
      * apply suggestions from review
      
      * make style
      
      * Refactor UnivNet conversion script to use load_state_dict (following persimmon).
      
      * Rename UnivNetFeatureExtractor to UnivNetGanFeatureExtractor.
      
      * make style
      
      * Switch to using torch.tensor and torch.testing.assert_close for testing expected values/slices.
      
      * make style
      
      * Use config in UnivNetGan modeling blocks.
      
      * make style
      
      * Rename the spectrogram argument of UnivNetGan.forward to input_features, following Whisper.
      
      * make style
      
      * Improving padding documentation.
      
      * Add UnivNet usage example to the docs.
      
      * apply suggestions from review
      
      * Move dynamic_range_compression computation into the mel_spectrogram method of the feature extractor.
      
      * Improve UnivNetGan.forward return docstring.
      
      * Update table in docs/source/en/index.md.
      
      * make fix-copies
      
      * Rename UnivNet components to have pattern UnivNet*.
      
      * make style
      
      * make fix-copies
      
      * Update docs
      
      * make style
      
      * Increase tolerance on flaky unbatched integration test.
      
      * Remove torch.no_grad decorators from UnivNet integration tests to try to avoid flax/Tensorflow test errors.
      
      * Add padding_mask argument to UnivNetModel.forward and add batch_decode feature extractor method to remove padding.
      
      * Update documentation and clean up padding code.
      
      * make style
      
      * make style
      
      * Remove torch dependency from UnivNetFeatureExtractor.
      
      * make style
      
      * Fix UnivNetModel usage example
      
      * Clean up feature extractor code/docstrings.
      
      * apply suggestions from review
      
      * make style
      
      * Add comments for tests skipped via ModelTesterMixin flags.
      
      * Add comment for model parallel tests skipped via the test_model_parallel ModelTesterMixin flag.
      
      * Add # Copied from statements to copied UnivNetFeatureExtractionTest tests.
      
      * Simplify UnivNetFeatureExtractorTest.test_batch_decode.
      
      * Add support for unbatched padding_masks in UnivNetModel.forward.
      
      * Refactor unbatched padding_mask support.
      
      * make style
      7f6a804d
  29. 21 Nov, 2023 1 commit
    • jiqing-feng's avatar
      TVP model (#25856) · c770600f
      jiqing-feng authored
      * tvp model for video grounding
      
      add tokenizer auto
      
      fix param in TVPProcessor
      
      add docs
      
      clear comments and enable different torch dtype
      
      add image processor test and model test and fix code style
      
      * fix conflict
      
      * fix model doc
      
      * fix image processing tests
      
      * fix tvp tests
      
      * remove torch in processor
      
      * fix grammar error
      
      * add more details on tvp.md
      
      * fix model arch for loss, grammar, and processor
      
      * add docstring and do not regard TvpTransformer, TvpVisionModel as individual model
      
      * use pad_image
      
      * update copyright
      
      * control first downsample stride
      
      * reduce first only works for ResNetBottleNeckLayer
      
      * fix param name
      
      * fix style
      
      * add testing
      
      * fix style
      
      * rm init_weight
      
      * fix style
      
      * add post init
      
      * fix comments
      
      * do not test TvpTransformer
      
      * fix warning
      
      * fix style
      
      * fix example
      
      * fix config map
      
      * add link in config
      
      * fix comments
      
      * fix style
      
      * rm useless param
      
      * change attention
      
      * change test
      
      * add notes
      
      * fix comments
      
      * fix tvp
      
      * import checkpointing
      
      * fix gradient checkpointing
      
      * Use a more accurate example in readme
      
      * update
      
      * fix copy
      
      * fix style
      
      * update readme
      
      * delete print
      
      * remove tvp test_forward_signature
      
      * remove TvpTransformer
      
      * fix test init model
      
      * merge main and make style
      
      * fix tests and others
      
      * fix image processor
      
      * fix style and model_input_names
      
      * fix tests
      c770600f
  30. 14 Nov, 2023 1 commit
  31. 13 Nov, 2023 1 commit
    • Gift Sinthong's avatar
      [time series] Add PatchTST (#25927) · 2ac5b932
      Gift Sinthong authored
      
      
      * Initial commit of PatchTST model classes
      Co-authored-by: Phanwadee Sinthong <phsinthong@gmail.com>
      Co-authored-by: Nam Nguyen <namctin@gmail.com>
      Co-authored-by: Vijay Ekambaram <vijaykr.e@gmail.com>
      Co-authored-by: Ngoc Diep Do <55230119+diepi@users.noreply.github.com>
      Co-authored-by: Wesley Gifford <79663411+wgifford@users.noreply.github.com>
      
      * Add PatchTSTForPretraining
      
      * update to include classification
      Co-authored-by: Phanwadee Sinthong <phsinthong@gmail.com>
      Co-authored-by: Nam Nguyen <namctin@gmail.com>
      Co-authored-by: Vijay Ekambaram <vijaykr.e@gmail.com>
      Co-authored-by: Ngoc Diep Do <55230119+diepi@users.noreply.github.com>
      Co-authored-by: Wesley Gifford <79663411+wgifford@users.noreply.github.com>
      
      * clean up auto files
      
      * Add PatchTSTForPrediction
      
      * Fix relative import
      
      * Replace original PatchTSTEncoder with ChannelAttentionPatchTSTEncoder
      
      * temporary adding absolute path + add PatchTSTForForecasting class
      
      * Update base PatchTSTModel + Unittest
      
      * Update ForecastHead to use the config class
      
      * edit cv_random_masking, add mask to model output
      
      * Update configuration_patchtst.py
      
      * add masked_loss to the pretraining
      
      * add PatchEmbeddings
      
      * Update configuration_patchtst.py
      
      * edit loss which considers mask in the pretraining
      
      * remove patch_last option
      
      * Add commits from internal repo
      
      * Update ForecastHead
      
      * Add model weight initialization + unittest
      
      * Update PatchTST unittest to use local import
      
      * PatchTST integration tests for pretraining and prediction
      
      * Added PatchTSTForRegression + update unittest to include label generation
      
      * Revert unrelated model test file
      
      * Combine similar output classes
      
      * update PredictionHead
      
      * Update configuration_patchtst.py
      
      * Add Revin
      
      * small edit to PatchTSTModelOutputWithNoAttention
      
      * Update modeling_patchtst.py
      
      * Updating integration test for forecasting
      
      * Fix unittest after class structure changed
      
      * docstring updates
      
      * change input_size to num_input_channels
      
      * more formatting
      
      * Remove some unused params
      
      * Add a comment for pretrained models
      
      * add channel_attention option
      
      add channel_attention option and remove unused positional encoders.
      
      * Update PatchTST models to use HF's MultiHeadAttention module
      
      * Update paper + github urls
      
      * Fix hidden_state return value
      
      * Update integration test to use PatchTSTForForecasting
      
      * Adding dataclass decorator for model output classes
      
      * Run fixup script
      
      * Rename model repos for integration test
      
      * edit argument explanation
      
      * change individual option to shared_projection
      
      * style
      
      * Rename integration test + import cleanup
      
      * Fix output_hidden_states return value
      
      * removed unused mode
      
      * added std, mean and nops scaler
      
      * add initial distributional loss for prediction
      
      * fix typo in docs
      
      * add generate function
      
      * formatting
      
      * add num_parallel_samples
      
      * Fix a typo
      
      * copy weighted_average function, edit PredictionHead
      
      * edit PredictionHead
      
      * add distribution head to forecasting
      
      * formatting
      
      * Add generate function for forecasting
      
      * Add generate function to prediction task
      
      * formatting
      
      * use argsort
      
      * add past_observed_mask ordering
      
      * fix arguments
      
      * docs
      
      * add back test_model_outputs_equivalence test
      
      * formatting
      
      * cleanup
      
      * formatting
      
      * use ACT2CLS
      
      * formatting
      
      * fix add_start_docstrings decorator
      
      * add distribution head and generate function to regression task
      
      add distribution head and generate function to regression task. Also add PatchTSTForForecastingOutput and PatchTSTForRegressionOutput.
      
      * add distribution head and generate function to regression task
      
      add distribution head and generate function to regression task. Also add PatchTSTForForecastingOutput and PatchTSTForRegressionOutput.
      
      * fix typos
      
      * add forecast_masking
      
      * fixed tests
      
      * use set_seed
      
      * fix doc test
      
      * formatting
      
      * Update docs/source/en/model_doc/patchtst.md
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * better var names
      
      * rename PatchTSTTranspose
      
      * fix argument names and docs string
      
      * remove compute_num_patches and unused class
      
      * remove assert
      
      * renamed to PatchTSTMasking
      
      * use num_labels for classification
      
      * use num_labels
      
      * use default num_labels from super class
      
      * move model_type after docstring
      
      * renamed PatchTSTForMaskPretraining
      
      * bs -> batch_size
      
      * more review fixes
      
      * use hidden_state
      
      * rename encoder layer and block class
      
      * remove commented seed_number
      
      * edit docstring
      
      * Add docstring
      
      * formatting
      
      * use past_observed_mask
      
      * doc suggestion
      
      * make fix-copies
      
      * use Args:
      
      * add docstring
      
      * add docstring
      
      * change some variable names and add PatchTST before some class names
      
      * formatting
      
      * fix argument types
      
      * fix tests
      
      * change x variable to patch_input
      
      * format
      
      * formatting
      
      * fix-copies
      
      * Update tests/models/patchtst/test_modeling_patchtst.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * move loss to forward
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * Update src/transformers/models/patchtst/modeling_patchtst.py
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      
      * formatting
      
      * fix a bug when pre_norm is set to True
      
      * output_hidden_states is set to False as default
      
      * set pre_norm=True as default
      
      * format docstring
      
      * format
      
      * output_hidden_states is None by default
      
      * add missing docs
      
      * better var names
      
      * docstring: remove default to False in output_hidden_states
      
      * change labels name to target_values in regression task
      
      * format
      
      * fix tests
      
      * change to forecast_mask_ratios and random_mask_ratio
      
      * change mask names
      
      * change future_values to target_values param in the prediction class
      
      * remove nn.Sequential and make PatchTSTBatchNorm class
      
      * black
      
      * fix argument name for prediction
      
      * add output_attentions option
      
      * add output_attentions to PatchTSTEncoder
      
      * formatting
      
      * Add attention output option to all classes
      
      * Remove PatchTSTEncoderBlock
      
      * create PatchTSTEmbedding class
      
      * use config in PatchTSTPatchify
      
      * Use config in PatchTSTMasking class
      
      * add channel_attn_weights
      
      * Add PatchTSTScaler class
      
      * add output_attentions arg to test function
      
      * format
      
      * Update doc with image patchtst.md
      
      * fix-copies
      
      * rename Forecast <-> Prediction
      
      * change name of a few parameters to match with PatchTSMixer.
      
      * Remove *ForForecasting class to match with other time series models.
      
      * make style
      
      * Remove PatchTSTForForecasting in the test
      
      * remove PatchTSTForForecastingOutput class
      
      * change test_forecast_head to test_prediction_head
      
      * style
      
      * fix docs
      
      * fix tests
      
      * change num_labels to num_targets
      
      * Remove PatchTSTTranspose
      
      * remove arguments in PatchTSTMeanScaler
      
      * remove arguments in PatchTSTStdScaler
      
      * add config as an argument to all the scaler classes
      
      * reformat
      
      * Add norm_eps for batchnorm and layernorm
      
      * reformat.
      
      * reformat
      
      * edit docstring
      
      * update docstring
      
      * change variable name pooling to pooling_type
      
      * fix output_hidden_states as tuple
      
      * fix bug when calling PatchTSTBatchNorm
      
      * change stride to patch_stride
      
      * create PatchTSTPositionalEncoding class and restructure the PatchTSTEncoder
      
      * formatting
      
      * initialize scalers with configs
      
      * edit output_hidden_states
      
      * style
      
      * fix forecast_mask_patches doc string
      
      ---------
      Co-authored-by: Gift Sinthong <gift.sinthong@ibm.com>
      Co-authored-by: Nam Nguyen <namctin@gmail.com>
      Co-authored-by: Vijay Ekambaram <vijaykr.e@gmail.com>
      Co-authored-by: Ngoc Diep Do <55230119+diepi@users.noreply.github.com>
      Co-authored-by: Wesley Gifford <79663411+wgifford@users.noreply.github.com>
      Co-authored-by: Wesley M. Gifford <wmgifford@us.ibm.com>
      Co-authored-by: nnguyen <nnguyen@us.ibm.com>
      Co-authored-by: Ngoc Diep Do <diiepy@gmail.com>
      Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
      Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      2ac5b932
  32. 10 Nov, 2023 2 commits
    • Susnato Dhar's avatar
      Add Phi-1 and Phi-1_5 (#26170) · e1c3ac25
      Susnato Dhar authored
      * only dir not even init
      
      * init
      
      * tokenizer removed and reference of codegen added
      
      * modeling file updated a lot remaining app_rotary_emb
      
      * conversion script done
      
      * conversion script fixed, a lot of factoring done and most tests pass
      
      * added token_clf and extractive_QA_head
      
      * integration tests pass
      
      * flash attn tests pass!
      
      * config done
      
      * more docs in modeling file
      
      * some style fix
      
      * style and others
      
      * doc test error fix
      
      * more doc fix
      
      * some attention fixes
      
      * most fixes
      
      * style and other fixes
      
      * docs fix and config
      
      * doc fix
      
      * some comments
      
      * conversion script updated
      
      * conversion script updated
      
      * Revert "conversion script updated"
      
      This reverts commit e92378c54084ec0747041b113083d1746ecb6c7f.
      
      * final comments
      
      * add Phi to language_modeling.md
      
      * edit phi.md file
      
      * rebase and fix
      
      * removed phi-1.5 example
      
      * changed model_type from 'phi'->'mixformer-sequential'
      
      * small change
      
      * small change
      
      * revert small change
      
      * changed mixformer-sequential->phi
      
      * small change
      
      * added phi-1.5 example instead of phi-1
      
      * doc test might pass now
      
      * rebase and small change
      
      * added the dropout layer
      
      * more fixes
      
      * modified .md file
      
      * very very small doc change
      e1c3ac25
    • Susnato Dhar's avatar
      Add CLVP (#24745) · 7e9f10ac
      Susnato Dhar authored
      * init commit
      
      * attention arch done except rotary emb
      
      * rotary emb done
      
      * text encoder working
      
      * outputs matching
      
      * arch first pass done
      
      * make commands done, tests and docs remaining
      
      * all tests passed, only docs remaining
      
      * docs done
      
      * doc-builder fix
      
      * convert script removed (not relevant)
      
      * minor comments done
      
      * added ckpt conversion script
      
      * tokenizer done
      
      * very minor fix of index.md 2
      
      * mostly make fixup related
      
      * all done except fe and rotary emb
      
      * very small change
      
      * removed unidecode dependency
      
      * style changes
      
      * tokenizer removed require_backends
      
      * added require_inflect to tokenizer tests
      
      * removed VOCAB_FILES in tokenizer test
      
      * inflect dependency removed
      
      * added rotary pos emb cache and simplified the apply method
      
      * style
      
      * little doc change
      
      * more comments
      
      * feature extractor added
      
      * added processor
      
      * auto-regressive config added
      
      * added CLVPConditioningEncoder
      
      * comments done except the test one
      
      * weights added successfully (NOT tested)
      
      * tokenizer fix with numbers
      
      * generate outputs matching
      
      * almost all tests passing; integration tests not written
      
      * Integ tests added
      
      * major CUDA error fixed
      
      * docs done
      
      * rebase and multiple fixes
      
      * fixed rebase overwrites
      
      * generate code simplified and tests for AutoRegressive model added
      
      * minor changes
      
      * refactored gpt2 code in clvp file
      
      * weights done and all code refactored
      
      * mostly done except the fast_tokenizer
      
      * doc test fix
      
      * config file's doc fixes
      
      * more config fix
      
      * more comments
      
      * tokenizer comments mostly done
      
      * modeling file mostly refactored and can load modules
      
      * ClvpEncoder tested
      
      * ClvpDecoder, ClvpModel and ClvpForCausalLM tested
      
      * integration and all tests passed
      
      * more fixes
      
      * docs almost done
      
      * ckpt conversion refactored
      
      * style and some failing tests fix
      
      * comments
      
      * temporary output fix but test_assisted_decoding_matches_greedy_search test fails
      
      * majority changes done
      
      * use_cache outputs same now! Along with the assisted_greedy_decoding test fix
      
      * more comments
      
      * more comments
      
      * prepare_inputs_for_generation fixed and _prepare_model_inputs added
      
      * style fix
      
      * clvp.md change
      
      * moved clvpconditionalencoder norms
      
      * add model to new index
      
      * added tokenizer input_ids_with_special_tokens
      
      * small fix
      
      * config mostly done
      
      * added config-tester and changed conversion script
      
      * more comments
      
      * comments
      
      * style fix
      
      * some comments
      
      * tokenizer changed back to prev state
      
      * small comments
      
      * added output hidden states for the main model
      
      * style fix
      
      * comments
      
      * small change
      
      * revert small change
      
      * .
      
      * Update clvp.md
      
      * Update test_modeling_clvp.py
      
      * :)
      
      * some minor change
      
      * new fixes
      
      * remove to_dict from FE
      7e9f10ac
  33. 02 Nov, 2023 1 commit
  34. 31 Oct, 2023 1 commit
  35. 30 Oct, 2023 1 commit