1. 12 Oct, 2022 1 commit
    • NielsRogge's avatar
      Add LiLT (#19450) · 4d367a3c
      NielsRogge authored
      
      
      * First draft
      
      * Fix more things
      
      * Improve more things
      
      * Remove some head models
      
      * Fix more things
      
      * Add missing layers
      
      * Remove tokenizer
      
      * Fix more things
      
      * Fix copied from statements
      
      * Make all tests pass
      
      * Remove print statements
      
      * Remove files
      
      * Fix README and docs
      
      * Add integration test and fix organization
      
      * Add tips
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Make tests faster, improve docs
      
      * Fix doc tests
      
      * Add model to toctree
      
      * Add docs
      
      * Add note about creating new checkpoint
      
      * Remove is_decoder
      
      * Make tests smaller, add docs
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      4d367a3c
  2. 11 Oct, 2022 1 commit
  3. 10 Oct, 2022 2 commits
    • Lysandre's avatar
      Dev version · 10100979
      Lysandre authored
      10100979
    • amyeroberts's avatar
      Add TF whisper (#19378) · e3f028f3
      amyeroberts authored
      
      
      * simplify loop
      
      * add featur extractor
      
      * add model
      
      * start conversion
      
      * add dropout
      
      * initial commit of test files
      
      * copnversion for all models
      
      * update processor for correct padding
      
      * update feature extraction
      
      * update integration test logits match
      
      * fmnt: off for the logits
      
      * on the fly mel bank
      
      * small nit
      
      * update test
      
      * update tokenizer
      
      * nit feature extraction
      
      * update
      
      * update tokenizer test
      
      * adds logit processor and update tokenizer to get supress tokens
      
      * style
      
      * clean convert
      
      * revert to original modeling tf utils
      
      * Update
      
      * update
      
      * nit
      
      * clean convert file
      
      * update tests and nits
      
      * quality
      
      * slow generation test
      
      * ffn_dim to allow customization
      
      * update readme
      
      * add to toctreee
      
      * start fixing integration tests
      
      * update tests and code
      
      * fix feature extractor
      
      * fix config tests common
      
      * update code to fix tests
      
      * fix feature exctractor
      
      * nit feature extraction
      
      * update test for new feature extractor
      
      * style
      
      * add absrtact
      
      * large logits wioth custom decoder input ids
      
      * wraap around is otrch available
      
      * fix feature extractor
      
      * correct logits for whisper small.en
      
      * nit
      
      * fix encoder_attentino_mask
      
      * some fixes
      
      * remove unnecessary inputs
      
      * nits
      
      * add normalizer file
      
      * update etst tokenization
      
      * fix attention mask not defined
      
      * fix generate
      
      * remove uncoder attention mask useless
      
      * update test modeling whisper
      
      * update condfig to add second non supress tokens
      
      * nits on feature exrtactor
      
      * nit for test tokenizers
      
      * update etsts
      
      * update tests
      
      * update tokenization test
      
      * fixup
      
      * invalidated hf token. Clean convert openai to whisper
      
      * fix logit tests
      
      * fixup
      
      * Add model to README
      
      * Fix doc tests
      
      * clean merge
      
      * revert toc_tree changes
      
      * remove useless LogitProcessor
      
      * Update whisper .mdx
      
      * update config file doc
      
      * update configuration docstring
      
      * update test tokenization
      
      * update test tokenization
      
      * update tokenization whisper
      Added copied from where needed
      
      * update feature extraction
      
      * nit test name
      
      * style
      
      * quality
      
      * remove get suppress tokens and update non_speech tokens global variables
      
      * Update src/transformers/models/whisper/feature_extraction_whisper.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * clean modeling whisper and test
      Removed the attention mask arguments that are deprecated
      
      * fix large test
      
      * Add multilingual audio test, and translate test
      
      * style
      
      * fix larg multilingual test
      
      * nits
      
      * add copied from for attention layer
      
      * remove attention masks in doc
      
      * add english normalizer
      
      * Update docs/source/en/model_doc/whisper.mdx
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * update tokenization test
      
      * remove copied from in whisper attention : no bias in k_proj only
      
      * wrap around dependencies in english normalizer
      
      * style
      
      * correct import generation logits
      
      * for now, wrap feature extractor with torch
      
      * remove torch depencies for feature extraction and style
      
      * Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/whisper/configuration_whisper.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/whisper.mdx
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * fixup
      
      * nit
      
      * update logitds
      
      * style
      
      * nit
      
      * nits and fix final tests
      
      * add `is_more_itertools_available` to utils
      
      * quality
      
      * add begin supress tokens, supress tokens to generate args and config
      
      * clean supressTokensLogitProcessor in generation logits
      
      * Nit naming
      
      * add supressTokensAtBegin
      
      * udpate tests, supress tokens to None or correct values
      
      * nit and style
      
      * update RAG to fit test and generate_logit
      
      * add copy pasted statment on english normalizer
      
      * add arguments to config_common_kwargs
      
      * Update src/transformers/generation_utils.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/generation_logits_process.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * revert changes based on reviews
      
      * update doc and nits
      
      * Update src/transformers/models/whisper/configuration_whisper.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * more nits
      
      * last nits
      
      * update test configuration common
      
      * add BART name in decoder attention mask documentation
      
      * Update src/transformers/models/whisper/modeling_whisper.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * style
      
      * nit
      
      * nit
      
      * add english.json file to git
      
      * nits on documentation
      
      * nit
      
      * nits
      
      * last styling
      
      * add main toctree file
      
      * remove sentence piece dependency
      
      * clean init file
      
      * fix tokenizer that has no dependencies on sentencepiece
      
      * update whisper init file, nit
      
      * remove english.json file
      
      * add get decoder prompt id
      
      * All weights loading
      
      * Remove hanging pdb
      
      * Fixup and tidy up
      
      * Use same copied from as PT model
      
      * Remove whitespace changes
      
      * Remove torch references
      
      * Tie embeddings
      
      * Remove logits processor input to generate
      
      * Update logit values
      
      * revert changes and add forced logit processor
      
      * nit
      
      * clean normalizer
      
      * remove protected
      
      * Add logit processors and update generation code & tests
      
      * Some tidy up
      
      * Update docstring
      
      * update
      
      * update based on review
      
      * Update src/transformers/models/whisper/configuration_whisper.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/whisper/configuration_whisper.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update to reflect changes on the PT model branch
      
      * Tidy up
      
      * Remove extra whitespace
      
      * Fix test - make input ids small enough we can append
      
      * Include upstream changes on main
      
      * PR comments - add batch tests, remove comments & defaults
      
      * Fix model output imports
      
      * Update src/transformers/models/whisper/modeling_tf_whisper.py
      Co-authored-by: default avatarJoao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update src/transformers/generation_tf_logits_process.py
      Co-authored-by: default avatarJoao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update src/transformers/models/whisper/modeling_tf_whisper.py
      Co-authored-by: default avatarJoao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update src/transformers/models/whisper/modeling_tf_whisper.py
      Co-authored-by: default avatarJoao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update tests/models/whisper/test_modeling_tf_whisper.py
      Co-authored-by: default avatarJoao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update src/transformers/models/whisper/modeling_tf_whisper.py
      Co-authored-by: default avatarJoao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update src/transformers/models/whisper/modeling_tf_whisper.py
      Co-authored-by: default avatarJoao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update docstring example
      
      * Update src/transformers/models/whisper/modeling_tf_whisper.py
      Co-authored-by: default avatarMatt <Rocketknight1@users.noreply.github.com>
      
      * Remove changes to adjust_logits_during_generation function
      
      * Update src/transformers/models/whisper/modeling_tf_whisper.py
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * Tidy up imports that don't require TF
      
      * Update tests - skip and no more skip
      
      * Update tests/generation/test_generation_tf_logits_process.py
      Co-authored-by: default avatarJoao Gante <joaofranciscocardosogante@gmail.com>
      
      * Update src/transformers/models/whisper/modeling_tf_whisper.py
      
      * Update src/transformers/models/whisper/modeling_tf_whisper.py
      Co-authored-by: default avatarMatt <Rocketknight1@users.noreply.github.com>
      
      * Add training flags
      
      * Add (skipped) XLA generation tests
      
      * Add embedding correctness test
      
      * Add constant ids for generation tests
      
      * Make logits finding a bit tidier
      
      * Remove unused args
      
      * xla generation enabled
      
      * Don't skip XLA tests anymore
      
      * Fix tests - add position ids to expected signature and update rag generation
      
      * Undo method reorder
      
      * Remove added whitespace
      
      * Remove copy-paste gradient checkopint ref
      
      * Remove
      
      * Trigger CI - (issue with refs when pulling)
      Co-authored-by: default avatarArthur Zucker <arthur.zucker@gmail.com>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarNielsRogge <niels.rogge1@gmail.com>
      Co-authored-by: default avatarArthur <48595927+ArthurZucker@users.noreply.github.com>
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avatarJoao Gante <joaofranciscocardosogante@gmail.com>
      Co-authored-by: default avatarMatt <Rocketknight1@users.noreply.github.com>
      Co-authored-by: default avatarJoao Gante <joao@huggingface.co>
      e3f028f3
  4. 05 Oct, 2022 1 commit
    • Arthur's avatar
      Add WhisperModel to transformers (#19166) · 45e14038
      Arthur authored
      
      
      * simplify loop
      
      * add featur extractor
      
      * add model
      
      * start conversion
      
      * add dropout
      
      * initial commit of test files
      
      * copnversion for all models
      
      * update processor for correct padding
      
      * update feature extraction
      
      * update integration test logits match
      
      * fmnt: off for the logits
      
      * on the fly mel bank
      
      * small nit
      
      * update test
      
      * update tokenizer
      
      * nit feature extraction
      
      * update
      
      * update tokenizer test
      
      * adds logit processor and update tokenizer to get supress tokens
      
      * style
      
      * clean convert
      
      * revert to original modeling tf utils
      
      * Update
      
      * update
      
      * nit
      
      * clean convert file
      
      * update tests and nits
      
      * quality
      
      * slow generation test
      
      * ffn_dim to allow customization
      
      * update readme
      
      * add to toctreee
      
      * start fixing integration tests
      
      * update tests and code
      
      * fix feature extractor
      
      * fix config tests common
      
      * update code to fix tests
      
      * fix feature exctractor
      
      * nit feature extraction
      
      * update test for new feature extractor
      
      * style
      
      * add absrtact
      
      * large logits wioth custom decoder input ids
      
      * wraap around is otrch available
      
      * fix feature extractor
      
      * correct logits for whisper small.en
      
      * nit
      
      * fix encoder_attentino_mask
      
      * some fixes
      
      * remove unnecessary inputs
      
      * nits
      
      * add normalizer file
      
      * update etst tokenization
      
      * fix attention mask not defined
      
      * Add model to README
      
      * Fix doc tests
      
      * fix generate
      
      * remove uncoder attention mask useless
      
      * update test modeling whisper
      
      * update condfig to add second non supress tokens
      
      * nits on feature exrtactor
      
      * nit for test tokenizers
      
      * update etsts
      
      * update tests
      
      * update tokenization test
      
      * fixup
      
      * invalidated hf token. Clean convert openai to whisper
      
      * fix logit tests
      
      * fixup
      
      * clean merge
      
      * revert toc_tree changes
      
      * remove useless LogitProcessor
      
      * Update whisper .mdx
      
      * update config file doc
      
      * update configuration docstring
      
      * update test tokenization
      
      * update test tokenization
      
      * update tokenization whisper
      Added copied from where needed
      
      * update feature extraction
      
      * nit test name
      
      * style
      
      * quality
      
      * remove get suppress tokens and update non_speech tokens global variables
      
      * Update src/transformers/models/whisper/feature_extraction_whisper.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * clean modeling whisper and test
      Removed the attention mask arguments that are deprecated
      
      * fix large test
      
      * Add multilingual audio test, and translate test
      
      * style
      
      * fix larg multilingual test
      
      * nits
      
      * Update docs/source/en/model_doc/whisper.mdx
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * add copied from for attention layer
      
      * remove attention masks in doc
      
      * add english normalizer
      
      * update tokenization test
      
      * remove copied from in whisper attention : no bias in k_proj only
      
      * wrap around dependencies in english normalizer
      
      * style
      
      * correct import generation logits
      
      * for now, wrap feature extractor with torch
      
      * Update src/transformers/models/whisper/convert_openai_whisper_to_tfms.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/whisper/configuration_whisper.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/whisper.mdx
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * remove torch depencies for feature extraction and style
      
      * fixup
      
      * nit
      
      * update logitds
      
      * style
      
      * nit
      
      * nits and fix final tests
      
      * add `is_more_itertools_available` to utils
      
      * quality
      
      * add begin supress tokens, supress tokens to generate args and config
      
      * clean supressTokensLogitProcessor in generation logits
      
      * Nit naming
      
      * add supressTokensAtBegin
      
      * udpate tests, supress tokens to None or correct values
      
      * nit and style
      
      * update RAG to fit test and generate_logit
      
      * add copy pasted statment on english normalizer
      
      * add arguments to config_common_kwargs
      
      * Update src/transformers/generation_utils.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/generation_logits_process.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/whisper/configuration_whisper.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * revert changes based on reviews
      
      * update doc and nits
      
      * more nits
      
      * last nits
      
      * update test configuration common
      
      * add BART name in decoder attention mask documentation
      
      * Update src/transformers/models/whisper/modeling_whisper.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * style
      
      * nit
      
      * nit
      
      * add english.json file to git
      
      * nits on documentation
      
      * nit
      
      * nits
      
      * last styling
      
      * add main toctree file
      
      * remove sentence piece dependency
      
      * clean init file
      
      * fix tokenizer that has no dependencies on sentencepiece
      
      * update whisper init file, nit
      
      * remove english.json file
      
      * add get decoder prompt id
      
      * revert changes and add forced logit processor
      
      * nit
      
      * clean normalizer
      
      * remove protected
      
      * update
      
      * Update src/transformers/models/whisper/configuration_whisper.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * update based on review
      
      * Update src/transformers/models/whisper/configuration_whisper.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * add batched tests
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarNielsRogge <niels.rogge1@gmail.com>
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      45e14038
  5. 04 Oct, 2022 1 commit
  6. 30 Sep, 2022 3 commits
    • Kashif Rasul's avatar
      time series forecasting model (#17965) · 5cd16f01
      Kashif Rasul authored
      
      
      * initial files
      
      * initial model via cli
      
      * typos
      
      * make a start on the model config
      
      * ready with configuation
      
      * remove tokenizer ref.
      
      * init the transformer
      
      * added initial model forward to return dec_output
      
      * require gluonts
      
      * update dep. ver table and add as extra
      
      * fixed typo
      
      * add type for prediction_length
      
      * use num_time_features
      
      * use config
      
      * more config
      
      * typos
      
      * opps another typo
      
      * freq can be none
      
      * default via transformation is 1
      
      * initial transformations
      
      * fix imports
      
      * added transform_start_field
      
      * add helper to create pytorch dataloader
      
      * added inital val and test data loader
      
      * added initial distr head and loss
      
      * training working
      
      * remove TimeSeriesTransformerTokenizer
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/__init__.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/time_series_transformer/__init__.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * fixed copyright
      
      * removed docs
      
      * remove time series tokenizer
      
      * fixed docs
      
      * fix text
      
      * fix second
      
      * fix default
      
      * fix order
      
      * use config directly
      
      * undo change
      
      * fix comment
      
      * fix year
      
      * fix import
      
      * add additional arguments for training vs. test
      
      * initial greedy inference loop
      
      * fix inference
      
      * comment out token inputs to enc dec
      
      * Use HF encoder/decoder
      
      * fix inference
      
      * Use Seq2SeqTSModelOutput output
      
      * return Seq2SeqTSPredictionOutput
      
      * added default arguments
      
      * fix return_dict true
      
      * scale is a tensor
      
      * output static_features for inference
      
      * clean up some unused bits
      
      * fixed typo
      
      * set return_dict if none
      
      * call model once for both train/predict
      
      * use cache if future_target is none
      
      * initial generate func
      
      * generate arguments
      
      * future_time_feat is required
      
      * return SampleTSPredictionOutput
      
      * removed unneeded classes
      
      * fix when params is none
      
      * fix return dict
      
      * fix num_attention_heads
      
      * fix arguments
      
      * remove unused shift_tokens_right
      
      * add different dropout configs
      
      * implement FeatureEmbedder, Scaler and weighted_average
      
      * remove gluonts dependency
      
      * fix class names
      
      * avoid _variable names
      
      * remove gluonts dependency
      
      * fix imports
      
      * remove gluonts from configuration
      
      * fix docs
      
      * fixed typo
      
      * move utils to examples
      
      * add example requirements
      
      * config has no freq
      
      * initial run_ts_no_trainer
      
      * remove from ignore
      
      * fix output_attentions and removed unsued getters/setters
      
      * removed unsed tests
      
      * add dec seq len
      
      * add test_attention_outputs
      
      * set has_text_modality=False
      
      * add config attribute_map
      
      * make style
      
      * make fix-copies
      
      * add encoder_outputs to TimeSeriesTransformerForPrediction forward
      
      * Improve docs, add model to README
      
      * added test_forward_signature
      
      * More improvements
      
      * Add more copied from
      
      * Fix README
      
      * Fix remaining quality issues
      
      * updated encoder and decoder
      
      * fix generate
      
      * output_hidden_states and use_cache are optional
      
      * past key_values returned too
      
      * initialize weights of distribution_output module
      
      * fixed more tests
      
      * update test_forward_signature
      
      * fix return_dict outputs
      
      * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * removed commented out tests
      
      * added neg. bin and normal output
      
      * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * move to one line
      
      * Add docstrings
      
      * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * add try except for assert and raise
      
      * try and raise exception
      
      * fix the documentation formatting
      
      * fix assert call
      
      * fix docstring formatting
      
      * removed input_ids from DOCSTRING
      
      * Update input docstring
      
      * Improve variable names
      
      * Update order of inputs
      
      * Improve configuration
      
      * Improve variable names
      
      * Improve docs
      
      * Remove key_length from tests
      
      * Add extra docs
      
      * initial unittests
      
      * added test_inference_no_head test
      
      * added test_inference_head
      
      * add test_seq_to_seq_generation
      
      * make style
      
      * one line
      
      * assert mean prediction
      
      * removed comments
      
      * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * fix order of args
      
      * make past_observed_mask optional as well
      
      * added Amazon license header
      
      * updated utils with new fieldnames
      
      * make style
      
      * cleanup
      
      * undo position of past_observed_mask
      
      * fix import
      
      * typo
      
      * more typo
      
      * rename example files
      
      * remove example for now
      
      * Update docs/source/en/_toctree.yml
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update modeling_time_series_transformer.py
      
      fix style
      
      * fixed typo
      
      * fix typo and grammer
      
      * fix style
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: default avatarNielsRogge <niels.rogge1@gmail.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      5cd16f01
    • Matt's avatar
      Rebase ESM PR and update all file formats (#19055) · 368b649a
      Matt authored
      
      
      * Rebase ESM PR and update all file formats
      
      * Fix test relative imports
      
      * Add __init__.py to the test dir
      
      * Disable gradient checkpointing
      
      * Remove references to TFESM... FOR NOW >:|
      
      * Remove completed TODOs from tests
      
      * Convert docstrings to mdx, fix-copies from BERT
      
      * fix-copies for the README and index
      
      * Update ESM's __init__.py to the modern format
      
      * Add to _toctree.yml
      
      * Ensure we correctly copy the pad_token_id from the original ESM model
      
      * Ensure we correctly copy the pad_token_id from the original ESM model
      
      * Tiny grammar nitpicks
      
      * Make the layer norm after embeddings an optional flag
      
      * Make the layer norm after embeddings an optional flag
      
      * Update the conversion script to handle other model classes
      
      * Remove token_type_ids entirely, fix attention_masking and add checks to convert_esm.py
      
      * Break the copied from link from BertModel.forward to remove token_type_ids
      
      * Remove debug array saves
      
      * Begin ESM-2 porting
      
      * Add a hacky workaround for the precision issue in original repo
      
      * Code cleanup
      
      * Remove unused checkpoint conversion code
      
      * Remove unused checkpoint conversion code
      
      * Fix copyright notices
      
      * Get rid of all references to the TF weights conversion
      
      * Remove token_type_ids from the tests
      
      * Fix test code
      
      * Update src/transformers/__init__.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/__init__.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update README.md
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Add credit
      
      * Remove _ args and __ kwargs in rotary embedding
      
      * Assertively remove asserts
      
      * Replace einsum with torch.outer()
      
      * Fix docstring formatting
      
      * Remove assertions in tokenization
      
      * Add paper citation to ESMModel docstring
      
      * Move vocab list to single line
      
      * Remove ESMLayer from init
      
      * Add Facebook copyrights
      
      * Clean up RotaryEmbedding docstring
      
      * Fix docstring formatting
      
      * Fix docstring for config object
      
      * Add explanation for new config methods
      
      * make fix-copies
      
      * Rename all the ESM- classes to Esm-
      
      * Update conversion script to allow pushing to hub
      
      * Update tests to point at my repo for now
      
      * Set config properly for tests
      
      * Remove the gross hack that forced loss of precision in inv_freq and instead copy the data from the model being converted
      
      * make fixup
      
      * Update expected values for slow tests
      
      * make fixup
      
      * Remove EsmForCausalLM for now
      
      * Remove EsmForCausalLM for now
      
      * Fix padding idx test
      
      * Updated README and docs with ESM-1b and ESM-2 separately (#19221)
      
      * Updated README and docs with ESM-1b and ESM-2 separately
      
      * Update READMEs, longer entry with 3 citations
      
      * make fix-copies
      Co-authored-by: default avatarYour Name <you@example.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avatarTom Sercu <tsercu@fb.com>
      Co-authored-by: default avatarYour Name <you@example.com>
      368b649a
    • NielsRogge's avatar
      Add MarkupLM (#19198) · f3d2f7a6
      NielsRogge authored
      
      
      * First draft
      
      * Make basic test work
      
      * Fix most tokenizer tests
      
      * More improvements
      
      * Make more tests pass
      
      * Fix more tests
      
      * Fix some code quality
      
      * Improve truncation
      
      * Implement feature extractor
      
      * Improve feature extractor and add tests
      
      * Improve feature extractor tests
      
      * Fix pair_input test partly
      
      * Add fast tokenizer
      
      * Improve implementation
      
      * Fix rebase
      
      * Fix rebase
      
      * Fix most of the tokenizer tests.
      
      * propose solution for fast
      
      * add: integration test for fasttokenizer, warning for decode, fix template in slow tokenizer
      
      * add: modify markuplmconverter
      
      * add: some modify on converter and tokenizerfast
      
      * Fix style, copies
      
      * Make fixup
      
      * Update tokenization_markuplm.py
      
      * Update test_tokenization_markuplm.py
      
      * Update markuplm related
      
      * Improve processor, add integration test
      
      * Add processor test file
      
      * Improve processor
      
      * Improve processor tests
      
      * Fix more processor tests
      
      * Fix processor tests
      
      * Update docstrings
      
      * Add Copied from statements
      
      * Add more Copied from statements
      
      * Add code examples
      
      * Improve code examples
      
      * Add model to doc tests
      
      * Adding dependency check
      
      * Add dummy file
      
      * Add requires_backends
      
      * Add model to toctree
      
      * Fix more things, disable dependency check for now
      
      * Apply more suggestions
      
      * Add soft dependency
      
      * Add annotators to tests
      
      * Fix style
      
      * Remove from_slow=True
      
      * Remove print statements
      
      * Add sanity check
      
      * Fix processor test
      
      * Fix processor tests, add more docs
      
      * Add doc tests for mdx file
      
      * Add more tips
      
      * Apply suggestions
      Co-authored-by: default avatarNiels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      Co-authored-by: default avatarlockon-n <45759388+lockon-n@users.noreply.github.com>
      Co-authored-by: default avatarSaulLu <lucilesaul.com@gmail.com>
      Co-authored-by: default avatarlockon-n <dd098309@126.com>
      f3d2f7a6
  7. 22 Sep, 2022 2 commits
  8. 16 Sep, 2022 1 commit
  9. 14 Sep, 2022 3 commits
    • Lysandre's avatar
      Dev version · 16913b3c
      Lysandre authored
      16913b3c
    • Shinya Otani's avatar
      Add support for Japanese GPT-NeoX-based model by ABEJA, Inc. (#18814) · f5f430e5
      Shinya Otani authored
      * add gpt-neox-japanese model and tokenizer as new model
      
      * Correction to PR's comment for GPT NeoX Japanese
      - Fix to be able to use gpu
      - Add comment # Copied... at the top of RotaryEmbedding
      - Implement nn.Linear instead of original linear class
      - Add generation test under @slow
      
      * fix bias treatment for gpt-neox-japanese
      
      * Modidy gpt-neox-japanese following PR
      - add doc for bias_dropout_add
      - style change following a PR comment
      
      * add document for gpt-neox-japanese
      
      * remove unused import from gpt-neox-japanese
      
      * fix README for gpt-neox-japanese
      f5f430e5
    • NielsRogge's avatar
      Add Deformable DETR (#17281) · 59407bbe
      NielsRogge authored
      
      
      * First draft
      
      * More improvements
      
      * Improve model, add custom CUDA code
      
      * Import torch before
      
      * Add script that imports custom layer
      
      * Add everything in new ops directory
      
      * Import custom layer in modeling file
      
      * Fix ARCHIVE_MAP typo
      
      * Creating the custom kernel on the fly.
      
      * Import custom layer in modeling file
      
      * More improvements
      
      * Fix CUDA loading
      
      * More improvements
      
      * Improve conversion script
      
      * Improve conversion script
      
      * Make it work until encoder_outputs
      
      * Make forward pass work
      
      * More improvements
      
      * Make logits match original implementation
      
      * Make implementation also support single_scale model
      
      * Add support for single_scale and dilation checkpoint
      
      * Add support for with_box_refine model
      
      * Support also two stage model
      
      * Improve tests
      
      * Fix more tests
      
      * Make more tests pass
      
      * Upload all models to the hub
      
      * Clean up some code
      
      * Improve decoder outputs
      
      * Rename intermediate hidden states and reference points
      
      * Improve model outputs
      
      * Move tests to dedicated folder
      
      * Improve model outputs
      
      * Fix retain_grad test
      
      * Improve docs
      
      * Clean up and make test_initialization pass
      
      * Improve variable names
      
      * Add copied from statements
      
      * Improve docs
      
      * Fix style
      
      * Improve docs
      
      * Improve docs, move tests to model folder
      
      * Fix rebase
      
      * Remove DetrForSegmentation from auto mapping
      
      * Apply suggestions from code review
      
      * Improve variable names and docstrings
      
      * Apply some more suggestions from code review
      
      * Apply suggestion from code review
      
      * better docs and variables names
      
      * hint to num_queries and two_stage confusion
      
      * remove asserts and code refactor
      
      * add exception if two_stage is True and with_box_refine is False
      
      * use f-strings
      
      * Improve docs and variable names
      
      * Fix code quality
      
      * Fix rebase
      
      * Add require_torch_gpu decorator
      
      * Add pip install ninja to CI jobs
      
      * Apply suggestion of @sgugger
      
      * Remove DeformableDetrForObjectDetection from auto mapping
      
      * Remove DeformableDetrModel from auto mapping
      
      * Add model to toctree
      
      * Add model back to mappings, skip model in pipeline tests
      
      * Apply @sgugger's suggestion
      
      * Fix imports in the init
      
      * Fix copies
      
      * Add CPU implementation
      
      * Comment out GPU function
      
      * Undo previous change
      
      * Apply more suggestions
      
      * Remove require_torch_gpu annotator
      
      * Fix quality
      
      * Add logger.info
      
      * Fix logger
      
      * Fix variable names
      
      * Fix initializaztion
      
      * Add missing initialization
      
      * Update checkpoint name
      
      * Add model to doc tests
      
      * Add CPU/GPU equivalence test
      
      * Add Deformable DETR to pipeline tests
      
      * Skip model for object detection pipeline
      Co-authored-by: default avatarNicolas Patry <patry.nicolas@protonmail.com>
      Co-authored-by: default avatarNouamane Tazi <nouamane98@gmail.com>
      Co-authored-by: default avatarSylvain Gugger <Sylvain.gugger@gmail.com>
      59407bbe
  10. 09 Sep, 2022 1 commit
  11. 08 Sep, 2022 2 commits
    • NielsRogge's avatar
      Add X-CLIP (#18852) · bb6f6d53
      NielsRogge authored
      * First draft
      
      * Improve conversion script
      
      * Make vision encoder work
      
      * More improvements
      
      * Improve conversion script
      
      * Fix quality
      
      * Add MultiframeIntegrationTransformer
      
      * More improvements
      
      * Make MiT output work
      
      * Fix quality
      
      * Add prompts generator
      
      * Add tests
      
      * Fix some tests
      
      * Fix some more tests
      
      * Fix more tests
      
      * Improve conversion script
      
      * Fix model outputs
      
      * Fix more tests
      
      * Add XClipProcessor
      
      * Use processor in conversion script
      
      * Fix integration test
      
      * Update README, fix docs
      
      * Fix all tests
      
      * Add MIT output to XClipOutput
      
      * Create better variable names
      
      * Rename XClip to XCLIP
      
      * Extend conversion script
      
      * Add support for large models
      
      * Add support for 16 frame models
      
      * Add another model'
      
      * Fix module issue
      
      * Apply suggestions from code review
      
      * Add figure to docs
      
      * Fix CLIPProcessor issue
      
      * Apply suggestions from code review
      
      * Delete file
      
      * Convert more checkpoints
      
      * Convert last checkpoint
      
      * Update nielsr to microsoft
      bb6f6d53
    • Devlee247's avatar
      Fix LayoutXLM wrong link in README (#18932) · 9832ac7c
      Devlee247 authored
      * fix LayoutXLM wrong link in README
      
      * fix LayoutXLM worng link in index.mdx
      9832ac7c
  12. 02 Sep, 2022 1 commit
    • Jason Phang's avatar
      PEGASUS-X (#18551) · 53e33e6f
      Jason Phang authored
      * PegasusX Initial commit
      
      * rename
      
      * pegasus X implementation
      
      * pegx update
      
      * pegx fix
      
      * pegasus-x fixes
      
      * pegx updates
      
      * cleanup
      
      * cleanup
      
      * cleanup
      
      * tests
      
      * stylefixes
      
      * Documentation update
      
      * Model hub fix
      
      * cleanup
      
      * update
      
      * update
      
      * testfix
      
      * Check fix
      
      * tweaks for merging
      
      * style
      
      * style
      
      * updates for pr
      
      * style
      
      * change pegasus-x repo
      53e33e6f
  13. 29 Aug, 2022 1 commit
  14. 12 Aug, 2022 1 commit
    • NielsRogge's avatar
      Add Donut (#18488) · 2ab790e8
      NielsRogge authored
      
      
      * First draft
      
      * Improve script
      
      * Update script
      
      * Make conversion work
      
      * Add final_layer_norm attribute to Swin's config
      
      * Add DonutProcessor
      
      * Convert more models
      
      * Improve feature extractor and convert base models
      
      * Fix bug
      
      * Improve integration tests
      
      * Improve integration tests and add model to README
      
      * Add doc test
      
      * Add feature extractor to docs
      
      * Fix integration tests
      
      * Remove register_buffer
      
      * Fix toctree and add missing attribute
      
      * Add DonutSwin
      
      * Make conversion script work
      
      * Improve conversion script
      
      * Address comment
      
      * Fix bug
      
      * Fix another bug
      
      * Remove deprecated method from docs
      
      * Make Swin and Swinv2 untouched
      
      * Fix code examples
      
      * Fix processor
      
      * Update model_type to donut-swin
      
      * Add feature extractor tests, add token2json method, improve feature extractor
      
      * Fix failing tests, remove integration test
      
      * Add do_thumbnail for consistency
      
      * Improve code examples
      
      * Add code example for document parsing
      
      * Add DonutSwin to MODEL_NAMES_MAPPING
      
      * Add model to appropriate place in toctree
      
      * Update namespace to appropriate organization
      Co-authored-by: default avatarNiels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      2ab790e8
  15. 06 Aug, 2022 1 commit
  16. 04 Aug, 2022 1 commit
    • NielsRogge's avatar
      Add VideoMAE (#17821) · f9a0008d
      NielsRogge authored
      
      
      * First draft
      
      * Add VideoMAEForVideoClassification
      
      * Improve conversion script
      
      * Add VideoMAEForPreTraining
      
      * Add VideoMAEFeatureExtractor
      
      * Improve VideoMAEFeatureExtractor
      
      * Improve docs
      
      * Add first draft of model tests
      
      * Improve VideoMAEForPreTraining
      
      * Fix base_model_prefix
      
      * Make model take pixel_values of shape (B, T, C, H, W)
      
      * Add loss computation of VideoMAEForPreTraining
      
      * Improve tests
      
      * Improve model tests茅
      
      * Make all tests pass
      
      * Add VideoMAE to main README
      
      * Add tests for VideoMAEFeatureExtractor
      
      * Add integration test
      
      * Improve conversion script
      
      * Rename patch embedding class
      
      * Remove VideoMAELayer from init
      
      * Update design of patch embeddings
      
      * Improve comments
      
      * Improve conversion script
      
      * Improve conversion script
      
      * Add conversion of pretrained model
      
      * Add loss verification of pretrained model
      
      * Add loss verification of unnormalized targets
      
      * Add integration test for pretraining model
      
      * Apply suggestions from code review
      
      * Fix bug to make feature extractor resize only shorter edge
      
      * Address more comments
      
      * Improve normalization of videos
      
      * Add doc examples
      
      * Move constants to dedicated script
      
      * Remove scripts
      
      * Transfer checkpoints, fix docs
      
      * Update script
      
      * Update image mean and std
      
      * Fix doc tests
      
      * Set return_tensors to NumPy by default
      
      * Revert the previous change
      Co-authored-by: default avatarNiels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
      f9a0008d
  17. 27 Jul, 2022 2 commits
    • Ritik Nandwal's avatar
      Add swin transformer v2 (#17469) · e87ac9d1
      Ritik Nandwal authored
      
      
      * Add files generated using transformer-cli add-new-model-like command
      
      * Add changes for swinv2 attention and forward method
      
      * Add fixes
      
      * Add modifications for weight conversion and remaining args in swin model
      
      * Add changes for patchmerging
      
      * Add changes for SwinV2selfattention
      
      * Update conversion script
      
      * Add final fixes for the swin_v2 model
      
      * Add changes for conversion script for pretrained window size case
      
      * Add pretrained window size value from config in SwinV2Encoder class
      
      * Make fixup
      
      * Add swinv2 to models_not_in_readme to utils/check_copies.py
      
      * Modify Swinv2v2 to Swin Transformer V2
      
      * Remove copied from, to run make fixup command
      
      * Add updates to swinv2tf from main branch
      
      * Add pretrained_window_size to config, to make tests pass
      
      * Add modified weights from nandwalritik profile for swinv2
      
      * Update model weights from swinv2 from nandwalritik profile
      
      * Add fix for build_pr_documentation CI fix
      
      * Add fixes for weight conversion
      
      * Add change to make input with padding work
      
      * Add fixes for test cases
      
      * Add few changes from swin to swinv2 to pass test cases
      
      * Remove tests for tensorflow as swinv2 for TF is not added yet
      
      * Overide test_pt_tf_model_equivalence function as TF implementation for swinv2 is not added yet
      
      * Add modeling_tf_swinv2 to _ignore_modules as test file is removed for this one right now.
      
      * Update docs url for swinv2 in README.md
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Undo changes for check_repo
      
      * Update url in readme.md
      
      * Remove overrided function to test pt_tf_model_equivalence
      
      * Remove TF model imports for Swinv2 as its not implemented in this PR
      
      * Add changes for index.mdx
      
      * Add swinv2 papers link,abstract and contributors details
      
      * Rename cpb_mlp to continous_position_bias_mlp
      
      * Add tips for swinv2 model
      
      * Update src/transformers/models/swinv2/configuration_swinv2.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/swinv2/configuration_swinv2.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Fix indentation for docstring example in src/transformers/models/swinv2/configuration_swinv2.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update import order in src/transformers/models/swinv2/configuration_swinv2.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Add copyright statements in weights conversion script.
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Remove Swinv2 from models_not_in_readme
      
      * Reformat code
      
      * Remove TF implementation file for swinv2
      
      * Update start docstring.
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Add changes for docstring
      
      * Update orgname for weights to microsoft
      
      * Remove to_2tuple function
      
      * Add copied from statements wherever applicable
      
      * Add copied from to Swinv2ForMaskedImageModelling class
      
      * Reformat code.
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Add unittest.skip(with reason.) for test_inputs_embeds test case.
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Add updates for test_modeling_swinv2.py
      
      * Add @unittest.skip() annotation for clarity to create_and_test_config_common_properties function
      
      * Add continuous_position_bias_mlp parameter to conversion script
      
      * Add test for testing masked_image_modelling for swinv2
      
      * Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/swinv2.mdx
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update docs/source/en/model_doc/swinv2.mdx
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Add suggested changes
      
      * Add copied from to forward methods of Swinv2Stage and Swinv2Encoder
      
      * Add push_to_hub flag to weight conversion script
      
      * Change order or Swinv2DropPath class
      
      * Add id2label mapping for imagenet 21k
      
      * Add updated url for SwinV2 functions and classes used in implementation
      
      * Update input_feature dimensions format, mentioned in comments.
      Co-authored-by: default avatarAlara Dirik <8944735+alaradirik@users.noreply.github.com>
      
      * Add suggested changes for modeling_swin2.py
      
      * Update docs
      
      * Remove create_and_test_config_common_properties function, as test_model_common_attributes is sufficient.
      
      * Fix indentation.
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Add changes for making Nit objects in code style
      
      * Add suggested changes
      
      * Add suggested changes for test_modelling_swinv2
      
      * make fix-copies
      
      * Update docs/source/en/model_doc/swinv2.mdx
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      Co-authored-by: default avatarAlara Dirik <8944735+alaradirik@users.noreply.github.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      e87ac9d1
    • Lysandre's avatar
      Dev version · c89a592e
      Lysandre authored
      c89a592e
  18. 22 Jul, 2022 1 commit
    • Alara Dirik's avatar
      Add OWL-ViT model for zero-shot object detection (#17938) · 12d66b47
      Alara Dirik authored
      * add owlvit model skeleton
      
      * add class and box predictor heads
      
      * convert modified flax clip to pytorch
      
      * fix box and class predictors
      
      * add OwlViTImageTextEmbedder
      
      * convert class and box head checkpoints
      
      * convert image text embedder checkpoints
      
      * add object detection head
      
      * fix bugs
      
      * update conversion script
      
      * update conversion script
      
      * fix q,v,k,out weight conversion conversion
      
      * add owlvit object detection output
      
      * fix bug in image embedder
      
      * fix bugs in text embedder
      
      * fix positional embeddings
      
      * fix bug in inference mode vision pooling
      
      * update docs, init tokenizer and processor files
      
      * support batch processing
      
      * add OwlViTProcessor
      
      * remove merge conflicts
      
      * readd owlvit imports
      
      * fix bug in OwlViTProcessor imports
      
      * fix bugs in processor
      
      * update docs
      
      * fix bugs in processor
      
      * update owlvit docs
      
      * add OwlViTFeatureExtractor
      
      * style changes, add postprocess method to feature extractor
      
      * add feature extractor and processor tests
      
      * add object detection tests
      
      * update conversion script
      
      * update config paths
      
      * update config paths
      
      * fix configuration paths and bugs
      
      * fix bugs in OwlViT tests
      
      * add import checks to processor
      
      * fix docs and minor issues
      
      * fix docs and minor issues
      
      * fix bugs and issues
      
      * fix bugs and issues
      
      * fix bugs and issues
      
      * fix bugs and issues
      
      * update docs and examples
      
      * fix bugs and issues
      
      * update conversion script, fix positional embeddings
      
      * process 2D input ids, update tests
      
      * fix style and quality issues
      
      * update docs
      
      * update docs and imports
      
      * update OWL-ViT index.md
      
      * fix bug in OwlViT feature ext tests
      
      * fix code examples, return_dict by default
      
      * return_dict by default
      
      * minor fixes, add tests to processor
      
      * small fixes
      
      * add output_attentions arg to main model
      
      * fix bugs
      
      * remove output_hidden_states arg from main model
      
      * update self.config variables
      
      * add option to return last_hidden_states
      
      * fix bug in config variables
      
      * fix copied from statements
      
      * fix small issues and bugs
      
      * fix bugs
      
      * fix bugs, support greyscale images
      
      * run fixup
      
      * update repo name
      
      * merge OwlViTImageTextEmbedder with obj detection head
      
      * fix merge conflict
      
      * fix merge conflict
      
      * make fixup
      
      * fix bugs
      
      * fix bugs
      
      * add additional processor test
      12d66b47
  19. 19 Jul, 2022 2 commits
  20. 18 Jul, 2022 1 commit
  21. 29 Jun, 2022 2 commits
  22. 28 Jun, 2022 1 commit
  23. 24 Jun, 2022 1 commit
    • rooa's avatar
      Add CodeGen model (#17443) · d6b6fb99
      rooa authored
      
      
      * Add CodeGen model
      
      * Add missing key and switch order of super()
      
      * Fix torch.ones init with uint8 instead of bool
      
      * Address comments: copy statements and doc
      
      * update tests
      
      * remove old model parallel
      
      * fix batch gen tests
      
      * fix batch gen test
      
      * update test_gpt2_sample_max_time
      
      * fix codgen test and revert gpt2 test change
      
      * Fix incorrect tie_word_embedding value, typo, URL
      
      * Fix model order in README and styling
      
      * Reorder model list alphabetically
      
      * Set tie_word_embedding to False by default
      
      * Apply suggestions from code review
      
      * Better attn mask name & remove attn masked_bias
      
      * add tokenizer for codegen
      
      * quality
      
      * doc tokenizer
      
      * fix-copies
      
      * add CodeGenTokenizer in converter
      
      * make truncation optional
      
      * add test for truncation
      
      * add copyright
      
      * fix-copies
      
      * fix fast tokenizer decode
      
      * Update src/transformers/models/codegen/tokenization_codegen.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * increase vocab_size in tests
      Co-authored-by: default avatarpatil-suraj <surajp815@gmail.com>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      d6b6fb99
  24. 23 Jun, 2022 1 commit
  25. 21 Jun, 2022 1 commit
  26. 16 Jun, 2022 1 commit
  27. 15 Jun, 2022 1 commit
  28. 13 Jun, 2022 1 commit
    • Daniel Stancl's avatar
      Add `LongT5` model (#16792) · a72f1c9f
      Daniel Stancl authored
      
      
      * Initial commit
      
      * Make some fixes
      
      * Make PT model full forward pass
      
      * Drop TF & Flax implementation, fix copies etc
      
      * Add Flax model and update some corresponding stuff
      
      * Drop some TF things
      
      * Update config and flax local attn
      
      * Add encoder_attention_type to config
      
      * .
      
      * Update docs
      
      * Do some cleansing
      
      * Fix some issues -> make style; add some docs
      
      * Fix position_bias + mask addition + Update tests
      
      * Fix repo consistency
      
      * Fix model consistency by removing flax operation over attn_mask
      
      * [WIP] Add PT TGlobal LongT5
      
      * .
      
      * [WIP] Add flax tglobal model
      
      * [WIP] Update flax model to use the right attention type in the encoder
      
      * Fix flax tglobal model forward pass
      
      * Make the use of global_relative_attention_bias
      
      * Add test suites for TGlobal model
      
      * Fix minor bugs, clean code
      
      * Fix pt-flax equivalence though not convinced with correctness
      
      * Fix LocalAttn implementation to match the original impl. + update READMEs
      
      * Few updates
      
      * Update: [Flax] improve large model init and loading #16148
      
      * Add ckpt conversion script accoring to #16853 + handle torch device placement
      
      * Minor updates to conversion script.
      
      * Typo: AutoModelForSeq2SeqLM -> FlaxAutoModelForSeq2SeqLM
      
      * gpu support + dtype fix
      
      * Apply some suggestions from code review
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * * Remove (de)parallelize stuff
      * Edit shape comments
      * Update README.md
      * make fix-copies
      
      * Remove caching logic for local & tglobal attention
      
      * Apply another batch of suggestions from code review
      
      * Add missing checkpoints
      * Format converting scripts
      * Drop (de)parallelize links from longT5 mdx
      
      * Fix converting script + revert config file change
      
      * Revert "Remove caching logic for local & tglobal attention"
      
      This reverts commit 2a619828f6ddc3e65bd9bb1725a12b77fa883a46.
      
      * Stash caching logic in Flax model
      
      * Make side relative bias used always
      
      * Drop caching logic in PT model
      
      * Return side bias as it was
      
      * Drop all remaining model parallel logic
      
      * Remove clamp statements
      
      * Move test files to the proper place
      
      * Update docs with new version of hf-doc-builder
      
      * Fix test imports
      
      * Make some minor improvements
      
      * Add missing checkpoints to docs
      * Make TGlobal model compatible with torch.onnx.export
      * Replace some np.ndarray with jnp.ndarray
      
      * Fix TGlobal for ONNX conversion + update docs
      
      * fix _make_global_fixed_block_ids and masked neg  value
      
      * update flax model
      
      * style and quality
      
      * fix imports
      
      * remove load_tf_weights_in_longt5 from init and fix copies
      
      * add slow test for TGlobal model
      
      * typo fix
      
      * Drop obsolete is_parallelizable and one warning
      
      * Update __init__ files to fix repo-consistency
      
      * fix pipeline test
      
      * Fix some device placements
      
      * [wip]: Update tests -- need to generate summaries to update expected_summary
      
      * Fix quality
      
      * Update LongT5 model card
      
      * Update (slow) summarization tests
      
      * make style
      
      * rename checkpoitns
      
      * finish
      
      * fix flax tests
      Co-authored-by: default avatarphungvanduy <pvduy23@gmail.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarpatil-suraj <surajp815@gmail.com>
      a72f1c9f
  29. 09 Jun, 2022 1 commit
  30. 07 Jun, 2022 1 commit
    • Chan Woo Kim's avatar
      M-CTC-T Model (#16402) · 119e3c0f
      Chan Woo Kim authored
      
      
      * added cbs to notebooks, made copy-paste error fix in generation_utils
      
      * initial push for mctc model
      
      * mctc feature extractor done
      
      * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.
      
      * added processor, tokenizer and their tests for MCTC. Have added an MCTC modeling test, adjusting model code accordingly.
      
      * passing attention, now struggling to figure out how attention masks make sense here
      
      * works when excluding attention masks. ask later how one would integrate attention maskshere
      
      * bizarre configuration error (model prefix comes first in config dict json and messes up the order)
      
      * all passing but bizzarre config dict ordering issue when to_dict
      
      * passing all major tests
      
      * feature extraction, processor, tokenizer added & tests passing
      
      * style & consistency & other logistical fixes
      
      * copy paste fix
      
      * model after feature extraction working
      
      * commiting final feature extraction results; need to fix normalization
      
      * feature extraction passing tests; probably should add tests on the specific flashlight-copied functions?
      
      * delete print ; format code a bit
      
      * fixing tests
      
      * passing major tests
      
      * fixing styles
      
      * completed tokenization test with real example; not sure if these values are entirely correct.
      
      * last test fixes from local
      
      * reverting accidentally included custom setup configs
      
      * remove load tf weights; fix config error
      
      * testing couldnt import featureextractor
      
      * fix docs
      
      * fix docs
      
      * resolving comments
      
      * style fixes
      
      * style fixes
      
      * Update to MCTCConv1dSubSampler
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * relposemb fixes
      
      * conv1d name issue; expecting config fail with paraentheses
      
      * fix config issue
      
      * fix config issue
      
      * fix config issue
      
      * change everything to MCTCT
      
      * fixing naming change errors
      
      * archive list
      
      * copyrights and docs
      
      * copyrights and docs
      
      * copyrights and docs
      
      * merge resolution
      
      * move tests, fix to changed optionaldependency structure
      
      * test directories changed
      
      * fixing tests
      
      * how to avoid tf tests?
      
      * how to avoid tf tests?
      
      * tests passing locally
      
      * allow mctctprocessor imported any env
      
      * allow mctctprocessor imported any env
      
      * fixed second round of feedback, need to fix docs
      
      * doc changes not being applied
      
      * all fixed
      
      * style fix
      
      * feedback fixes
      
      * fix copies and feature extraction style fix
      
      * Update tests/models/visual_bert/test_modeling_visual_bert.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * copy paste huggingface:main visual bert
      
      * added eof newline to visual bert; all tests are passing otherwise
      
      * fix slow tests by adding attention mask
      
      * change model id to speechbrain
      
      * make fix-copies
      
      * fix readme unwanted deletes
      
      * fixing readmes, make fix-copies
      
      * consistent M-CTC-T naming
      
      * Update src/transformers/models/mctct/__init__.py
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      
      * all fixed but variable naming
      
      * adjust double quotes
      
      * fixed variable names
      
      * copyright and mr quilter
      
      * Apply suggestions from code review
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * correct slow tests
      
      * make fix-copies
      
      * Update src/transformers/models/mctct/configuration_mctct.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * Update src/transformers/models/mctct/configuration_mctct.py
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      
      * m-ctc-t not mctct
      Co-authored-by: default avatarPatrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
      119e3c0f