"vscode:/vscode.git/clone" did not exist on "b85b88069a778f0ffbb7a0f6389e18fca9432dcf"
  1. 20 Mar, 2023 1 commit
  2. 17 Mar, 2023 7 commits
  3. 16 Mar, 2023 3 commits
    • Jason Phang's avatar
      LLaMA Implementation (#21955) · 0041be5b
      Jason Phang authored
      
      
      * LLaMA
      
      * sharding and docs
      
      * tweak
      
      * black
      
      * inits
      
      * ruff
      
      * LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP
      
      * init
      
      * no checkpoint
      
      * docs
      
      * ruff
      
      * type_vocab_size
      
      * tokenizer fixes
      
      * tokenizer fixes
      
      * Update tokenization_llama.py
      
      * Update tokenization_llama.py
      
      * Update configuration_llama.py
      
      * Update modeling_llama.py
      
      * tokenizer add_bos by default
      
      * licenses
      
      * remove decoder
      
      * norms and mlp
      
      * rope overhaul
      
      * tweaks
      
      * black
      
      * mention OPT implementation
      
      * off-by-one naming
      
      * typo
      
      * fix
      
      * tokenization fix and slicing bug
      
      * padding config
      
      * cleanup
      
      * black
      
      * update tests
      
      * undo typo
      
      * fix vocab caching logic
      
      * ruff
      
      * docbuilder
      
      * attn fix from BlackSamorez
      
      * initial feedback
      
      * typo
      
      * docs
      
      * llama case
      
      * llama case
      
      * load checkpoint docs
      
      * comment about tokenizer
      
      * tokenizer defaults
      
      * clear past_key_values if use_cache=False
      
      * last tweaks
      
      * last tweaks
      
      * last tweaks
      
      * last tweaks
      
      ---------
      Co-authored-by: default avatarStella Biderman <stellabiderman@gmail.com>
      0041be5b
    • Baelish03's avatar
      Italian Translation of migration.mdx (#22183) · 09922da4
      Baelish03 authored
      * Tranlstion Italian: migration
      
      * Update migration.mdx
      
      minor fixes
      
      * Update _toctree.yml
      
      * Delete migration.mdx
      
      * Add italian translation of migration.mdx
      
      * Update of migration.mdx translation and toctree
      09922da4
    • Alara Dirik's avatar
      Fix typo in Align docs (#22199) · 1485bd9c
      Alara Dirik authored
      Fix align docs typo
      1485bd9c
  4. 14 Mar, 2023 3 commits
  5. 13 Mar, 2023 7 commits
  6. 10 Mar, 2023 2 commits
  7. 09 Mar, 2023 3 commits
  8. 08 Mar, 2023 2 commits
  9. 07 Mar, 2023 2 commits
    • Eli Simhayev's avatar
      [Time-Series] informer model (#21099) · 8abe4930
      Eli Simhayev authored
      * added informer to gitignore
      
      * added informer to gitignore
      
      * WIP informer2020
      
      * added checking that instantiate works
      
      * added config using gluonTS by kashif
      
      * WIP config
      
      * adding informeConfig. need to remove FeatureEmbedder
      
      * done InformerConfig, but need to change the names
      
      * Done informer model init. working on enc-dec
      
      * added things to address, after reading again enc-dec in the paper
      
      * done modeling - checking initialization work
      
      * added informer to gitignore
      
      * WIP informer2020
      
      * added checking that instantiate works
      
      * added config using gluonTS by kashif
      
      * WIP config
      
      * adding informeConfig. need to remove FeatureEmbedder
      
      * done InformerConfig, but need to change the names
      
      * Done informer model init. working on enc-dec
      
      * added things to address, after reading again enc-dec in the paper
      
      * done modeling - checking initialization work
      
      * moved enc-dec init to InformerEncoder/Decoder init
      
      * added 'init_std' to config, now model init works!
      
      * WIP conversion script, and added code sources
      
      * WIP conversion script: loading original informer pth works
      
      * WIP conversion script: change defaults in the config
      
      * WIP conversion script: supporting Informer input embedding
      
      * WIP conversion script: added parameters for the informer embed
      
      * WIP conversion script: change dim_feedforward=2048
      
      * WIP conversion script: remove unused args for loading checkpoint
      
      * just cleaning up
      
      * DataEmbedding removed, after thinking with Kashif
      
      * working on forward pass
      
      * WIP forward pass: trying to establish working batch for forward pass
      
      * cleaning and finalizing
      
      * adding HF names and docs
      
      * init after cleaning works
      
      * WIP in tests
      
      * added docs for the informer specific args
      
      * fix style
      
      * undo change
      
      * cleaning informer, now need to work only enc-dec
      
      * initial enc-dec classes
      
      * added encoder and decoder
      
      * added todo
      
      * add todos for conv_layers
      
      * added decoder docs from vanilla
      
      * added encoder docs from vanilla
      
      * remove encoder decoder from the original informer
      
      * removed AttentionLayer from the original paper
      
      * removed TriangularCausalMask, same as decoder_attention_mask
      
      * initial sparse attention
      
      * use conv_layers
      
      * fixed test_config test
      
      * fix parenthesis when itearting zip(layers, conv_layers)
      
      * error found in prob attention, added sizes as comments
      
      * fix sizes
      
      * added proposal for q_reduce indexing, and remove unused
      
      * WIP ProbMask, and changed factor=2 for testing
      
      * remove unused libs for this PR for creating the env
      
      * fix checking the attn_weights.size() after bmm
      
      * Q_reduce: changed from torch.gather to simple slicing
      
      * WIP calculate final attn_output
      
      * finish adding v_aggregated, attn_output ready
      
      * changed tgt_len to u in attention_mask, need to fix the size error
      
      * comment attention_mask for encoder, and fix if cond for v_agg
      
      * added ProbMask support (wip), removed old original code
      
      * finished ProbMask 馃槂
      
      
      
      * Revert "remove unused libs for this PR for creating the env"
      
      This reverts commit 11a081e09e92771e51a5d2758d53a9afb59547f0.
      
      * fixes
      
      * make style
      
      * fix initial tests
      
      * fix more tests
      
      * dry
      
      * make style
      
      * remove unused files
      
      * style
      
      * added integration tests
      
      * fix num_static_real_features
      
      * fix header
      
      * remove unused function
      
      * fix example
      
      * fix docs
      
      * Update src/transformers/models/informer/configuration_informer.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/informer/modeling_informer.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/informer/configuration_informer.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/informer/configuration_informer.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/informer/configuration_informer.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * Update src/transformers/models/informer/configuration_informer.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * fixes for reviewer
      
      * use prediction_length from model
      
      * fix style
      
      * fixed informer.mdx
      
      * added to index
      
      * updated readme
      
      * undo
      
      * make fix-copies
      
      * typo
      
      * fix copy
      
      * added Informer to toctree
      
      * in order
      
      * fixed comments
      
      * remove unneeded new lines in docs
      
      * make static real and cat optional
      
      * fix use of distil conv layers
      
      * fixed integration test
      
      * added checkpoint for convlayer
      
      * make fix-copies
      
      * updated from time series model
      
      * make fix-copies
      
      * copy decoder
      
      * fix unit tests
      
      * updated scaling config
      
      * fix integration tests
      
      * IGNORE_NON_TESTED
      
      * IGNORE_NON_AUTO_CONFIGURED
      
      * IGNORE_NON_AUTO_CONFIGURED
      
      * updated check configs
      
      * fix formatting
      
      * undo change from time series
      
      * prediction_length should not be None
      
      * aliign with the blog: prettify ProbSparse and change attention_factor  to sampling_factor
      
      * make style
      
      * make fix-copies
      
      * niels CR: update contributed by
      
      * niels CR: update configuration_informer.py
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * niels CR: update kashif -> huggingface
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      
      * niels CR: `sampling_factor` only relevant when `attention_type`=prob
      
      * make style
      
      * fixed U_part: added multiplication by `L_Q`
      
      * fixed bug: remove `is not None` from `if config.distil`
      
      * fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check
      
      * fix integration tests
      
      * updated model hub
      
      * do not shift as in training
      
      * undo
      
      * fix make-copies
      
      * make fix-copies
      
      * added `if prediction_length is None`
      
      * changed `ProbSparseAttention` to `InformerProbSparseAttention`
      
      * changed `V_sum` -> `v_mean_dim_time`
      
      * changed `ConvLayer` to `InformerConvLayer` and fixed `super()`
      
      * TimeSeriesTansformer->Informer in decoder's Copied from
      
      * more descriptive in ProbSparse
      
      * make style
      
      * fix coped from
      
      * Revert "added `if prediction_length is None`"
      
      This reverts commit b4cbddfa05e3bd739b79569cd3c3b89e316f2451.
      
      * fixed indent
      
      * use InformerSinusoidalPositionalEmbedding
      
      * make fix-style
      
      * fix from #21860
      
      * fix name
      
      * make fix-copies
      
      * use time series utils
      
      * fix dec num_heads
      
      * docstring
      
      * added time series util doc
      
      * _import_structure
      
      * formatting
      
      * changes from review
      
      * make style
      
      * fix docs
      
      * fix doc
      
      * removed NegativeLogLikelihood
      
      ---------
      Co-authored-by: default avatarKashif Rasul <kashif.rasul@gmail.com>
      Co-authored-by: default avatarNielsRogge <48327001+NielsRogge@users.noreply.github.com>
      8abe4930
    • Sanchit Gandhi's avatar
      [Whisper] Add model for audio classification (#21754) · 7c393181
      Sanchit Gandhi authored
      * [Whisper] Add model for audio classification
      
      * make fix-copies
      
      * add to docs
      
      * add docstring
      
      * empty returns
      
      * add code example
      
      * switch to fleurs
      
      * stick everything on one line
      7c393181
  10. 06 Mar, 2023 1 commit
  11. 03 Mar, 2023 1 commit
  12. 01 Mar, 2023 6 commits
  13. 28 Feb, 2023 2 commits