- 07 Jun, 2024 1 commit
-
-
Matt authored
* Remove ConversationalPipeline and Conversation object, as they have been deprecated for some time and are due for removal * Update not-doctested.txt * Fix JA and ZH docs * Fix JA and ZH docs some more * Fix JA and ZH docs some more
-
- 22 May, 2024 1 commit
-
-
Arthur authored
* update ruff version * fix research projects * Empty * Fix errors --------- Co-authored-by:Lysandre <lysandre@huggingface.co>
-
- 19 Apr, 2024 1 commit
-
-
Jo茫o David authored
* Duplicate swiftformer * Convert SwiftFormerPatchEmbedding * Convert SwiftFormerEmbeddings * Convert TFSwiftFormerMlp * Convert TFSwiftFormerConvEncoder * Convert TFSwiftFormerLocalRepresentation * convert TFSwiftFormerEncoderBlock * Convert SwiftFormerStage * Convert SwiftFormerEncoder * Add TFSWiftFormerPreTrainedModel * Convert SwiftFormerForImageClassification * Add kwargs and start drop path * Fix syntax * Change Model class name * Add TFSwiftFormer to __init__ * Duplicate test_modeling_swiftformer * First test conversions * Change require_torch to require_tf * Add exports to swiftformer __init__ * Add TFSwiftFormerModel wrapper * Fix __init__ and run black * Remove docstring from MainLayer, fix padding * Use keras.layers.Activation on keras.Sequential * Fix swiftformer exports * Fix activation layer from config * Remove post_inits * Use tf.keras.layers.ZeroPadding2D * Convert torch normalize * Change tf test input shape * Fix softmax and reduce_sum * Convert expand_dims and repeat * Add missing reshape and tranpose * Simplify TFSwiftFormerEncoderBlock.call * Fix mismatch in patch embeddings * Fix expected output shape to match channels last * Fix swiftformer typo * Disable test_onnx * Fix TFSwiftFormerForImageClassification call * Add unpack inputs * Convert flatten(2).mean(-1) * Change vision dummy inputs (to be reviewed) * Change test_forward_signature to use .call * Fix @unpack_inputs * Set return_tensors="tf" and rename class * Rename wrongly named patch_embeddings layer * Add serving_output and change dummy_input shape * Make dimensions BCHW and transpose inside embedding layer * Change SwiftFormerEncoderBlock * Fix ruff problems * Add image size to swiftformer config * Change tranpose to MainLayer and use -1 for reshape * Remove serving_outputs and dummy_inputs * Remove test_initialization test from tf model * Make Sequential component a separate layer * Fix layers' names * Tranpose encoder outputs * Fix tests and check if hidden states is not None * Fix TFSwiftFormerForImageClassification * Run make fixup * Run make fix-copies * Update modeling_tf_auto * Update docs * Fix modeling auto mapping * Update modelint_tf_swiftformer docs * Fill image_size doc and type * Add reduction=None to loss computation * Update docs * make style * Debug: Delete the tip to see if that changes anything * Re-add tip * Remove add_code_sample_docstrings * Remove unused import * Get the debug to actually tell us the problem it has with the docs * Try a substitution to match the PyTorch file? * Add swiftformer to ignore list * Add build() methods * Update copyright year Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Remove FIXME comment * Remove from_pt * Update copyright year Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Rename one-letter variables * Remove FIXMEs related to momentum * Remove old TODO comment * Remove outstanding FIXME comments * Get dropout rate from config * Add specific dropout config for MLP * Add convencoder dropout to config * Pass config to SwiftFormerDropPath layer * Fix drop_path variable name and add Adapted from comment * Run ruff * Removed copied from comment * Run fix copies * Change drop_path to identity to match pt * Cleanup build() methods and move to new keras imports * Update docs/source/en/model_doc/swiftformer.md Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com> * Raise error if drop_path_rate > 0.0 * Apply suggestions from code review Replace (self.dim), with self.dim, Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com> * Remove drop_path function * Add training to TFSwiftFormerEncoder * Set self.built = True last Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Should have been added to previous commit Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Change default_feature_extractor to default_image_processor Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Import Keras from modeling_tf_utils * Remove relative import * Run ruff --fix * Move import keras to tf_available * Add copied from comment to test_forward_signature * Reduce batch size and num_labels * Extract loss logic to hf_compute_loss * Run ruff format --------- Co-authored-by:
Matt <rocketknight1@gmail.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> Co-authored-by:
Matt <Rocketknight1@users.noreply.github.com>
-
- 18 Mar, 2024 1 commit
-
-
Yoach Lacombe authored
* first modeling code * make repository * still WIP * update model * add tests * add latest change * clean docstrings and copied from * update docstrings md and readme * correct chroma function * correct copied from and remove unreleated test * add doc to toctree * correct imports * add convert script to notdoctested * Add suggestion from Sanchit Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * correct get_uncoditional_inputs docstrings * modify README according to SANCHIT feedback * add chroma to audio utils * clean librosa and torchaudio hard dependencies * fix FE * refactor audio decoder -> audio encoder for consistency with previous musicgen * refactor conditional -> encoder * modify sampling rate logics * modify license at the beginning * refactor all_self_attns->all_attentions * remove ignore copy from causallm generate * add copied from for from_sub_models * fix make copies * add warning if audio is truncated * add copied from where relevant * remove artefact * fix convert script * fix torchaudio and FE * modify chroma method according to feedback-> better naming * refactor input_values->input_features * refactor input_values->input_features and fix import fe * add input_features to docstrigs * correct inputs_embeds logics * remove dtype conversion * refactor _prepare_conditional_hidden_states_kwargs_for_generation ->_prepare_encoder_hidden_states_kwargs_for_generation * change warning for chroma length * Update src/transformers/models/musicgen_melody/convert_musicgen_melody_transformers.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * change way to save wav, using soundfile * correct docs and change to soundfile * fix import * fix init proj layers * remove line breaks from md * fix issue with docstrings * add FE suggestions * improve is in logics and remove useless imports * remove custom from_pretrained * simplify docstring code * add suggestions for modeling tests * make style * update converting script with sanity check * remove encoder attention mask from conditional generation * replace musicgen melody checkpoints with official orga * rename ylacombe->facebook in checkpoints * fix copies * remove unecessary warning * add shape in code docstrings * add files to slow doc tests * fix md bug and add md to not_tested * make fix-copies * fix hidden states test and batching --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
- 21 Feb, 2024 1 commit
-
-
Arthur authored
* inital commit * update * update conversion checkpoint * update conversion script * nits * some fixes * nits * merge * fix permute * nits * fix * nits * nits * nits * fix rope * fix both rope * nites * style * make sure flax works * fix flax init code * fix foward * nits * print flax generation out * current code * nits * SIIIIIIIIIIIIIIIIIII * update * add new tokenizer * correct fast tokenizer * fix conversion * more comments * fix modeling and conversion * nits and nits * nits testing * add some tokenization tests * add some edge cases * add slow tests and fix them * fixup * fix copies for modeling * fix copies * add 7B slow tests * fix * fix * fix tests * make tokenizer cis go green * styling * last tokenizer nits * update jax tests * fix flax for 7b * add jit testing
馃 * cleanups * isolated nit, inv_freq for rotary_emb.inv_freq * propagate to jax * Apply suggestions from code review Co-authored-by:Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * adjust test * fix conversion script * change name * correct file names * update conversion script * Fix bos and eos token ids in the model configuration (#3) * update modelling * update conversion script * add static cache for gemma * fix sdpa generate * fix batched * multiple fixes * fix FA2 * final fix * Rename a few missing strings and filenames (#4) * merge with upstream main * fix copies * fix copies * fix fixup * fix fixup * fix * fix * final tests * fix fx gemma tests * fix fx bf16/fp16 tests * update slow fx tests * fx slow tests: one logits, one generation * move jit test standalone * Apply suggestions from code review * nits * tokenizer updates * more tokenization updates: custom GemmaSentencepieceExtrator * style * Update src/transformers/cache_utils.py * Update src/transformers/models/gemma/__init__.py * Update tests/models/gemma/test_modeling_flax_gemma.py * small nits * style * update tokenization test * fix the rotary embedding * with style * fix slow tests * WARNING this commit might be very important for precisions * Update tests/models/gemma/test_modeling_flax_gemma.py * Update src/transformers/models/gemma/configuration_gemma.py Co-authored-by:
Lysandre Debut <hi@lysand.re> * Update src/transformers/models/gemma/modeling_flax_gemma.py Co-authored-by:
Lysandre Debut <hi@lysand.re> * small nits here and there! * forgotten nit * remove on the fly computation of inv_freq * revert previous change, let's be safe and for now re-compute freq cis to make sure it's in float * Apply suggestions from code review Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update src/transformers/models/gemma/convert_gemma_weights_to_hf.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_flax_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_tokenization_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * Update tests/models/gemma/test_modeling_gemma.py Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> * nit conversion script link * fix some tests * add not doctest and pr doctest * repo consistency * fix last CIs
馃殌 * update all readmes --------- Co-authored-by:younesbelkada <younesbelkada@gmail.com> Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
Pedro Cuenca <pedro@huggingface.co> Co-authored-by:
Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by:
sanchit-gandhi <sanchit@huggingface.co> Co-authored-by:
Lysandre Debut <hi@lysand.re>
-
- 05 Feb, 2024 1 commit
-
-
amyeroberts authored
* Draft pipeline * Fixup * Fix docstrings * Update doctest * Update pipeline_model_mapping * Update docstring * Update tests * Update src/transformers/pipelines/image_feature_extraction.py Co-authored-by:
Omar Sanseviero <osanseviero@gmail.com> * Fix docstrings - review comments * Remove pipeline mapping for composite vision models * Add to pipeline tests * Remove for flava (multimodal) * safe pil import * Add requirements for pipeline run * Account for super slow efficientnet * Review comments * Fix tests * Swap order of kwargs * Use build_pipeline_init_args * Add back FE pipeline for Vilt * Include image_processor_kwargs in docstring * Mark test as flaky * Update TODO * Update tests/pipelines/test_pipelines_image_feature_extraction.py Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add license header --------- Co-authored-by:
Omar Sanseviero <osanseviero@gmail.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 31 Jan, 2024 1 commit
-
-
Kian Sierra McGettigan authored
* direct copy from llama work * mistral modules forward pass working * flax mistral forward pass with sliding window * added tests * added layer collection approach * Revert "added layer collection approach" This reverts commit 0e2905bf2236ec323163fc1a9f0c016b21aa8b8f. * Revert "Revert "added layer collection approach"" This reverts commit fb17b6187ac5d16da7c461e1130514dc3d137a43. * fixed attention outputs * added mistral to init and auto * fixed import name * fixed layernorm weight dtype * freeze initialized weights * make sure conversion consideres bfloat16 * added backend * added docstrings * added cache * fixed sliding window causal mask * passes cache tests * passed all tests * applied make style * removed commented out code * applied fix-copies ignored other model changes * applied make fix-copies * removed unused functions * passed generation integration test * slow tests pass * fixed slow tests * changed default dtype from jax.numpy.float32 to float32 for docstring check * skip cache test for FlaxMistralForSequenceClassification since if pad_token_id in input_ids it doesn't score previous input_ids * updated checkpoint since from_pt not included * applied black style * removed unused args * Applied styling and fixup * changed checkpoint for doc back * fixed rf after adding it to hf hub * Add dummy ckpt * applied styling * added tokenizer to new ckpt * fixed slice format * fix init and slice * changed ref for placeholder TODO * added copies from Llama * applied styling * applied fix-copies * fixed docs * update weight dtype reconversion for sharded weights * removed Nullable input ids * Removed unnecessary output attentions in Module * added embedding weight initialziation * removed unused past_key_values * fixed deterministic * Fixed RMS Norm and added copied from * removed input_embeds * applied make style * removed nullable input ids from sequence classification model * added copied from GPTJ * added copied from Llama on FlaxMistralDecoderLayer * added copied from to FlaxMistralPreTrainedModel methods * fix test deprecation warning * freeze gpt neox random_params and fix copies * applied make style * fixed doc issue * skipped docstring test to allign # copied from * applied make style * removed FlaxMistralForSequenceClassification * removed unused padding_idx * removed more sequence classification * removed sequence classification * applied styling and consistency * added copied from in tests * removed sequence classification test logic * applied styling * applied make style * removed freeze and fixed copies * undo test change * changed repeat_kv to tile * fixed to key value groups * updated copyright year * split casual_mask * empty to rerun failed pt_flax_equivalence test FlaxWav2Vec2ModelTest * went back to 2023 for tests_pr_documentation_tests * went back to 2024 * changed tile to repeat * applied make style * empty for retry on Wav2Vec2
-
- 18 Jan, 2024 1 commit
-
-
Yoach Lacombe authored
* first commit * correct default value non causal * update config and modeling code * update converting checkpoint * clean modeling and fix tests * make style * add new config parameters to docstring * fix copied from statements * Apply suggestions from code review Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * make position_embeddings_type docstrings clearer * clean converting script * remove function not used * clean modeling file * apply suggestion for test file + add convert script to not_doctested * modify tests according to review - cleaner logic and more tests * Apply nit suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add checker of valid position embeddings type * instantiate new layer norm layer with the right eps * fix freeze_feature_encoder since it can be None in some cases * add test same output in convert script * restore wav2vec2conformer and add new model * create processor and FE + clean * add new model code * fix convert script and set default config parameters * correct model id paths * make style * make fix-copies and cleaning files * fix copied from statements * complete .md and fixe copies * clean convert script argument defaults * fix config parameters docstrings * fix config docstring * add copied from and enrich FE tests * fix copied from and repo-consistency * add autotokenizer * make test input length shorter and change docstring code * fix docstrings and copied from * add add_adapter to ASR training example * make testing of adapters more robust * adapt to multi adapter layers * refactor input_values->input_features and remove w2v2-bert feature extractor * remove pretraining model * remove depreciated features and useless lines * add copied from and ignore statements to modeling tests * remove pretraining model #2 * change import in convert script * change default in convert script * update readme and remove useless line * Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * refactor BERT to Bert for consistency * remove useless ignore copy statement * add persistent to buffer in rotary * add eps in LayerNorm init and remove copied from * add adapter activation parameters and add copied from statements * Fix copied statements and add unitest.skip reasons * add copied statement in test_processor * refactor processor * make style * replace numpy random by torch rand * remove expected output CTC * improve converting script with processor class * Apply suggestions from code review Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove gumbel class * remove tests related to previously deleted class * Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com> * correct typos * remove uused parameters * update processor to takes both text and audio * update checkpoints * update expected output and add ctc expected output * add label_attention_mask * replace pt with np in processor tests * fix typo * revert to behaviour with labels_attention_mask --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 12 Jan, 2024 1 commit
-
-
Matt authored
* Fix TF Regnet docstring * Fix TF Regnet docstring * Make a change to the PyTorch Regnet too to make sure the CI is checking it * Add skips for TFRegnet * Update error message for docstring checker
-
- 11 Jan, 2024 1 commit
-
-
Matt authored
-
- 10 Jan, 2024 1 commit
-
-
Sparty authored
* Remove ErnieConfig, ErnieMConfig check_docstrings * Run fix_and_overwrite for ErnieConfig, ErnieMConfig * Replace <fill_type> and <fill_docstring> in configuration_ernie, configuration_ernie_m.py with type and docstring values --------- Co-authored-by:vignesh-raghunathan <vignesh_raghunathan@intuit.com>
-
- 07 Dec, 2023 1 commit
-
-
Alex McKinney authored
* Copies `modeling_flax_gpt_neo.py` to start * MLP Block. WIP Attention and Block * Adds Flax implementation of `LlamaMLP` Validated with in-file test. Some slight numeric differences, but assuming it isn't an issue * Adds `FlaxLlamaRMSNorm` layer `flax.linen` includes `RMSNorm` layer but not necessarily in all versions. Hence, we add in-file. * Adds FlaxLlamaAttention Copied from GPT-J as it has efficient caching implementation as well as rotary embeddings. Notice numerically different, but not by a huge amount. Needs investigating * Adds `FlaxLlamaDecoderLayer` numerically inaccurate, debugging.. * debugging rotary mismatch gptj uses interleaved whilst llama uses contiguous i think they match now but still final result is wrong. maybe drop back to just debugging attention layer? * fixes bug with decoder layer still somewhat numerically inaccurate, but close enough for now * adds markers for what to implement next the structure here diverges a lot from the PT version. not a big fan of it, but just get something working for now * implements `FlaxLlamaBlockCollection`] tolerance must be higher than expected, kinda disconcerting * Adds `FlaxLlamaModule` equivalent PyTorch model is `LlamaModel` yay! a language model
馃 * adds `FlaxLlamaForCausalLMModule` equivalent to `LlamaForCausalLM` still missing returning dict or tuple, will add later * start porting pretrained wrappers realised it probably needs return dict as a prereq * cleanup, quality, style * readds `return_dict` and model output named tuples * (tentatively) pretrained wrappers work馃敟 * fixes numerical mismatch in `FlaxLlamaRMSNorm` seems `jax.lax.rsqrt` does not match `torch.sqrt`. manually computing `1 / jax.numpy.sqrt` results in matching values. * [WIP] debugging numerics * numerical match I think issue was accidental change of backend. forcing CPU fixes test. We expect some mismatch on GPU. * adds in model and integration tests for Flax Llama summary of failing: - mul invalid combination of dimensions - one numerical mismatch - bf16 conversion (maybe my local backend issue) - params are not FrozenDict * adds missing TYPE_CHECKING import and `make fixup` * adds back missing docstrings needs review on quality of docstrings, not sure what is required. Furthermore, need to check if `CHECKPOINT_FOR_DOC` is valid. See TODO * commenting out equivalence test as can just use common * debugging * Fixes bug where mask and pos_ids were swapped in pretrained models This results in all tests passing now馃敟 * cleanup of modeling file * cleanup of test file * Resolving simpler review comments * addresses more minor review comments * fixing introduced pytest errors from review * wip additional slow tests * wip tests need to grab a GPU machine to get real logits for comparison otherwise, slow tests should be okay * `make quality`, `make style` * adds slow integration tests - checking logits - checking hidden states - checking generation outputs * `make fix-copies` * fix mangled function following `make fix-copies` * adds missing type checking imports * fixes missing parameter checkpoint warning * more finegrained 'Copied from' tags avoids issue of overwriting `LLAMA_INPUTS_DOCSTRING` * swaps import guards ??? how did these get swapped initially? * removing `inv_freq` again as pytorch version has now removed * attempting to get CI to pass * adds doc entries for llama flax models * fixes typo in __init__.py imports * adds back special equivalence tests these come from the gpt neo flax tests. there is special behaviour for these models that needs to override the common version * overrides tests with dummy to see if CI passes need to fill in these tests later * adds my contribution to docs * `make style; make quality` * replaces random masking with fixed to work with flax version * `make quality; make style` * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * updates `x`->`tensor` in `rotate_half` * addresses smaller review comments * Update docs/source/en/model_doc/llama.md Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * adds integration test class * adds `dtype` to rotary embedding to cast outputs * adds type to flax llama rotary layer * `make style` * `make fix-copies` * Apply suggestions from code review Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * applies suggestions from review * Update modeling_flax_llama.py * `make fix-copies` * Update tests/models/llama/test_modeling_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * Update src/transformers/models/llama/modeling_flax_llama.py Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * fixes shape mismatch in FlaxLlamaMLP * applies some suggestions from reviews * casts attn output logits to f32 regardless of dtype * adds attn bias using `LlamaConfig.attention_bias` * adds Copied From comments to Flax Llama test * mistral and persimmon test change -copy from llama * updates docs index * removes Copied from in tests it was preventing `make fix-copies` from succeeding * quality and style * ignores FlaxLlama input docstring * adds revision to `_CHECKPOINT_FOR_DOC` * repo consistency and quality * removes unused import * removes copied from from Phi test now diverges from llama tests following FlaxLlama changes * adds `_REAL_CHECKPOINT_FOR_DOC` * removes refs from pr tests * reformat to make ruff happy --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
- 30 Nov, 2023 1 commit
-
-
Yoach Lacombe authored
* add working convertion script * first non-working version of modeling code * update modeling code (working) * make style * make fix-copies * add config docstrings * add config to ignore docstrings formatage due to unconventional markdown * fix copies * fix generation num_return_sequences * enrich docs * add and fix tests beside integration tests * update integration tests * update repo id * add tie weights and make style * correct naming in .md * fix imports and so on * correct docstrings * fix fp16 speech forward * fix speechencoder attention * make style * fix copied from * rename SeamlessM4Tv2-v2 to SeamlessM4Tv2 * Apply suggestions on configuration Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * remove useless public models * fix private models + better naming for T2U models * clean speech encoder relative position embeddings * refactor chunk attention * add docstrings to chunk attention method * improve naming and docstrings * rename some attention variables + add temperature sampling in T2U model * rename DOCSTRINGS variable names * make style + remove 2 useless config parameters * enrich model card * remove any attention_head reference + fix temperature in T2U * new fmt and make style * Apply suggestions from code review Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * rename spkr_id->speaker_id and change docstrings of get_char_input_ids * simplify v2attention * make style * Update seamless_m4t_v2.md * update code and tests with last update * update repo ids * fill article name, abstract andauthors * update not_doctested and slow_doc tests --------- Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 24 Nov, 2023 1 commit
-
-
Yih-Dar authored
* fix * fix * trigger * Apply suggestions from code review Co-authored-by:
Lysandre Debut <hi@lysand.re> * tic * revert * revert --------- Co-authored-by:
ydshieh <ydshieh@users.noreply.github.com> Co-authored-by:
Lysandre Debut <hi@lysand.re>
-
- 01 Nov, 2023 1 commit
-
-
Andi Powers Holmes authored
* Add type annotations to TFConvNextDropPath * Use tf.debugging.assert_equal for TFConvNextEmbeddings shape check * Add TensorFlow implementation of ConvNeXTV2 * check_docstrings: add TFConvNextV2Model to exclusions TFConvNextV2Model and TFConvNextV2ForImageClassification have docstrings which are equivalent to their PyTorch cousins, but a parsing issue prevents them from passing the test. Adding exclusions for these two classes as discussed in #25558.
-
- 31 Oct, 2023 1 commit
-
-
Akshar Goyal authored
* [docstring] Fix docstring for AltCLIPVisionConfig, AltCLIPTextConfig + cleaned some docstring * Removed entries from check_docstring.py * Removed entries from check_docstring.py * Removed entry from check_docstring.py * [docstring] Fix docstring for AltCLIPTextConfig, AltCLIPVisionConfig and AltCLIPConfig
-
- 25 Oct, 2023 1 commit
-
-
Younes Belkada authored
* add `MaskGenerationPipeline` in docs * Update __init__.py * fix repo consistency and clarify docstring * add on check docstirngs * actually we do have a tf sam * oops
-
- 23 Oct, 2023 1 commit
-
-
Yoach Lacombe authored
* first raw commit * still POC * tentative convert script * almost working speech encoder conversion scripts * intermediate code for encoder/decoders * add modeling code * first version of speech encoder * make style * add new adapter layer architecture * add adapter block * add first tentative config * add working speech encoder conversion * base model convert works now * make style * remove unnecessary classes * remove unecessary functions * add modeling code speech encoder * rework logics * forward pass of sub components work * add modeling codes * some config modifs and modeling code modifs * save WIP * new edits * same output speech encoder * correct attention mask * correct attention mask * fix generation * new generation logics * erase comments * make style * fix typo * add some descriptions * new state * clean imports * add tests * make style * make beam search and num_return_sequences>1 works * correct edge case issue * correct SeamlessM4TConformerSamePadLayer copied from * replace ACT2FN relu by nn.relu * remove unecessary return variable * move back a class * change name conformer_attention_mask ->conv_attention_mask * better nit code * add some Copied from statements * small nits * small nit in dict.get * rename t2u model -> conditionalgeneration * ongoing refactoring of structure * update models architecture * remove SeamlessM4TMultiModal classes * add tests * adapt tests * some non-working code for vocoder * add seamlessM4T vocoder * remove buggy line * fix some hifigan related bugs * remove hifigan specifc config * change * add WIP tokenization * add seamlessM4T working tokenzier * update tokenization * add tentative feature extractor * Update converting script * update working FE * refactor input_values -> input_features * update FE * changes in generation, tokenizer and modeling * make style and add t2u_decoder_input_ids * add intermediate outputs for ToSpeech models * add vocoder to speech models * update valueerror * update FE with languages * add vocoder convert * update config docstrings and names * update generation code and configuration * remove todos and update config.pad_token_id to generation_config.pad_token_id * move block vocoder * remove unecessary code and uniformize tospeech code * add feature extractor import * make style and fix some copies from * correct consistency + make fix-copies * add processor code * remove comments * add fast tokenizer support * correct pad_token_id in M4TModel * correct config * update tests and codes + make style * make some suggested correstion - correct comments and change naming * rename some attributes * rename some attributes * remove unecessary sequential * remove option to use dur predictor * nit * refactor hifigan * replace normalize_mean and normalize_var with do_normalize + save lang ids to generation config * add tests * change tgt_lang logic * update generation ToSpeech * add support import SeamlessM4TProcessor * fix generate * make tests * update integration tests, add option to only return text and update tokenizer fast * fix wrong function call * update import and convert script * update integration tests + update repo id * correct paths and add first test * update how new attention masks are computed * update tests * take first care of batching in vocoder code * add batching with the vocoder * add waveform lengths to model outputs * make style * add generate kwargs + forward kwargs of M4TModel * add docstrings forward methods * reformate docstrings * add docstrings t2u model * add another round of modeling docstrings + reformate speaker_id -> spkr_id * make style * fix check_repo * make style * add seamlessm4t to toctree * correct check_config_attributes * write config docstrings + some modifs * make style * add docstrings tokenizer * add docstrings to processor, fe and tokenizers * make style * write first version of model docs * fix FE + correct FE test * fix tokenizer + add correct integration tests * fix most tokenization tests * make style * correct most processor test * add generation tests and fix num_return_sequences > 1 * correct integration tests -still one left * make style * correct position embedding * change numbeams to 1 * refactor some modeling code and correct one test * make style * correct typo * refactor intermediate fnn * refactor feedforward conformer * make style * remove comments * make style * fix tokenizer tests * make style * correct processor tests * make style * correct S2TT integration * Apply suggestions from Sanchit code review Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * correct typo * replace torch.nn->nn + make style * change Output naming (waveforms -> waveform) and ordering * nit renaming and formating * remove return None when not necessary * refactor SeamlessM4TConformerFeedForward * nit typo * remove almost copied from comments * add a copied from comment and remove an unecessary dropout * remove inputs_embeds from speechencoder * remove backward compatibiliy function * reformate class docstrings for a few components * remove unecessary methods * split over 2 lines smthg hard to read * make style * replace two steps offset by one step as suggested * nice typo * move warnings * remove useless lines from processor * make generation non-standard test more robusts * remove torch.inference_mode from tests * split integration tests * enrich md * rename control_symbol_vocoder_offset->vocoder_offset * clean convert file * remove tgt_lang and src_lang from FE * change generate docstring of ToText models * update generate docstring of tospeech models * unify how to deal withtext_decoder_input_ids * add default spkr_id * unify tgt_lang for t2u_model * simplify tgt_lang verification * remove a todo * change config docstring * make style * simplify t2u_tgt_lang_id * make style * enrich/correct comments * enrich .md * correct typo in docstrings * add torchaudio dependency * update tokenizer * make style and fix copies * modify SeamlessM4TConverter with new tokenizer behaviour * make style * correct small typo docs * fix import * update docs and add requirement to tests * add convert_fairseq2_to_hf in utils/not_doctested.txt * update FE * fix imports and make style * remove torchaudio in FE test * add seamless_m4t.md to utils/not_doctested.txt * nits and change the way docstring dataset is loaded * move checkpoints from ylacombe/ to facebook/ orga * refactor warning/error to be in the 119 line width limit * round overly precised floats * add stereo audio behaviour * refactor .md and make style * enrich docs with more precised architecture description * readd undocumented models * make fix-copies * apply some suggestions * Apply suggestions from code review Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com> * correct bug from previous commit * refactor a parameter allowing to clean the code + some small nits * clean tokenizer * make style and fix * make style * clean tokenizers arguments * add precisions for some tests * move docs from not_tested to slow * modify tokenizer according to last comments * add copied from statements in tests * correct convert script * correct parameter docstring style * correct tokenization * correct multi gpus * make style * clean modeling code * make style * add copied from statements * add copied statements * add support with ASR pipeline * remove file added inadvertently * fix docstrings seamlessM4TModel * add seamlessM4TConfig to OBJECTS_TO_IGNORE due of unconventional markdown * add seamlessm4t to assisted generation ignored models --------- Co-authored-by:
Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> Co-authored-by:
Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 20 Oct, 2023 1 commit
-
-
Adam Ross authored
* Fix docstring for speech-to-text config * Refactor doc line len <= 119 char * Remove Speech2TextConfig from OBJECTS_TO_IGNORE * Fix Speech2TextConfig doc str * Fix Speech2TextConfig doc using doc-builder * Refactor Speech2TextConfig doc
-
- 19 Oct, 2023 2 commits
-
-
Daniil authored
* remove docstrings CodeGen from objects_to_ignore * autofix codegen docstrings * fill in the missing types and docstrings * fixup * change descriptions to be in a separate line * apply docstring suggestions from code review Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> * update n_ctx description in CodeGenConfig --------- Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com>
-
Sparty authored
* Remove ChineseCLIPImageProcessor, ChineseCLIPTextConfig, ChineseCLIPVisionConfig from check_docstrings * Run fix_and_overwrite for ChineseCLIPImageProcessor, ChineseCLIPTextConfig, ChineseCLIPVisionConfig * Replace <fill_type> and <fill_docstring> in configuration_chinese_clip.py, image_processing_chinese_clip.py with type and docstring values --------- Co-authored-by:vignesh-raghunathan <vignesh_raghunathan@intuit.com>
-
- 17 Oct, 2023 1 commit
-
-
louietouie authored
* Deleted LukeConfig and ran check_docstrings.py * Filled docstring information --------- Co-authored-by:louie <louisparizeau@Chicken.local>
-
- 16 Oct, 2023 3 commits
-
-
przemL authored
* Remove BertGenerationTokenizer from objects to ignore The file BertGenerationTokenizer is removed from objects to ignore as a first step to fix docstring. * Docstrings fix for BertGenerationTokenizer Docstring fix is generated for BertGenerationTokenizer by using check_docstrings.py. * Fix docstring for BertGenerationTokenizer Added sep_token type and docstring in BertGenerationTokenizer.
-
Bojun-Feng authored
* remove from OBJECTS_TO_IGNORE * run check_docstrings.py * fill in information * ignore CodeLlamaTokenizer
-
Sparty authored
* Remove CanineConfig from check_docstrings * Run fix_and_overwrite for CanineConfig * Replace <fill_type> and <fill_docstring> in configuration_canine.py with type and docstring values --------- Co-authored-by:vignesh-raghunathan <vignesh_raghunathan@intuit.com>
-
- 13 Oct, 2023 3 commits
-
-
NielsRogge authored
* First draft * Update conversion script * Update copied from statements * Fix style * Add copied from to config * Add copied from to processor * Run make fixup * Add docstring * Update docstrings * Add method * Improve docstrings * Fix docstrings * Improve docstrings * Remove onnx * Add flag * Address comments * Add copied from to model tests * Add flag to conversion script * Add code snippet * Address more comments * Address comment * Improve conversion script * More improvements * Add expected objectness logits * Skip test * Improve conversion script * Extend conversion script * Convert large checkpoint * Fix doc tests * Convert all checkpoints, update integration tests * Add checkpoint_path arg * Fix repo_id
-
dekomori_sanae09 authored
* fix docstring dpr config * fix style * Update descp Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> --------- Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com>
-
Bojun-Feng authored
* update check_docstrings * update docstring
-
- 12 Oct, 2023 3 commits
-
-
Adwait authored
* [docstring] Remove 'BertGenerationConfig' from OBJECTS_TO_IGNORE * [docstring] Fix docstring for 'BertGenerationConfig' (#26638)
-
Joseph McDonnell authored
* [DOCS] Update docstrings for and tokenizer * [DOCS] add pad_token argument to whisper tokenizer docstring * [FIX] Reword pad_token description * [CHORE] Apply style formatting --------- Co-authored-by:jmcdonnell <jmcdonnell@fieldbox.ai>
-
Gizem authored
* Remove UniSpeechConfig * Remove , at the end otherwise check_docstring changes order * Auto add new docstring * Update docstring for UniSpeechConfig * Remove from check_docstrings * Remove UniSpeechSatConfig and UniSpeechSatForCTC from check_docstrings * Remove , at the end * Fix docstring * Update docstring for Wav2Vec2ForCTC * Update Wav2Vec2ForCTC docstring Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> * fix style --------- Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com>
-
- 11 Oct, 2023 3 commits
-
-
Bojun-Feng authored
* update check_docstrings * update docstring
-
Minho Ryang authored
* [docstring] Fix docstring for `LlamaTokenizer` and `LlamaTokenizerFast` * [docstring] Fix docstring typo at `LlamaTokenizer` and `LlamaTokenizerFast`
-
Shivanand authored
* remove from utils * updated doc string * only in the model * Update src/transformers/models/swin/modeling_swin.py Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> * Update src/transformers/models/swin/modeling_swin.py Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com> --------- Co-authored-by:
Yih-Dar <2521628+ydshieh@users.noreply.github.com>
-
- 10 Oct, 2023 1 commit
-
-
Pavarissy authored
* Your commit message here * fix LlamaConfig docstring * run make fixup * fix formatting after review reformat of the file to prevent script issues * rerun make fixup after reformat
-
- 09 Oct, 2023 4 commits
-
-
Isaac Chung authored
fix docstrings for vanilla clip
-
Alex Bzdel authored
* removed donutimageprocessor from objects_to_ignore * added docstring for donutimageprocessor * readding donut file * moved docstring to correct location
-
Isaac Chung authored
fix docstring for CLIPImageProcessor
-
Isaac Chung authored
* fix docstrings for CLIP configs * black formatted
-
- 06 Oct, 2023 1 commit
-
-
Yih-Dar authored
example fix docstring Co-authored-by:ydshieh <ydshieh@users.noreply.github.com>
-