- 02 Feb, 2024 1 commit
-
-
Klaus Hipp authored
* Fix typos and grammar mistakes in docs and examples
* Fix typos in docstrings and comments
* Fix spelling of `tokenizer` in model tests
* Remove erroneous spaces in decorators
* Remove extra spaces in Markdown link texts
-
- 01 Feb, 2024 1 commit
-
-
zspo authored
Co-authored-by: p_spozzhang <p_spozzhang@tencent.com>
-
- 19 Jan, 2024 1 commit
-
-
Amy Roberts authored
-
- 18 Jan, 2024 2 commits
-
-
Yoach Lacombe authored
* add w2v2bert compatibility
* Update examples/pytorch/speech-recognition/run_speech_recognition_ctc.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
Yoach Lacombe authored
* first commit
* correct default value non causal
* update config and modeling code
* update converting checkpoint
* clean modeling and fix tests
* make style
* add new config parameters to docstring
* fix copied from statements
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* make position_embeddings_type docstrings clearer
* clean converting script
* remove function not used
* clean modeling file
* apply suggestion for test file + add convert script to not_doctested
* modify tests according to review - cleaner logic and more tests
* Apply nit suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add checker of valid position embeddings type
* instantiate new layer norm layer with the right eps
* fix freeze_feature_encoder since it can be None in some cases
* add test same output in convert script
* restore wav2vec2conformer and add new model
* create processor and FE + clean
* add new model code
* fix convert script and set default config parameters
* correct model id paths
* make style
* make fix-copies and cleaning files
* fix copied from statements
* complete .md and fix copies
* clean convert script argument defaults
* fix config parameters docstrings
* fix config docstring
* add copied from and enrich FE tests
* fix copied from and repo-consistency
* add autotokenizer
* make test input length shorter and change docstring code
* fix docstrings and copied from
* add add_adapter to ASR training example
* make testing of adapters more robust
* adapt to multi adapter layers
* refactor input_values->input_features and remove w2v2-bert feature extractor
* remove pretraining model
* remove deprecated features and useless lines
* add copied from and ignore statements to modeling tests
* remove pretraining model #2
* change import in convert script
* change default in convert script
* update readme and remove useless line
* Update tests/models/wav2vec2_bert/test_processor_wav2vec2_bert.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* refactor BERT to Bert for consistency
* remove useless ignore copy statement
* add persistent to buffer in rotary
* add eps in LayerNorm init and remove copied from
* add adapter activation parameters and add copied from statements
* Fix copied statements and add unittest.skip reasons
* add copied statement in test_processor
* refactor processor
* make style
* replace numpy random by torch rand
* remove expected output CTC
* improve converting script with processor class
* Apply suggestions from code review
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* remove gumbel class
* remove tests related to previously deleted class
* Update src/transformers/models/wav2vec2_bert/configuration_wav2vec2_bert.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* correct typos
* remove unused parameters
* update processor to take both text and audio
* update checkpoints
* update expected output and add ctc expected output
* add label_attention_mask
* replace pt with np in processor tests
* fix typo
* revert to behaviour with labels_attention_mask
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
-
- 11 Jan, 2024 1 commit
-
-
Alex Hedges authored
While using `run_clm.py`,[^1] I noticed that some files were being added to my global cache, not the local cache. I set the `cache_dir` parameter for the one call to `evaluate.load()`, which partially solved the problem. I figured that while I was fixing the one script upstream, I might as well fix the problem in all other example scripts that I could. There are still some files being added to my global cache, but this appears to be a bug in `evaluate` itself. This commit at least moves some of the files into the local cache, which is better than before.
To create this PR, I made the following regex-based transformation: `evaluate\.load\((.*?)\)` -> `evaluate\.load\($1, cache_dir=model_args.cache_dir\)`. After using that, I manually fixed all modified files with `ruff` serving as useful guidance. During the process, I removed one existing usage of the `cache_dir` parameter in a script that did not have a corresponding `--cache-dir` argument declared.
[^1]: I specifically used `pytorch/language-modeling/run_clm.py` from v4.34.1 of the library. For the original code, see the following URL: https://github.com/huggingface/transformers/tree/acc394c4f5e1283c19783581790b3dc3105a3697/examples/pytorch/language-modeling/run_clm.py.
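The regex-based transformation described in this commit message can be sketched in Python. This is an illustration only, not the author's actual tooling: the message writes the capture group as `$1` (editor-style syntax), which corresponds to `\1` in Python's `re` module, and the helper name `add_cache_dir` is hypothetical.

```python
import re

# Pattern from the commit message: capture whatever sits between the
# parentheses of an evaluate.load(...) call (non-greedy).
PATTERN = re.compile(r"evaluate\.load\((.*?)\)")

# Replacement: re-emit the captured arguments and append cache_dir.
# "\1" is Python's spelling of the "$1" backreference used in the message.
REPLACEMENT = r"evaluate.load(\1, cache_dir=model_args.cache_dir)"


def add_cache_dir(source: str) -> str:
    """Hypothetical helper: rewrite every evaluate.load(...) call in `source`."""
    return PATTERN.sub(REPLACEMENT, source)


if __name__ == "__main__":
    print(add_cache_dir('metric = evaluate.load("accuracy")'))
    # prints: metric = evaluate.load("accuracy", cache_dir=model_args.cache_dir)
```

Because the non-greedy group stops at the first closing parenthesis, a call whose arguments contain nested parentheses is rewritten in the wrong place, so a manual review pass like the one the message mentions is still needed after a substitution of this kind.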
-
- 13 Dec, 2023 1 commit
-
-
Lysandre authored
-
- 27 Nov, 2023 1 commit
-
-
Peter Pan authored
* docs: replace torch.distributed.run by torchrun
`transformers` now officially supports pytorch >= 1.10. The entrypoint `torchrun` is present from 1.10 onwards.
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
* Update src/transformers/trainer.py with @ArthurZucker's suggestion
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
---------
Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
-
- 17 Nov, 2023 1 commit
-
-
V.Prasanna kumar authored
fixed the broken links belonging to the dataset library of transformers
-
- 02 Nov, 2023 1 commit
-
-
Lysandre authored
-
- 31 Oct, 2023 1 commit
-
-
Dong-geon Lee authored
-
- 27 Oct, 2023 1 commit
-
-
Lucain authored
-
- 12 Oct, 2023 1 commit
-
-
Tom Aarsen authored
Add missing spaces in adjacent strings
-
- 03 Oct, 2023 1 commit
-
-
Lysandre authored
-
- 05 Sep, 2023 1 commit
-
-
Susnato Dhar authored
* Update feature_extraction_clap.py
* changed all lenght to length
-
- 04 Sep, 2023 1 commit
-
-
Lysandre authored
-
- 21 Aug, 2023 1 commit
-
-
Sylvain Gugger authored
-
- 07 Aug, 2023 1 commit
-
-
Jackmin801 authored
* pytorch examples
* pytorch mim no trainer
* cookiecutter
* flax examples
* missed line in pytorch run_glue
* tensorflow examples
* tensorflow run_clip
* tensorflow run_mlm
* tensorflow run_ner
* tensorflow run_clm
* pytorch example from_configs
* pytorch no trainer examples
* Revert "tensorflow run_clip"
This reverts commit 261f86ac1f1c9e05dd3fd0291e1a1f8e573781d5.
* fix: duplicated argument
-
- 02 Aug, 2023 1 commit
-
-
Yih-Dar authored
* fix
* fix
* fix
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 28 Jul, 2023 1 commit
-
-
Yih-Dar authored
* pytorch examples
* tensorflow examples
* flax examples
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
-
- 20 Jul, 2023 1 commit
-
-
Zach Mueller authored
Change logic
-
- 17 Jul, 2023 1 commit
-
-
Sylvain Gugger authored
-
- 14 Jun, 2023 1 commit
-
-
Patrick von Platen authored
* Add mms ctc fine tuning
* make style
* More fixes that are needed
* make fix-copies
* make draft for README
* add new file
* move to new file
* make style
* make style
* add quick test
* make style
* make style
-
- 07 Jun, 2023 1 commit
-
-
Sylvain Gugger authored
-
- 10 May, 2023 1 commit
-
-
Maria Khalusova authored
trainer parameters changed to save tokenizer in addition to feature_extractor
-
- 09 May, 2023 1 commit
-
-
Sylvain Gugger authored
-
- 13 Apr, 2023 1 commit
-
-
Sylvain Gugger authored
-
- 05 Apr, 2023 1 commit
-
-
Mikel Penagarikano authored
* Update run_speech_recognition_ctc.py
Make sure all processes wait until data is saved before loading the processor from the output_dir
* Make sure all processes wait until data is saved before loading the processor from the output_dir
* Update run_speech_recognition_ctc.py
* Update run_speech_recognition_seq2seq.py
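The synchronization idea in this commit (every process waits until the data is saved before loading it from the output directory) can be sketched with the standard library. This is a minimal sketch under stated assumptions, not the code in the example scripts: threads stand in for distributed processes, `threading.Barrier` stands in for a distributed barrier, and the file name `processor_config.json` and the functions `run_worker` and `simulate` are hypothetical.

```python
import json
import tempfile
import threading
from pathlib import Path


def run_worker(rank, barrier, output_dir, results):
    """Hypothetical worker: rank 0 saves the data, every rank loads it."""
    config_path = output_dir / "processor_config.json"  # assumed file name
    if rank == 0:
        # Only the main worker writes the file.
        config_path.write_text(json.dumps({"sampling_rate": 16000}))
    # Every worker blocks here until all workers (including rank 0,
    # which has finished writing by now) have reached the barrier.
    barrier.wait()
    # Safe to read: the file is guaranteed to exist at this point.
    results[rank] = json.loads(config_path.read_text())


def simulate(world_size=4):
    """Run `world_size` workers and return what each one loaded."""
    barrier = threading.Barrier(world_size)
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        output_dir = Path(tmp)
        threads = [
            threading.Thread(target=run_worker, args=(rank, barrier, output_dir, results))
            for rank in range(world_size)
        ]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    return results


if __name__ == "__main__":
    loaded = simulate(4)
    print(sorted(loaded))
```

Without the barrier, a non-main worker could try to read the file before rank 0 has written it, which is the kind of race the commit message describes.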
-
- 14 Mar, 2023 1 commit
-
-
Sylvain Gugger authored
-
- 08 Mar, 2023 1 commit
-
-
bofeng huang authored
* Add specaugment to run_speech_recognition_seq2seq.py
* Remove useless argument: text_column
* Fix quality
* Update return_attention_mask condition
* Update specaugment arguments only for whisper models
* Remove SpecAugment arguments from ModelArguments, only leave default values for simplicity
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Update apply_spec_augment only for whisper models
* Apply suggestions from code review
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Rename return_attention_mask to forward_attention_mask to avoid confusion with wav2vec2 models
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
-
- 22 Feb, 2023 1 commit
-
-
Aaron Gokaslan authored
-
- 09 Feb, 2023 1 commit
-
-
lee1jun authored
Update run_speech_recognition_ctc.py
There should be a `# limitations under the License` line at the end of the documentation section.
-
- 06 Feb, 2023 1 commit
-
-
Sylvain Gugger authored
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
-
- 23 Jan, 2023 1 commit
-
-
Sylvain Gugger authored
-
- 07 Dec, 2022 1 commit
-
-
Emmanuel Schmidbauer authored
-
- 06 Dec, 2022 1 commit
-
-
Francisco Kurucz authored
-
- 01 Dec, 2022 1 commit
-
-
Sylvain Gugger authored
-
- 18 Nov, 2022 1 commit
-
-
Sanchit Gandhi authored
* [ASR Examples] Update README for seq2seq
* add language info
* add training results
* re-word
-
- 14 Nov, 2022 1 commit
-
-
Sanchit Gandhi authored
* merge conflicts
* bos and eos in datacollator
* (temp) hardcode removal of attention mask
* freeze encoder
* actually freeze encoder
* set max length / num beams according to gen kwargs
* (temp) fix tests
* don't pop attn mask
* override return attention mask config from Hub
* Hub configs updated
* final fixes
* update type annotations
* backward comp
-
- 04 Nov, 2022 1 commit
-
-
bhuang authored
-