Commits · 404ff8fc17599788a546818373be113b1fc8456a · chenpangpang / transformers

05 Sep, 2023 1 commit
- Fix typo (#25966) · 404ff8fc
  Susnato Dhar authored Sep 05, 2023
```
* Update feature_extraction_clap.py

* changed all lenght to length
```
  404ff8fc
03 Aug, 2022 1 commit

Fix torch version comparisons (#18460) · 02b176c4

LSinev authored Aug 03, 2022

Comparisons like
version.parse(torch.__version__) > version.parse("1.6")
are True for torch==1.6.0+cu101 or torch==1.6.0+cpu

version.parse(version.parse(torch.__version__).base_version) are preferred (and available in pytorch_utils.py

02b176c4

29 Jul, 2022 1 commit

Replace `as_target` context managers by direct calls (#18325) · 986526a0

Sylvain Gugger authored Jul 29, 2022



* Preliminary work on tokenizers

* Quality + fix tests

* Treat processors

* Fix pad

* Remove all uses of  in tests, docs and examples

* Replace all as_target_tokenizer

* Fix tests

* Fix quality

* Update examples/flax/image-captioning/run_image_captioning_flax.py
Co-authored-by: amyeroberts <amy@huggingface.co>

* Style
Co-authored-by: amyeroberts <amy@huggingface.co>

986526a0

12 May, 2022 1 commit

Black preview (#17217) · afe5d42d

Sylvain Gugger authored May 12, 2022

* Black preview

* Fixup too!

* Fix check copies

* Use the same version as the CI

* Bump black

afe5d42d

30 Mar, 2022 1 commit
- [examples] max samples can't be bigger than the len of dataset (#16501) · a73281e3
  Stas Bekman authored Mar 30, 2022
```
* [examples] max samples can't be bigger than then len of dataset

* do tf and flax
```
  a73281e3
11 Nov, 2021 1 commit
- fix --gradient_checkpointing (#13964) · 77262ef7
  Stas Bekman authored Nov 11, 2021
  
  77262ef7
25 Jun, 2021 1 commit
- remove extra white space from log format (#12360) · 4a872cae
  Stas Bekman authored Jun 25, 2021
  
  4a872cae
14 Apr, 2021 1 commit
- Save the Wav2Vec2 processor before training starts (#10910) · 653076ca
  Nithin Holla authored Apr 14, 2021
```
Co-authored-by: nithin19 <nithin@amberscript.com>
```
  653076ca
18 Mar, 2021 2 commits

add run_common_voice script (#10767) · 5f19c07a

Suraj Patil authored Mar 18, 2021

* add initial script

* finish script

* add shell script example

* accept chars_to_ignor as cl arg

* align the script with other example scripts

* add torchaudio dep

5f19c07a

wav2vec2: support datasets other than LibriSpeech (#10581) · af8afdc8

Mohamed El-Geish authored Mar 18, 2021

* wav2vec2: support datasets other than LibriSpeech

* Formatting run_asr.py to pass code quality test

* bundled orthography options and added verbose logs

* fixing a typo in timit fine-tuning script

* update comment for clarity

* resize_lm_head and load custom vocab from file

* adding a max_duration_in_seconds filter

* do not assign `duration_filter` lambda, use a def

* log untransliterated text as well

* fix base model for arabic

* fix duration filter when target_sr is not set

* drop duration_in_seconds when unneeded

* script for wav2vec2-large-lv60-timit-asr

* fix for "tha" in arabic corpus (huggingface#10581)

* adding more options to work with common_voice

* PR feedback (huggingface#10581)

* small README change

af8afdc8

05 Mar, 2021 1 commit
- fix run seq2seq (#10547) · 395ffcd7
  Patrick von Platen authored Mar 05, 2021
  
  395ffcd7
01 Mar, 2021 1 commit

Add Fine-Tuning for Wav2Vec2 (#10145) · 0234de84

Patrick von Platen authored Mar 01, 2021



* add encode labels function to tokenizer

* start adding finetuning

* init dropout

* upload

* correct convert script

* apply changes

* fix second typo

* make first dummy training run

* adapt convert script

* push confg for comparison

* remove conf

* finish training

* adapt data collator

* add research folder

* update according to fairseq feedback

* some minor corrections

* refactor masking indices a bit

* some minor changes

* clean tokenizer

* finish clean-up

* remove previous logic

* update run script

* correct training

* finish changes

* finish model

* correct bug

* fix training a bit more

* add some tests

* finish gradient checkpointing

* finish example

* correct gradient checkpointing

* improve tokenization method

* revert changes in tokenizer

* revert general change

* adapt fine-tuning

* update

* save intermediate test

* Update README.md

* finish finetuning

* delete conversion script

* Update src/transformers/models/wav2vec2/configuration_wav2vec2.py

* Update src/transformers/models/wav2vec2/processing_wav2vec2.py
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* finish wav2vec2 script

* finish wav2vec2 fine-tuning

* finalize test

* correct test

* adapt tests

* finish

* remove test file
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

0234de84