1. 01 Sep, 2023 4 commits
    • Update-llama-code (#25826) · a4dd53d8
      Arthur authored
      
      
      * some bug fixes
      
      * updates
      
      * Update code_llama.md
      Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com>
      
      * Add co author
      Co-authored-by: pcuenca <pedro@latenitesoft.com>
      
      * add a test
      
      * fixup
      
      * nits
      
      * some updates
      
      * fix-copies
      
      * address comments
      
      * nits
      
      * nits
      
      * fix docstring
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * update
      
      * add int for https://huggingface.co/spaces/hf-accelerate/model-memory-usage
      
      
      
      ---------
      Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com>
      Co-authored-by: pcuenca <pedro@latenitesoft.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    • [MMS] Update docs with HF TTS implementation (#25907) · 1fa2d89a
      Sanchit Gandhi authored
      
      
      * [MMS] Update docs with HF TTS implementation
      
      * Update docs/source/en/model_doc/mms.md
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
      
      * add uromanise to docs
      
      ---------
      Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
    • Remove broken docs for MusicGen (#25905) · 69c5b8f1
      Omar Sanseviero authored
      Update musicgen.md
    • add VITS model (#24085) · 4ece3b94
      Matthijs Hollemans authored
      
      
      * add VITS model
      
      * let's vits
      
      * finish TextEncoder (mostly)
      
      * rename VITS to Vits
      
      * add StochasticDurationPredictor
      
      * add flow model
      
      * add generator
      
      * correctly set vocab size
      
      * add tokenizer
      
      * remove processor & feature extractor
      
      * add PosteriorEncoder
      
      * add missing weights to SDP
      
      * also convert LJSpeech and VCTK checkpoints
      
      * add training stuff in forward
      
      * add placeholder tests for tokenizer
      
      * add placeholder tests for model
      
      * starting cleanup
      
      * let the great renaming begin!
      
      * use config
      
      * global_conditioning
      
      * more cleaning
      
      * renaming variables
      
      * more renaming
      
      * more renaming
      
      * it never ends
      
      * reticulating the splines
      
      * more renaming
      
      * HiFi-GAN
      
      * doc strings for main model
      
      * fixup
      
      * fix-copies
      
      * don't make it a PreTrainedModel
      
      * fixup
      
      * rename config options
      
      * remove training logic from forward pass
      
      * simplify relative position
      
      * use actual checkpoint
      
      * style
      
      * PR review fixes
      
      * more review changes
      
      * fixup
      
      * more unit tests
      
      * fixup
      
      * fix doc test
      
      * add integration test
      
      * improve tokenizer tests
      
      * add tokenizer integration test
      
      * fix tests on GPU (gave OOM)
      
      * conversion script can handle repos from hub
      
      * add conversion script for all MMS-TTS checkpoints
      
      * automatically create a README for the converted checkpoint
      
      * small changes to config
      
      * push README to hub
      
      * only show uroman note for checkpoints that need it
      
      * remove conversion script because code formatting breaks the readme
      
      * make WaveNet layers configurable
      
      * rename variables
      
      * simplifying the math
      
      * output attentions and hidden states
      
      * remove VitsFlip in flow model
      
      * also got rid of the other flip
      
      * fix tests
      
      * rename more variables
      
      * rename tokenizer, add phonemization
      
      * raise error when phonemizer missing
      
      * re-order config docstrings to match method
      
      * change config naming
      
      * remove redundant str -> list
      
      * fix copyright: vits authors -> kakao enterprise
      
      * (mean, log_variances) -> (prior_mean, prior_log_variances)
      
      * if return dict -> if not return dict
      
      * speed -> speaking rate
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * update fused tanh sigmoid
      
      * reduce dims in tester
      
      * audio -> output_values
      
      * audio -> output_values in tuple out
      
      * fix return type
      
      * fix return type
      
      * make _unconstrained_rational_quadratic_spline a function
      
      * all nn's to accept a config
      
      * add spectro to output
      
      * move {speaking rate, noise scale, noise scale duration} to config
      
      * path -> attn_path
      
      * idxs -> valid idxs -> padded idxs
      
      * output values -> waveform
      
      * use config for attention
      
      * make generation work
      
      * harden integration test
      
      * add spectrogram to dict output
      
      * tokenizer refactor
      
      * make style
      
      * remove 'fake' padding token
      
      * harden tokenizer tests
      
      * ron norm test
      
      * fprop / save tests deterministic
      
      * move uroman to tokenizer as much as possible
      
      * better logger message
      
      * fix vivit imports
      
      * add uroman integration test
      
      * make style
      
      * up
      
      * matthijs -> sanchit-gandhi
      
      * fix tokenizer test
      
      * make fix-copies
      
      * fix dict comprehension
      
      * fix config tests
      
      * fix model tests
      
      * make outputs consistent with reverse/not reverse
      
      * fix key concat
      
      * more model details
      
      * add author
      
      * return dict
      
      * speaker error
      
      * labels error
      
      * Apply suggestions from code review
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * Update src/transformers/models/vits/convert_original_checkpoint.py
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * remove uromanize
      
      * add docstrings
      
      * add docstrings for tokenizer
      
      * upper-case skip messages
      
      * fix return dict
      
      * style
      
      * finish tests
      
      * update checkpoints
      
      * make style
      
      * remove doctest file
      
      * revert
      
      * fix docstring
      
      * fix tokenizer
      
      * remove uroman integration test
      
      * add sampling rate
      
      * fix docs / docstrings
      
      * style
      
      * add sr to model output
      
      * fix outputs
      
      * style / copies
      
      * fix docstring
      
      * fix copies
      
      * remove sr from model outputs
      
      * Update utils/documentation_tests.txt
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
      
      * add sr as allowed attr
      
      ---------
      Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
  2. 31 Aug, 2023 1 commit
  3. 30 Aug, 2023 2 commits
  4. 29 Aug, 2023 7 commits
  5. 28 Aug, 2023 1 commit
  6. 25 Aug, 2023 5 commits
  7. 23 Aug, 2023 2 commits
  8. 22 Aug, 2023 4 commits
  9. 21 Aug, 2023 2 commits
    • Add Pop2Piano (#21785) · 450a181d
      Susnato Dhar authored
      
      
      * init commit
      
      * config updated also some modeling
      
      * Processor and Model config combined
      
      * extraction pipeline (up to before spectrogram & mel_conditioner) added but not properly tested
      
      * model loading successful!
      
      * feature extractor done!
      
      * FE can now be called from HF
      
      * postprocessing added in fe file
      
      * same as prev commit
      
      * Pop2PianoConfig doc done
      
      * cfg docs slightly changed
      
      * fe docs done
      
      * batched
      
      * batched working!
      
      * temp
      
      * v1
      
      * checking
      
      * trying to go with generate
      
      * with generate and model tests passed
      
      * before rebasing
      
      * .
      
      * tests done docs done remaining others & nits
      
      * nits
      
      * LogMelSpectogram shifted to FeatureExtractor
      
      * is_tf removed from pop2piano/init
      
      * import solved
      
      * tokenization tests added
      
      * minor fixes regarding modeling_pop2piano
      
      * tokenizer changed to only return midi_object and other changes
      
      * Updated paper abstract(Camera-ready version) (#2)
      
      * more comments and nits
      
      * ruff changes
      
      * code quality fix
      
      * sg comments
      
      * t5 change added and rebased
      
      * comments except batching
      
      * batching done
      
      * comments
      
      * small doc fix
      
      * example removed from modeling
      
      * ckpt
      
      * forward is compatible with fe and generation done
      
      * comments
      
      * comments
      
      * code-quality fix(maybe)
      
      * ckpts changed
      
      * doc file changed from mdx to md
      
      * test fixes
      
      * tokenizer test fix
      
      * changes
      
      * nits done main changes remaining
      
      * code modified
      
      * Pop2PianoProcessor added with tests
      
      * other comments
      
      * added Pop2PianoProcessor to dummy_objects
      
      * added require_onnx to modeling file
      
      * changes
      
      * update .md file
      
      * remove extra line in index.md
      
      * back to the main index
      
      * added pop2piano to index
      
      * Added tokenizer.__call__ with valid args and batch_decode and aligned the processor part too
      
      * changes
      
      * added return types to 2 tokenizer methods
      
      * the PR build test might work now
      
      * added backends
      
      * PR build fix
      
      * vocab added
      
      * comments
      
      * refactored vocab into 1 file
      
      * added conversion script
      
      * comments
      
      * essentia version changed in .md
      
      * comments
      
      * more tokenizer tests added
      
      * minor fix
      
      * tests extended for outputs acc check
      
      * small fix
      
      ---------
      Co-authored-by: Jongho Choi <sweetcocoa@snu.ac.kr>
    • fix documentation for CustomTrainer (#25635) · 6f041fcb
      mchau authored
      fix doc
  10. 18 Aug, 2023 6 commits
  11. 17 Aug, 2023 3 commits
    • Add Text-To-Speech pipeline (#24952) · b8f69d0d
      Yoach Lacombe authored
      
      
      * add AutoModelForTextToSpeech class
      
      * add TTS pipeline and testing
      
      * add docstrings to text_to_speech pipeline
      
      * fix torch dependency
      
      * correct 'processor is None' case in Pipeline
      
      * correct repo id
      
      * modify text-to-speech -> text-to-audio
      
      * remove processor
      
      * rename text_to_speech pipelines files to text_audio
      
      * add textToWaveform and textToSpectrogram instead of textToAudio classes
      
      * update TTS pipeline to the bare minimum
      
      * update tests TTS pipeline
      
      * make style and erase useless import torch in TTS pipeline tests
      
      * modify how to check if generate or forward in TTS pipeline
      
      * remove unnecessary extra new lines
      
      * Apply suggestions from code review
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
      
      * refactor input_texts -> text_inputs
      
      * correct docstrings of TTS.__call__
      
      * correct the shape of generated waveform
      
      * take care of Bark tokenizer special case
      
      * correct run_pipeline_test TTS
      
      * make style
      
      * update TTS docstrings
      
      * address Sylvain nit refactors
      
      * make style
      
      * refactor into one liners
      
      * correct squeeze
      
      * correct way to test if forward or generate
      
      * Update output audio waveform shape
      
      * make style
      
      * correct import
      
      * modify how the TTS pipeline tests if a model can generate
      
      * align shape output of TTS pipeline with consistent shape
      
      ---------
      Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
    • Adds `TRANSFORMERS_TEST_DEVICE` (#25506) · 1791ef8d
      Alex McKinney authored
      * Adds `TRANSFORMERS_TEST_DEVICE`
      Mirrors the same API in the diffusers library. Useful in transformers
      too.
      
      * replace backend checking with trying `torch.device`
      
      * Adds better error message for unknown test devices
      
      * `make style`
      
      * adds documentation showing `TRANSFORMERS_TEST_DEVICE` usage.
    • [`Docs`] Fix un-rendered images (#25561) · e7e9261a
      Younes Belkada authored
      fix un-rendered images
  12. 16 Aug, 2023 1 commit
  13. 14 Aug, 2023 1 commit
  14. 13 Aug, 2023 1 commit