"docs/vscode:/vscode.git/clone" did not exist on "831f3144a6f23c1f53dc6a1187bb0ae3f63ad0cf"
  1. 07 Feb, 2022 7 commits
  2. 04 Feb, 2022 6 commits
  3. 03 Feb, 2022 8 commits
  4. 02 Feb, 2022 12 commits
    • Correct eos_token_id settings in generate (#15403) · 5ec368d7
      CHI LIU authored
      * Correct eos_token_id set in generate
      
      * Set eos_token_id in test
      
      5ec368d7
    • fix set truncation attribute in `__init__` of `PreTrainedTokenizerBase` (#15456) · 39b5d1a6
      SaulLu authored
      
      
      * change truncation_side in init of `PreTrainedTokenizerBase`
      Co-authored-by: LSinev <LSinev@users.noreply.github.com>
      
      * add test
      
      * Revert "replace assert with exception for `padding_side` arg in `PreTrainedTokenizerBase` `__init__`"
      
      This reverts commit 7a98b87962d2635c7e4d4f00db3948b694624843.
      
      * fix kwargs
      
      * Revert "fix kwargs"
      
      This reverts commit 67b0a5270e8cf1dbf70e6b0232e94c0452b6946f.
      
      * Update tests/test_tokenization_common.py
      Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
      
      * delete truncation_side variable
      
      * reorganize test
      
      * format
      
      * complete doc
      
      * Revert "Revert "replace assert with exception for `padding_side` arg in `PreTrainedTokenizerBase` `__init__`""
      
      This reverts commit d5a10a7e2680539e5d9e98ae5d896c893d224b80.
      
      * fix typo
      
      * fix typos to render documentation
      
      * Revert "Revert "Revert "replace assert with exception for `padding_side` arg in `PreTrainedTokenizerBase` `__init__`"""
      
      This reverts commit 16cf58811943a08f43409a7c83eaa330686591d0.
      
      * format
      Co-authored-by: LSinev <LSinev@users.noreply.github.com>
      Co-authored-by: Nicolas Patry <patry.nicolas@protonmail.com>
      39b5d1a6
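
As a rough illustration of what the `truncation_side` attribute touched by this commit controls, the sketch below truncates a list of token ids from either end. The helper name `truncate` and its signature are hypothetical and are not taken from the library; only the values `"left"` and `"right"` mirror the attribute named in the commit.

```python
def truncate(token_ids, max_length, truncation_side="right"):
    """Keep at most `max_length` ids, dropping tokens from the chosen side.

    Hypothetical helper for illustration only, not the library implementation.
    """
    if truncation_side not in ("left", "right"):
        raise ValueError(
            f"truncation_side must be 'left' or 'right', got {truncation_side!r}"
        )
    if max_length < 0:
        raise ValueError("max_length must be non-negative")
    if len(token_ids) <= max_length:
        return list(token_ids)
    if truncation_side == "right":
        # Drop tokens from the end, keep the beginning.
        return list(token_ids[:max_length])
    # Drop tokens from the beginning, keep the end.
    return list(token_ids[len(token_ids) - max_length:])
```

Left truncation is typically useful when the end of a sequence matters most, for example the latest turns of a dialogue.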
    • Fix labels stored in model config for token classification examples (#15482) · 45cac3fa
      Sylvain Gugger authored
      * Playing
      
      * Properly set labels in model config for token classification example
      
      * Port to run_ner_no_trainer
      
      * Quality
      45cac3fa
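
The commit above is about storing label names in the model config. A minimal sketch of building the two mappings that a config usually carries (`label2id` and `id2label`) from a dataset's label list might look as follows. The helper `build_label_maps` is hypothetical and is not the code from the example scripts.

```python
def build_label_maps(label_list):
    """Return (label2id, id2label) for an ordered list of label names.

    Hypothetical helper for illustration; assumes labels are unique strings.
    """
    if len(set(label_list)) != len(label_list):
        raise ValueError("label names must be unique")
    label2id = {label: idx for idx, label in enumerate(label_list)}
    id2label = {idx: label for label, idx in label2id.items()}
    return label2id, id2label


label2id, id2label = build_label_maps(["O", "B-PER", "I-PER"])
```

Keeping both directions in the config lets a saved model report readable label names at inference time instead of bare integer ids.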
    • Add W&B backend for hyperparameter sweep (#14582) · c74f3d4c
      Ayush Chaurasia authored
      # Add support for W&B hyperparameter sweep
      This PR:
      * Allows using wandb for running hyperparameter search.
      * The runs are visualized on the W&B sweeps dashboard.
      * Supports running sweeps on parallel devices, all reporting to the same central dashboard.
      
      ### Usage
      **To run a new hyperparameter search:**
      ```
      trainer.hyperparameter_search(
          backend="wandb",
          project="transformers_sweep",  # name of the project
          n_trials=5,
          metric="eval/loss",  # metric to be optimized, default 'eval/loss'. A warning is raised if the passed metric is not found
      )
      ```
      This outputs a sweep ID, e.g. `my_project/sweep_id`.
      
      **To run sweeps on parallel devices:**
      Pass the ID of the sweep you want to run in parallel:
      ```
      trainer.hyperparameter_search(
          backend="wandb",
          sweep_id="my_project/sweep_id",
      )
      ```
      c74f3d4c
    • Fix docstring of ASR pipeline (#15481) · 13297ac7
      Sylvain Gugger authored
      13297ac7
    • fix error posted in issue #15448 (#15480) · dd360d58
      bugface authored
      
      
      * fix error posted in issue #15448
      Signed-off-by: bugface <alexgre@ufl.edu>
      
      * clean up - remove commented line
      Signed-off-by: bugface <alexgre@ufl.edu>
      dd360d58
    • Save code of registered custom models (#15379) · 44b21f11
      Sylvain Gugger authored
      
      
      * Allow dynamic modules to use relative imports
      
      * Work for configs
      
      * Fix last merge conflict
      
      * Save code of registered custom objects
      
      * Map strings to strings
      
      * Fix test
      
      * Add tokenizer
      
      * Rework tests
      
      * Tests
      
      * Ignore fixtures py files for tests
      
      * Tokenizer test + fix collection
      
      * With full path
      
      * Rework integration
      
      * Fix typo
      
      * Remove changes in conftest
      
      * Test for tokenizers
      
      * Add documentation
      
      * Update docs/source/custom_models.mdx
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      
      * Add file structure and file content
      
      * Add more doc
      
      * Style
      
      * Update docs/source/custom_models.mdx
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      
      * Address review comments
      Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
      Co-authored-by: Suraj Patil <surajp815@gmail.com>
      44b21f11
    • Adding support for `microphone` streaming within pipeline. (#15046) · 623d8cb4
      Nicolas Patry authored
      
      
      * Adding support for `microphone` streaming within pipeline.
      
      - Uses `ffmpeg` to get microphone data.
      - Makes sure chunks are aligned to `size_of_sample`.
      - Works by sending `{"raw": ..data.., "stride": (n, left, right), "partial": bool}`
        directly to the pipeline, which makes it possible to stream partial results and
        still get inference.
      - Lets the `partial` information flow through the pipeline so that the caller
        can get it back and choose whether to display the text.
      
      - The striding reconstitution is bound to have errors since CTC does not
        keep previous state. Currently most of the errors come from not knowing whether
        there is a space between two chunks.
        Since we have some left striding info, we could use that during decoding
        to choose what to do with those spaces, and maybe even extra letters (if
        the stride is long enough, it is bound to cover at least a few symbols).
      
      Fixing tests.
      
      Protecting with `require_torch`.
      
      `raw_ctc` support for nicer demo.
      
      Post rebase fixes.
      
      Revamp to split raw_mic_data from its live chunking.
      
      - Requires a refactor to make everything a bit cleaner.
      
      Automatic resampling.
      
      Small fix.
      
      Small fix.
      
      * Post rebase fix (need to let super handle more logic, reorder args.)
      
      * Update docstrings
      
      * Docstring format.
      
      * Remove print.
      
      * Prevent flow of `input_values`.
      
      * Fixing `stride` too.
      
      * Fixing the PR by removing `raw_ctc`.
      
      * Better docstrings.
      
      * Fixing init.
      
      * Update src/transformers/pipelines/audio_utils.py
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * Update tests/test_pipelines_automatic_speech_recognition.py
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      
      * Quality.
      Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
      623d8cb4
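
The commit message above describes sending overlapping audio chunks together with a `(n, left, right)` stride. The following is only a sketch of that idea on a finite byte buffer, under several assumptions: the function name `chunk_bytes`, its parameters, and the choice to report `n` as the chunk length in bytes are all hypothetical, and the `partial` flag is left out. It is not the code in `audio_utils.py`.

```python
def chunk_bytes(raw, chunk_len, stride_left, stride_right, size_of_sample=4):
    """Yield overlapping chunks of `raw` with their (n, left, right) stride.

    Hypothetical sketch: every length is in bytes and must be a multiple of
    `size_of_sample`, so that no audio sample is split across chunks.
    """
    for name, value in (
        ("len(raw)", len(raw)),
        ("chunk_len", chunk_len),
        ("stride_left", stride_left),
        ("stride_right", stride_right),
    ):
        if value % size_of_sample != 0:
            raise ValueError(f"{name} must be a multiple of size_of_sample")
    step = chunk_len - stride_left - stride_right
    if step <= 0:
        raise ValueError("chunk_len must be larger than the total stride")

    start = 0
    while start < len(raw):
        chunk = raw[start:start + chunk_len]
        is_last = start + chunk_len >= len(raw)
        # The first chunk has no left context and the last has no right context.
        left = 0 if start == 0 else stride_left
        right = 0 if is_last else stride_right
        yield {"raw": chunk, "stride": (len(chunk), left, right)}
        if is_last:
            break
        start += step


def keep_center(item):
    """Drop the overlapping context and keep only the bytes owned by a chunk."""
    n, left, right = item["stride"]
    return item["raw"][left:n - right]
```

Concatenating `keep_center(item)` over all chunks gives back the original buffer, which is the property a decoder relies on when it discards the predictions that fall inside the strides.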
    • Patrick von Platen's avatar
    • Add option to resize like torchvision's Resize (#15419) · 1d94d575
      NielsRogge authored
      * Add torchvision's resize
      
      * Rename torch_resize to default_to_square
      
      * Apply suggestions from code review
      
      * Add support for default_to_square and tuple of length 1
      1d94d575
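
To make the `default_to_square` option from this commit more concrete, here is a hedged sketch of how the target size could be computed. With `default_to_square=True` an integer size yields a square; otherwise the shorter edge is matched to the requested size and the aspect ratio is kept, in the spirit of torchvision's `Resize`. The function `get_output_size` is hypothetical, and the library's exact rounding may differ.

```python
def get_output_size(width, height, size, default_to_square=True):
    """Return the target (width, height) for a resize request.

    Hypothetical helper for illustration. `size` may be an int, a tuple of
    length 1, or an explicit (width, height) tuple.
    """
    if isinstance(size, (tuple, list)):
        if len(size) == 2:
            return tuple(size)
        if len(size) != 1:
            raise ValueError("size must be an int or a tuple of length 1 or 2")
        size = size[0]
    if default_to_square:
        return (size, size)
    # Match the shorter edge to `size` and scale the longer edge accordingly.
    short, long = (width, height) if width <= height else (height, width)
    new_short, new_long = size, int(size * long / short)
    return (new_short, new_long) if width <= height else (new_long, new_short)
```

For a 640x480 image and `size=224`, the square mode gives `(224, 224)`, while the aspect-preserving mode gives `(298, 224)`.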
    • Update tutorial docs (#15165) · b9418a1d
      Steven Liu authored
      * first draft of pipeline, autoclass, preprocess tutorials
      
      * apply review feedback
      
      * 🖍 apply feedback from patrick/niels
      
      * 📝 add output image to preprocessed image
      
      * 🖍 apply feedback from patrick
      b9418a1d
    • Update fine-tune docs (#15259) · c157c7e3
      Steven Liu authored
      * add fine-tune tutorial
      
      * make edits, fix style
      
      * 📝 make edits
      
      * 🖍 fix code format links to external libraries
      
      * 🔄 revert code formatting
      
      * 🖍 use DefaultDataCollator instead of DataCollatorWithPadding
      c157c7e3
  5. 01 Feb, 2022 7 commits