1. 08 Aug, 2022 1 commit
  2. 29 Apr, 2022 1 commit
  3. 25 Mar, 2022 1 commit
  4. 23 Mar, 2022 1 commit
    • Reorganize file utils (#16264) · 4975002d
      Sylvain Gugger authored
      * Split file_utils in several submodules
      
      * Fixes
      
      * Add back more objects
      
      * More fixes
      
      * Who exactly decided to import that from there?
      
      * Second suggestion to code with code review
      
      * Revert wrong move
      
      * Fix imports
      
      * Adapt all imports
      
      * Adapt all imports everywhere
      
      * Revert this import, will fix in a separate commit
  5. 28 Dec, 2021 1 commit
    • Doc styler examples (#14953) · b5e2b183
      Sylvain Gugger authored
      * Fix bad examples
      
      * Add black formatting to style_doc
      
      * Use first nonempty line
      
      * Put it at the right place
      
      * Don't add spaces to empty lines
      
      * Better templates
      
      * Deal with triple quotes in docstrings
      
      * Result of style_doc
      
      * Enable mdx treatment and fix code examples in MDXs
      
      * Result of doc styler on doc source files
      
      * Last fixes
      
      * Break copy from
  6. 27 Dec, 2021 1 commit
    • Doc styler v2 (#14950) · 87e6e4fe
      Sylvain Gugger authored
      * New doc styler
      
      * Fix issue with args at the start
      
      * Code sample fixes
      
      * Style code examples in MDX
      
      * Fix more patterns
      
      * Typo
      
      * Typo
      
      * More patterns
      
      * Do without black for now
      
      * Get more info in error
      
      * Docstring style
      
      * Re-enable check
      
      * Quality
      
      * Fix add_end_docstring decorator
      
      * Fix docstring
  7. 21 Dec, 2021 1 commit
    • Mass conversion of documentation from rst to Markdown (#14866) · 27b3031d
      Sylvain Gugger authored
      * Convert docstrings of all configurations and tokenizers
      
      * Processors and fixes
      
      * Last modeling files and fixes to models
      
      * Pipeline modules
      
      * Utils files
      
      * Data submodule
      
      * All the other files
      
      * Style
      
      * Missing examples
      
      * Style again
      
      * Fix copies
      
      * Say bye bye to rst docstrings forever
  8. 07 Oct, 2021 1 commit
  9. 01 Jun, 2021 1 commit
  10. 26 Apr, 2021 1 commit
  11. 07 Apr, 2021 1 commit
  12. 31 Mar, 2021 1 commit
  13. 25 Feb, 2021 1 commit
    • [PretrainedFeatureExtractor] + Wav2Vec2FeatureExtractor, Wav2Vec2Processor, Wav2Vec2Tokenizer (#10324) · cb38ffcc
      Patrick von Platen authored
      
      * push to show
      
      * small improvement
      
      * small improvement
      
      * Update src/transformers/feature_extraction_utils.py
      
      * Update src/transformers/feature_extraction_utils.py
      
      * implement base
      
      * add common tests
      
      * make all tests pass for wav2vec2
      
      * make padding work & add more tests
      
      * finalize feature extractor utils
      
      * add call method to feature extraction
      
      * finalize feature processor
      
      * finish tokenizer
      
      * finish general processor design
      
      * finish tests
      
      * typo
      
      * remove bogus file
      
      * finish docstring
      
      * add docs
      
      * finish docs
      
      * small fix
      
      * correct docs
      
      * save intermediate
      
      * load changes
      
      * apply changes
      
      * apply changes to doc
      
      * change tests
      
      * apply surajs recommend
      
      * final changes
      
      * Apply suggestions from code review
      
      * fix typo
      
      * fix import
      
      * correct docstring
  14. 21 Dec, 2020 1 commit
  15. 20 Nov, 2020 1 commit
    • Fix rag finetuning + add finetuning test (#8585) · 8062fa63
      Quentin Lhoest authored
      * replace init_ddp_connection for index init
      
      * style
      
      * add finetune test
      
      * add test data
      
      * move generate tensors to device
      
      * add test on EM metric
      
      * style
      
      * allow multi process test
      
      * keep gloo process group for retrieval
      
      * add multi-gpu test
      
      * use custom accelerator
      
      * clean test finetune
      
      * minor
      
      * style
      
      * style
      
      * typo
      
      * use python call instead of imported main function
      
      * return_dict fix in modeling_rag
      
      * use float32 in retrieval
      
      * store as float32 as well in the custom knowledge dataset example
      
      * style
      
      * rename to finetune_rag
      
      * style
      
      * update readme
      
      * rename utils and callbacks to utils_rag and callbacks_rag
      
      * fix test
      
      * patrick's comments
      
      * generate dummy data in the finetune test script
      
      * remove dummy data files
      
      * style
  16. 17 Nov, 2020 1 commit
    • Reorganize repo (#8580) · c89bdfbe
      Sylvain Gugger authored
      * Put models in subfolders
      
      * Styling
      
      * Fix imports in tests
      
      * More fixes in test imports
      
      * Sneaky hidden imports
      
      * Fix imports in doc files
      
      * More sneaky imports
      
      * Finish fixing tests
      
      * Fix examples
      
      * Fix path for copies
      
      * More fixes for examples
      
      * Fix dummy files
      
      * More fixes for example
      
      * More model import fixes
      
      * Is this why you're unhappy GitHub?
      
      * Fix imports in convert command
  17. 10 Nov, 2020 1 commit
  18. 29 Oct, 2020 1 commit
  19. 26 Oct, 2020 1 commit
    • Doc styling (#8067) · 08f534d2
      Sylvain Gugger authored
      * Important files
      
      * Styling them all
      
      * Revert "Styling them all"
      
      This reverts commit 7d029395fdae8513b8281cbc2a6c239f8093503e.
      
      * Styling them for realsies
      
      * Fix syntax error
      
      * Fix benchmark_utils
      
      * More fixes
      
      * Fix modeling auto and script
      
      * Remove new line
      
      * Fixes
      
      * More fixes
      
      * Fix more files
      
      * Style
      
      * Add FSMT
      
      * More fixes
      
      * More fixes
      
      * More fixes
      
      * More fixes
      
      * Fixes
      
      * More fixes
      
      * More fixes
      
      * Last fixes
      
      * Make sphinx happy
  20. 19 Oct, 2020 1 commit
    • Allow Custom Dataset in RAG Retriever (#7763) · 033f29c6
      Quentin Lhoest authored
      * add CustomHFIndex
      
      * typo in config
      
      * update tests
      
      * add custom dataset example
      
      * clean script
      
      * update test data
      
      * minor in test
      
      * docs
      
      * docs
      
      * style
      
      * fix imports
      
      * allow to pass the indexed dataset directly
      
      * update tests
      
      * use multiset DPR
      
      * address thom and patrick's comments
      
      * style
      
      * update dpr tokenizer
      
      * add output_dir flag in use_own_knowledge_dataset.py
      
      * allow custom datasets in examples/rag/finetune.py
      
      * add test for custom dataset in distributed rag retriever
  21. 05 Oct, 2020 1 commit
  22. 25 Sep, 2020 2 commits
  23. 24 Sep, 2020 1 commit
  24. 22 Sep, 2020 1 commit
    • RAG (#6813) · c754c41c
      Ola Piktus authored
      * added rag WIP
      
      * path fix
      
      * Formatting / renaming prior to actual work
      
      * added rag WIP
      
      * path fix
      
      * Formatting / renaming prior to actual work
      
      * added rag WIP
      
      * path fix
      
      * Formatting / renaming prior to actual work
      
      * added rag WIP
      
      * Formatting / renaming prior to actual work
      
      * First commit
      
      * improve comments
      
      * Retrieval evaluation scripts
      
      * refactor to include modeling outputs + MPI retriever
      
      * Fix rag-token model + refactor
      
      * Various fixes + finetuning logic
      
      * use_bos fix
      
      * Retrieval refactor
      
      * Finetuning refactoring and cleanup
      
      * Add documentation and cleanup
      
      * Remove set_up_rag_env.sh file
      
      * Fix retrieval with HF index
      
      * Fix import errors
      
      * Fix quality errors
      
      * Refactor as per suggestions in https://github.com/huggingface/transformers/pull/6813#issuecomment-687208867
      
      
      
      * fix quality
      
      * Fix RAG Sequence generation
      
      * minor cleanup plus initial tests
      
      * fix test
      
      * fix tests 2
      
      * Comments fix
      
      * post-merge fixes
      
      * Improve readme + post-rebase refactor
      
      * Extra dependencies for tests
      
      * Fix tests
      
      * Fix tests 2
      
      * Refactor test requirements
      
      * Fix tests 3
      
      * Post-rebase refactor
      
      * rename nlp->datasets
      
      * RAG integration tests
      
      * add tokenizer to slow integration test and allow retriever to run on cpu
      
      * add tests; fix position ids warning
      
      * change structure
      
      * change structure
      
      * add from encoder generator
      
      * save working solution
      
      * make all integration tests pass
      
      * add RagTokenizer.save/from_pretrained and RagRetriever.save/from_pretrained
      
      * don't save paths
      
      * delete unnecessary imports
      
      * pass config to AutoTokenizer.from_pretrained for Rag tokenizers
      
      * init wiki_dpr only once
      
      * hardcode legacy index and passages paths (todo: add the right urls)
      
      * finalize config
      
      * finalize retriever api and config api
      
      * LegacyIndex index download refactor
      
      * add dpr to autotokenizer
      
      * make from pretrained more flexible
      
      * fix ragfortokengeneration
      
      * small name changes in tokenizer
      
      * add labels to models
      
      * change default index name
      
      * add retrieval tests
      
      * finish token generate
      
      * align test with previous version and make all tests pass
      
      * add tests
      
      * finalize tests
      
      * implement thoms suggestions
      
      * add first version of test
      
      * make first tests work
      
      * make retriever platform agnostic
      
      * naming
      
      * style
      
      * add legacy index URL
      
      * docstrings + simple retrieval test for distributed
      
      * clean model api
      
      * add doc_ids to retriever's outputs
      
      * fix retrieval tests
      
      * finish model outputs
      
      * finalize model api
      
      * fix generate problem for rag
      
      * fix generate for other models
      
      * fix some tests
      
      * save intermediate
      
      * set generate to default
      
      * big refactor generate
      
      * delete rag_api
      
      * correct pip faiss install
      
      * fix auto tokenization test
      
      * fix faiss install
      
      * fix test
      
      * move the distributed logic to examples
      
      * model page
      
      * docs
      
      * finish tests
      
      * fix dependencies
      
      * fix import in __init__
      
      * Refactor eval_rag and finetune scripts
      
      * start docstring
      
      * add psutil to test
      
      * fix tf test
      
      * move require torch to top
      
      * fix retrieval test
      
      * align naming
      
      * finish automodel
      
      * fix repo consistency
      
      * test ragtokenizer save/load
      
      * add rag model output docs
      
      * fix ragtokenizer save/load from pretrained
      
      * fix tokenizer dir
      
      * remove torch in retrieval
      
      * fix docs
      
      * fix finetune scripts
      
      * finish model docs
      
      * finish docs
      
      * remove auto model for now
      
      * add require torch
      
      * remove solved todos
      
      * integrate sylvains suggestions
      
      * sams comments
      
      * correct mistake on purpose
      
      * improve README
      
      * Add generation test cases
      
      * fix rag token
      
      * clean token generate
      
      * fix test
      
      * add note to test
      
      * fix attention mask
      
      * add t5 test for rag
      
      * Fix handling prefix in finetune.py
      
      * don't overwrite index_name
      Co-authored-by: Patrick Lewis <plewis@fb.com>
      Co-authored-by: Aleksandra Piktus <piktus@devfair0141.h2.fair>
      Co-authored-by: Aleksandra Piktus <piktus@learnfair5102.h2.fair>
      Co-authored-by: Aleksandra Piktus <piktus@learnfair5067.h2.fair>
      Co-authored-by: Your Name <you@example.com>
      Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
      Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>