Commits · 0b7d053c136e3947eb4c1efc91f2ccae15af1d1e · chenpangpang / transformers

11 Nov, 2021 1 commit
- Fixing requirements for TF LM models and use correct model mappings (#14372) · 7f20bf0d
  Matt authored Nov 11, 2021
```
* Fixing requirements for TF LM models and use correct model mappings

* make style
```
21 Oct, 2021 1 commit
- Replace "Masked" with "Causal" in TF CLM example (#14014) · f9c16b02
  Christopher Akiki authored Oct 21, 2021
  
31 Aug, 2021 1 commit
- Fixed CLM model still using MODEL_FOR_MASKED_LM_MAPPING (#13002) · 702f4a49
  Matt authored Aug 31, 2021
  
28 Aug, 2021 1 commit
- examples: only use keep_linebreaks when reading TXT files (#13320) · 4046e66e
  Stefan Schweter authored Aug 28, 2021

* examples: only use keep_linebreaks when reading TXT files for all CLM examples

* examples: only use keep_linebreaks when reading TXT files for all CLM examples

* examples: only use keep_linebreaks when reading TXT files for all CLM examples

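The behaviour named in this commit title can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code from the commit: `build_dataset_args` is a hypothetical helper, and it assumes that only the plain-text loader accepts a `keep_linebreaks` option, so the option is attached for `.txt` inputs and omitted for every other file type.

```python
import os


def build_dataset_args(data_file, keep_linebreaks):
    """Return (loader_name, loader_kwargs) for a data file.

    Hypothetical helper: the option is passed along only for TXT files,
    which are assumed to be read by a loader named "text".
    """
    extension = os.path.splitext(data_file)[1].lstrip(".").lower()
    dataset_args = {}
    if extension == "txt":
        extension = "text"
        dataset_args["keep_linebreaks"] = keep_linebreaks
    return extension, dataset_args


text_loader = build_dataset_args("train.txt", keep_linebreaks=True)
csv_loader = build_dataset_args("train.csv", keep_linebreaks=True)
```

With this guard, a CSV or JSON input never receives an argument that its loader may not understand.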

27 Aug, 2021 1 commit
- examples: add keep_linebreaks option to CLM examples (#13150) · 319d840b
  Stefan Schweter authored Aug 27, 2021

* examples: add keep_linebreaks option to text dataset loader for all CLM examples

* examples: introduce new keep_linebreaks option as data argument in CLM examples


28 Jul, 2021 1 commit
- Correct validation_split_percentage argument from int (ex:5) to float (0.05) (#12897) · f3d0866e
  Elysium1436 authored Jul 27, 2021

* Fixed train_test_split test_size argument

* `Seq2SeqTrainer` set max_length and num_beams only when non None (#12899)

* set max_length and num_beams only when non None

* fix instance variables

* fix code style

* [FLAX] Minor fixes in CLM example (#12914)

* readme: fix retrieval of vocab size for flax clm example

* examples: fix flax clm example when using training/evaluation files

* Fix module path for symbolic_trace example
Co-authored-by: cchen-dialpad <47165889+cchen-dialpad@users.noreply.github.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Sylvain Gugger <sylvain.gugger@gmail.com>

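The int-versus-float distinction in this commit title matters because split helpers commonly read an integer `test_size` as an absolute row count and a float as a fraction of the data. The sketch below only illustrates that convention; `percentage_to_fraction` and `resolve_test_size` are hypothetical names, and the rounding rule is an assumption rather than the behaviour of any particular library.

```python
def percentage_to_fraction(validation_split_percentage):
    """Convert a percentage such as 5 into a fraction such as 0.05."""
    return validation_split_percentage / 100


def resolve_test_size(num_rows, test_size):
    """Return how many rows are held out under the assumed convention.

    float in (0, 1) -> fraction of num_rows (rounded to the nearest row)
    int             -> absolute number of rows
    """
    if isinstance(test_size, float):
        if not 0.0 < test_size < 1.0:
            raise ValueError("a float test_size must lie strictly between 0 and 1")
        return int(round(num_rows * test_size))
    return int(test_size)


# Passing the raw percentage holds out 5 rows, not 5% of the rows.
rows_with_int = resolve_test_size(1000, 5)
rows_with_float = resolve_test_size(1000, percentage_to_fraction(5))
```

On a 1000-row dataset the raw integer holds out 5 rows, while the converted fraction holds out 50.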

08 Jul, 2021 1 commit
- Fix group_lengths for short datasets (#12558) · 6f1adc43
  Sylvain Gugger authored Jul 08, 2021
  
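As a rough illustration of the short-dataset problem this commit title refers to: when tokenized texts are concatenated and cut into fixed-size blocks, truncating the total length to a multiple of the block size leaves nothing if the whole batch is shorter than one block. The sketch below assumes a `group_texts` function of the usual shape and a guard that skips truncation in that case; it is not copied from the commit, and passing `block_size` as a parameter is a simplification made for this example.

```python
from itertools import chain


def group_texts(examples, block_size):
    """Concatenate token lists and split them into blocks of block_size."""
    concatenated = {k: list(chain(*examples[k])) for k in examples}
    total_length = len(concatenated[next(iter(examples))])
    # Drop the remainder only when at least one full block exists;
    # otherwise keep the short sequence instead of returning nothing.
    if total_length >= block_size:
        total_length = (total_length // block_size) * block_size
    return {
        k: [tokens[i : i + block_size] for i in range(0, total_length, block_size)]
        for k, tokens in concatenated.items()
    }


batch = {"input_ids": [[1, 2, 3], [4, 5]]}
regular = group_texts(batch, block_size=4)
short = group_texts(batch, block_size=8)
```

Without the guard, the second call would compute a total length of 0 and return an empty list of blocks.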
01 Jul, 2021 1 commit
- Validation split added: custom data files @sgugger, @patil-suraj (#12407) · d5b8fe3b
  Souvic Chakraborty authored Jul 01, 2021

* Validation split added: custom data files

Validation split added in case of no validation file and loading custom data

* Updated documentation with custom file usage

Updated documentation with custom file usage

* Update README.md

* Update README.md

* Update README.md

* Made some suggested stylistic changes

* Used logger instead of print.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Made similar changes to add validation split

In case of a missing validation file, a validation split will be used now.

* max_train_samples to be used for training only

max_train_samples got misplaced, now corrected so that it is applied on training data only, not whole data.

* styled

* changed ordering

* Improved language of documentation
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Improved language of documentation
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Fixed styling issue

* Update run_mlm.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>


28 Jun, 2021 1 commit
- Tensorflow LM examples (#12358) · 7e22609e
  Matt authored Jun 28, 2021

* Tensorflow MLM example

* Add CLM example

* Style fixes, adding missing checkpoint code from the CLM example

* Fix TPU training, avoid massive dataset warnings

* Fix incorrect training length calculation for multi-GPU training

* Fix incorrect training length calculation for multi-GPU training

* Refactors and nitpicks from the review

* Style pass

* Adding README
