1. 16 Apr, 2019 1 commit
      Fix gradient overflow issue during attention mask · 9e666aaa
      Abhi Sharma authored
      This fix addresses issue #382. GPT2 can now be trained in mixed precision, which I've confirmed with testing. I also tested unconditional generation on multiple seeds before and after changing the masking constant from 1e10 to 1e4, and there was no difference in the output. Please let me know if there is anything else I can do to make this pull request better. Thanks for all your work!
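      The rationale behind the change can be checked with a small NumPy sketch (illustrative only; the actual fix lives in the repo's GPT2 attention code, not shown here). float16 can only represent magnitudes up to about 65504, so a masking constant of 1e10 overflows to -inf in half precision and can poison gradients through the softmax, while 1e4 is exactly representable and still drives masked positions to effectively zero probability.

      ```python
      import numpy as np

      # Toy attention scores and a mask hiding the last position.
      scores = np.array([2.5, 1.0, -0.3], dtype=np.float16)
      mask = np.array([1.0, 1.0, 0.0], dtype=np.float16)

      # float16 tops out around 65504, so the old constant 1e10
      # overflows to -inf in half precision.
      overflowed = np.float16(-1e10)  # becomes -inf

      # 1e4 fits comfortably in float16 yet still pushes masked
      # positions far below the live scores.
      masked = scores * mask - np.float16(1e4) * (1.0 - mask)

      def softmax(x):
          e = np.exp(x - x.max())
          return e / e.sum()

      probs = softmax(masked.astype(np.float32))
      # probs[-1] is ~0: the masked position contributes nothing,
      # and no value ever overflowed the float16 range.
      ```

      The same reasoning is why mixed-precision recipes generally recommend mask fill values on the order of the dtype's minimum rather than an arbitrarily huge negative number.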