- 29 May, 2020 8 commits
Hongkun Yu authored
Proposes the full functionality of the MultiHeadAttention layer. This change goes into the Model Garden NLP library first. PiperOrigin-RevId: 313847485
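A minimal usage sketch of a multi-head attention layer with this interface; tf.keras.layers.MultiHeadAttention (added to Keras later with essentially the same API) stands in for the Model Garden layer here so the snippet is self-contained:

```python
import numpy as np
import tensorflow as tf

# Sketch only: tf.keras.layers.MultiHeadAttention stands in for the Model
# Garden NLP layer; both expose a query/value/key call signature.
layer = tf.keras.layers.MultiHeadAttention(num_heads=8, key_dim=64)

# Self-attention over a batch of 2 sequences, length 16, feature size 512.
x = np.random.rand(2, 16, 512).astype("float32")
output, scores = layer(query=x, value=x, return_attention_scores=True)

print(output.shape)  # (2, 16, 512): projected back to the query feature size
print(scores.shape)  # (2, 8, 16, 16): per-head attention weights
```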
Chen Chen authored
PiperOrigin-RevId: 313812017
Pengchong Jin authored
PiperOrigin-RevId: 313806578
Yeqing Li authored
PiperOrigin-RevId: 313798102
Pengchong Jin authored
PiperOrigin-RevId: 313794704
A. Unique TensorFlower authored
PiperOrigin-RevId: 313711327
Yeqing Li authored
PiperOrigin-RevId: 313708781
A. Unique TensorFlower authored
PiperOrigin-RevId: 313693539
- 28 May, 2020 4 commits
Abdullah Rashwan authored
PiperOrigin-RevId: 313662797
Hongkun Yu authored
Deprecate the old custom training loop in run_classifier.py, since compile/fit fully satisfies its functional and performance needs. PiperOrigin-RevId: 313660745
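A hedged sketch of the compile/fit pattern this commit moves to; the toy model, dataset, and hyperparameters below are placeholders, not the actual run_classifier.py configuration:

```python
import tensorflow as tf

# Placeholder classifier standing in for the BERT classifier model.
inputs = tf.keras.Input(shape=(128,), dtype=tf.float32)
hidden = tf.keras.layers.Dense(64, activation="relu")(inputs)
logits = tf.keras.layers.Dense(2)(hidden)
model = tf.keras.Model(inputs, logits)

# compile/fit replaces the hand-written training loop: optimizer, loss,
# metrics, and the epoch/step iteration are all handled by Keras.
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-4),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# Any tf.data pipeline yielding (features, labels) batches works here.
features = tf.random.uniform((256, 128))
labels = tf.random.uniform((256,), maxval=2, dtype=tf.int32)
dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(32)

model.fit(dataset, epochs=2)
```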
Reed Wanderman-Milne authored
Use float32 for the activation if the model uses mixed precision with bfloat16; float16 activations are unchanged. The motivation is that BERT with the LAMB optimizer and a gelu activation has an unstable loss when gelu is computed in bfloat16. Unfortunately, it is not easy to check whether the LAMB optimizer and gelu are in use, and there may be other cases that work better with float32 activations than with bfloat16 activations, so the activation is always computed in float32 instead of bfloat16. PiperOrigin-RevId: 313618322
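A hedged sketch of the pattern described above (not the Model Garden code itself): compute the gelu activation in float32 whenever the surrounding computation runs in bfloat16, and leave float16/float32 activations alone:

```python
import tensorflow as tf

# Sketch of the up-cast-the-activation pattern under a bfloat16 policy.
class Float32GeluDense(tf.keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.dense = tf.keras.layers.Dense(units)

    def call(self, inputs):
        x = self.dense(inputs)  # computed in the policy's compute dtype
        if x.dtype == tf.bfloat16:
            # gelu can be numerically unstable in bfloat16 (e.g. BERT + LAMB),
            # so compute it in float32 and cast back afterwards.
            return tf.cast(tf.nn.gelu(tf.cast(x, tf.float32)), tf.bfloat16)
        return tf.nn.gelu(x)  # float16/float32 activations are left unchanged

tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")
layer = Float32GeluDense(256)
out = layer(tf.random.uniform((4, 128)))
print(out.dtype)  # bfloat16: only the internal activation math runs in float32
```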
A. Unique TensorFlower authored
PiperOrigin-RevId: 313536026
- 27 May, 2020 4 commits
Pengchong Jin authored
PiperOrigin-RevId: 313475975
A. Unique TensorFlower authored
PiperOrigin-RevId: 313435294
Allen Wang authored
PiperOrigin-RevId: 313321531
Hongkun Yu authored
PiperOrigin-RevId: 313313579
- 26 May, 2020 5 commits
Allen Wang authored
PiperOrigin-RevId: 313259032
Hongkun Yu authored
PiperOrigin-RevId: 313205490
Maxim Neumann authored
PiperOrigin-RevId: 313148142
André Susano Pinto authored
This allows one to fine-tune a BERT model on one task before using it for another task, e.g., fine-tuning on SQuAD before fine-tuning on another QA-style task. PiperOrigin-RevId: 313145768
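A hedged sketch of the two-stage fine-tuning flow this enables; the tiny encoder, task heads, and checkpoint paths below are placeholders for a BERT encoder and the two downstream tasks:

```python
import tensorflow as tf

# Placeholder "encoder" standing in for a BERT encoder.
def build_encoder() -> tf.keras.Model:
    inputs = tf.keras.Input(shape=(128,), dtype=tf.float32)
    outputs = tf.keras.layers.Dense(64, activation="relu")(inputs)
    return tf.keras.Model(inputs, outputs, name="encoder")

def build_task_model(encoder: tf.keras.Model, num_labels: int) -> tf.keras.Model:
    logits = tf.keras.layers.Dense(num_labels)(encoder.output)
    return tf.keras.Model(encoder.input, logits)

# Stage 1: fine-tune on the first task (e.g. SQuAD), then checkpoint the
# shared encoder weights.
encoder = build_encoder()
task_a = build_task_model(encoder, num_labels=2)
# task_a.compile(...); task_a.fit(task_a_data)
tf.train.Checkpoint(encoder=encoder).save("/tmp/task_a/ckpt")

# Stage 2: initialize a fresh encoder from the fine-tuned checkpoint before
# training the second task, instead of starting from the pretrained weights.
encoder_b = build_encoder()
tf.train.Checkpoint(encoder=encoder_b).restore(
    tf.train.latest_checkpoint("/tmp/task_a"))
task_b = build_task_model(encoder_b, num_labels=3)
# task_b.compile(...); task_b.fit(task_b_data)
```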
Hongkun Yu authored
PiperOrigin-RevId: 313125068
- 25 May, 2020 1 commit
A. Unique TensorFlower authored
PiperOrigin-RevId: 313030129
- 24 May, 2020 3 commits
Hongkun Yu authored
PiperOrigin-RevId: 312988874
Hongkun Yu authored
PiperOrigin-RevId: 312945782
Hongkun Yu authored
PiperOrigin-RevId: 312939899
- 23 May, 2020 2 commits
Hongkun Yu authored
PiperOrigin-RevId: 312923051
Hongkun Yu authored
PiperOrigin-RevId: 312889153
- 22 May, 2020 2 commits
Hongkun Yu authored
PiperOrigin-RevId: 312841381
Yeqing Li authored
PiperOrigin-RevId: 312770722
- 21 May, 2020 9 commits
Hongkun Yu authored
PiperOrigin-RevId: 312765926
A. Unique TensorFlower authored
PiperOrigin-RevId: 312754139
A. Unique TensorFlower authored
PiperOrigin-RevId: 312751112
Jared T Nielsen authored
Hongkun Yu authored
Transformer Encoder: when the embedding width differs from the hidden size, add a projection to the hidden size. PiperOrigin-RevId: 312708922
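A hedged sketch of the idea (not the Model Garden TransformerEncoder itself): when a factorized embedding width is smaller than the transformer hidden size, a learned linear projection brings the embeddings up to the hidden size before the transformer layers:

```python
import tensorflow as tf

# Illustrative sizes only; an ALBERT-style factorized embedding is one case
# where embedding_width < hidden_size.
embedding_width = 128
hidden_size = 768
vocab_size = 30522
seq_len = 64

token_ids = tf.keras.Input(shape=(seq_len,), dtype=tf.int32)
embeddings = tf.keras.layers.Embedding(vocab_size, embedding_width)(token_ids)

if embedding_width != hidden_size:
    # Project [batch, seq_len, embedding_width] -> [batch, seq_len, hidden_size]
    # so the downstream attention/feed-forward blocks see a consistent width.
    embeddings = tf.keras.layers.Dense(
        hidden_size, name="embedding_projection")(embeddings)

print(embeddings.shape)  # (None, 64, 768)
```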
A. Unique TensorFlower authored
PiperOrigin-RevId: 312626203
A. Unique TensorFlower authored
PiperOrigin-RevId: 312624602
A. Unique TensorFlower authored
PiperOrigin-RevId: 312624281
Hongkun Yu authored
PiperOrigin-RevId: 312596462
- 20 May, 2020 2 commits
Hongkun Yu authored
PiperOrigin-RevId: 312560339
Hongkun Yu authored
PiperOrigin-RevId: 312515585