1. 21 Jul, 2020 1 commit
  2. 08 Jul, 2020 1 commit
  3. 22 Jun, 2020 1 commit
  4. 19 Jun, 2020 1 commit
  5. 29 May, 2020 1 commit
  6. 28 May, 2020 1 commit
    • Use float32 activation in Transformer. · 94b1efc1
      Reed Wanderman-Milne authored
      Float32 is used if the model uses mixed precision with bfloat16. Float16 activations are unchanged.
      
      The motivation is that BERT with the LAMB optimizer and a gelu activation has an unstable loss when gelu is computed in bfloat16. Unfortunately, it is not easy to check whether the LAMB optimizer and gelu are used, and there may be other cases that work better with float32 activations than with bfloat16 activations, so we always compute the activation in float32 (a minimal sketch of the pattern follows at the end of this entry).
      
      PiperOrigin-RevId: 313618322
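
      A minimal sketch of the pattern this commit describes, assuming a standalone tf.keras layer; the FeedForward class, its arguments, and the use of the current tf.keras.mixed_precision / tf.nn.gelu APIs are illustrative assumptions, not the actual modeling code:

      import tensorflow as tf

      # Layers read the global dtype policy at construction time, so set it first.
      tf.keras.mixed_precision.set_global_policy("mixed_bfloat16")


      class FeedForward(tf.keras.layers.Layer):
          """Transformer-style feed-forward block with a float32 gelu (hypothetical)."""

          def __init__(self, hidden_size, intermediate_size, **kwargs):
              super().__init__(**kwargs)
              self._intermediate = tf.keras.layers.Dense(intermediate_size)
              self._output = tf.keras.layers.Dense(hidden_size)

          def call(self, inputs):
              x = self._intermediate(inputs)
              # gelu in bfloat16 can destabilize the loss (e.g. BERT + LAMB),
              # so compute the activation in float32 and cast back afterwards.
              x = tf.cast(tf.nn.gelu(tf.cast(x, tf.float32)), x.dtype)
              return self._output(x)


      layer = FeedForward(hidden_size=8, intermediate_size=32)
      print(layer(tf.zeros([2, 4, 8])).dtype)  # bfloat16; only the gelu ran in float32

      Casting unconditionally avoids having to detect the LAMB-plus-gelu combination at layer construction time.
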
  7. 12 May, 2020 2 commits
  8. 10 May, 2020 1 commit
  9. 05 May, 2020 1 commit
  10. 21 Apr, 2020 1 commit
  11. 01 Apr, 2020 1 commit
  12. 27 Mar, 2020 1 commit
  13. 09 Mar, 2020 1 commit
  14. 03 Mar, 2020 2 commits
  15. 26 Feb, 2020 2 commits
  16. 25 Feb, 2020 1 commit
  17. 21 Feb, 2020 1 commit
  18. 08 Feb, 2020 1 commit
  19. 21 Jan, 2020 1 commit
    • Remove compute_output_shape. · 0e0a94a6
      Hongkun Yu authored
      Keras: "manual" shape inference is only required if the layer is dynamic; otherwise we rely on TF's static shape inference capabilities (see the sketch at the end of this entry).
      
      PiperOrigin-RevId: 290821518
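
      A minimal sketch of the reasoning above, assuming a plain (non-dynamic) custom layer; the Projection class is hypothetical and unrelated to the actual modeling code. Keras infers the output shape by tracing call() on symbolic inputs, so compute_output_shape only matters for layers built with dynamic=True:

      import tensorflow as tf


      class Projection(tf.keras.layers.Layer):
          """Custom layer with no compute_output_shape override (hypothetical)."""

          def __init__(self, units, **kwargs):
              super().__init__(**kwargs)
              self._dense = tf.keras.layers.Dense(units)

          def call(self, inputs):
              # Static shape inference traces this call on a symbolic tensor,
              # so Keras already knows the output shape without a manual override.
              return self._dense(inputs)


      inputs = tf.keras.Input(shape=(32,))
      outputs = Projection(16)(inputs)
      print(outputs.shape)  # (None, 16), inferred without compute_output_shape
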
  20. 20 Nov, 2019 1 commit
  21. 13 Nov, 2019 1 commit
  22. 11 Nov, 2019 1 commit
    • Release keras bert: · f1d35b4e
      Hongkun Yu authored
      - Update classifier example.
      - Add new converted checkpoints.
      - Update benchmarks.
      
      PiperOrigin-RevId: 279762797