1. 18 Aug, 2020 9 commits
  2. 17 Aug, 2020 28 commits
  3. 16 Aug, 2020 2 commits
  4. 14 Aug, 2020 1 commit
    • Fix TPU Convergence bug introduced by PR#6151 (#6488) · 24107c2c
      Jin Young (Daniel) Sohn authored
      The bug introduced by that PR makes training take two optimizer steps
      per batch on TPU: a global one, in which `xm.optimizer_step` injects a
      CRS (cross-replica sum) across all cores, and a second one without it.
      This has been hurting training accuracy (for example, XLNet GLUE on
      MNLI does not converge).
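
The double step described above can be illustrated with a minimal sketch. It is not the actual diff from this commit: `CountingOptimizer`, `fake_xla_optimizer_step`, `buggy_update`, and `fixed_update` are hypothetical names, and the fake helper only mimics the fact that `xm.optimizer_step` calls the optimizer's `step()` itself after reducing gradients across cores.

```python
class CountingOptimizer:
    """Hypothetical stand-in for a torch optimizer; it only counts step() calls."""

    def __init__(self):
        self.steps = 0

    def step(self):
        self.steps += 1


def fake_xla_optimizer_step(optimizer):
    """Hypothetical stand-in for xm.optimizer_step.

    The real function reduces gradients across cores (the CRS) and then
    calls optimizer.step(); the reduction is omitted here.
    """
    optimizer.step()


def buggy_update(optimizer, on_tpu):
    """Assumed shape of the bug: the plain step also runs on TPU."""
    if on_tpu:
        fake_xla_optimizer_step(optimizer)
    optimizer.step()  # unconditional, so TPU runs take a second step


def fixed_update(optimizer, on_tpu):
    """Exactly one step per batch: the XLA path or the plain path, never both."""
    if on_tpu:
        fake_xla_optimizer_step(optimizer)
    else:
        optimizer.step()


if __name__ == "__main__":
    for update in (buggy_update, fixed_update):
        opt = CountingOptimizer()
        update(opt, on_tpu=True)
        print(update.__name__, "steps per batch on TPU:", opt.steps)
```

Under these assumptions, making the two paths mutually exclusive is what keeps every batch at a single, globally reduced update.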