03 Dec, 2019
      v0.8.0 -> v0.9.0 (#1452) · df2f84ce
      Myle Ott authored
      Summary:
      Possibly breaking changes:
      - Set global numpy seed (4a7cd582)
      - Split `in_proj_weight` into separate k, v, q projections in MultiheadAttention (fdf4c3e9); a migration sketch follows this list
      - TransformerEncoder returns namedtuples instead of dict (27568a7e)
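
      Both the attention and encoder changes may need downstream updates. Below is a minimal migration sketch in plain PyTorch, not fairseq's actual upgrade code: the helper name and the `q_proj`/`k_proj`/`v_proj` key layout are assumptions used to illustrate the row-wise split of the fused weight. The encoder change is analogous in spirit: code that indexed the output as `out['encoder_out']` becomes attribute access, `out.encoder_out`.

      ```python
      def split_fused_in_proj(state_dict, prefix, embed_dim):
          """Hypothetical helper (key names assumed): rewrite a pre-0.9.0 fused
          in_proj_weight/in_proj_bias into separate q/k/v projection entries."""
          w = state_dict.pop(prefix + "in_proj_weight")  # shape (3 * embed_dim, embed_dim)
          b = state_dict.pop(prefix + "in_proj_bias")    # shape (3 * embed_dim,)
          for i, name in enumerate(("q_proj", "k_proj", "v_proj")):
              state_dict[prefix + name + ".weight"] = w[i * embed_dim:(i + 1) * embed_dim]
              state_dict[prefix + name + ".bias"] = b[i * embed_dim:(i + 1) * embed_dim]
          return state_dict
      ```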
      
      New features:
      - Add `--fast-stat-sync` option (e1ba32aa)
      - Add `--empty-cache-freq` option (315c463d)
      - Support criterions with parameters (ba5f829f); illustrated in the sketch after this list
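
      "Criterions with parameters" means a loss module can now carry learnable parameters that are trained alongside the model. The sketch below illustrates the idea in plain PyTorch with a hypothetical learned-temperature loss; it is not fairseq's `FairseqCriterion` API.

      ```python
      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      class TemperatureCrossEntropy(nn.Module):
          """Illustrative only: a criterion with its own learnable parameter."""
          def __init__(self):
              super().__init__()
              self.log_temp = nn.Parameter(torch.zeros(()))  # learned log-temperature

          def forward(self, logits, target):
              # Scale logits by a learned temperature before the loss.
              return F.cross_entropy(logits / self.log_temp.exp(), target)

      # The optimizer must now see both parameter sets:
      # optimizer = torch.optim.Adam(
      #     list(model.parameters()) + list(criterion.parameters()))
      ```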
      
      New papers:
      - Simple and Effective Noisy Channel Modeling for Neural Machine Translation (49177c99)
      - Levenshtein Transformer (86857a58, ...)
      - Cross+Self-Attention for Transformer Models (4ac2c5f2)
      - Jointly Learning to Align and Translate with Transformer Models (1c667929)
      - Reducing Transformer Depth on Demand with Structured Dropout (dabbef46)
      - Unsupervised Cross-lingual Representation Learning at Scale (XLM-RoBERTa) (e23e5eaa)
      - BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension (a92bcdad)
      - CamemBERT: a French BERT (b31849aa)
      
      Speed improvements:
      - Add CUDA kernels for LightConv and DynamicConv (f840564d)
      - Cythonization of various dataloading components (4fc39538, ...)
      - Don't project mask tokens for MLM training (718677eb); sketched below
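
      The MLM change gathers only the masked positions before the vocabulary projection instead of projecting every token. A minimal sketch of the trick, assuming a boolean mask and a generic `output_proj` linear layer (not fairseq's actual RoBERTa code):

      ```python
      def mlm_logits(features, masked_tokens, output_proj):
          # features: (batch, seq_len, dim); masked_tokens: (batch, seq_len) bool.
          # Index out only the masked positions before the vocab-sized projection,
          # so the expensive matmul scales with num_masked, not batch * seq_len.
          masked_features = features[masked_tokens]  # (num_masked, dim)
          return output_proj(masked_features)        # (num_masked, vocab_size)

      # e.g. logits = mlm_logits(feats, tokens.eq(mask_idx), nn.Linear(dim, vocab))
      ```
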
      Pull Request resolved: https://github.com/pytorch/fairseq/pull/1452
      
      Differential Revision: D18798409
      
      Pulled By: myleott
      
      fbshipit-source-id: 860a0d5aaf7377c8c9bd63cdb3b33d464f0e1727