    [ported model] FSMT (FairSeq MachineTranslation) (#6940) · 1eeb206b
    Stas Bekman authored
    * ready for PR
    
    * cleanup
    
    * correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST
    
    * fix
    
    * perfectionism
    
    * revert change from another PR
    
    * odd, already committed this one
    
    * non-interactive upload workaround
    
    * backup the failed experiment
    
    * store langs in config
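
    The language pair ends up recorded on the config itself. A minimal
    sketch of the idea, using the eventual public FSMTConfig name (the
    exact pair here is illustrative):

        from transformers import FSMTConfig

        # the source/target pair is stored directly in the model config,
        # so downstream code can recover it without extra arguments
        config = FSMTConfig(langs=["en", "ru"])
        print(config.langs)  # ['en', 'ru']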
    
    * workaround for localizing model path
    
    * doc clean up as in https://github.com/huggingface/transformers/pull/6956
    
    * style
    
    * back out debug mode
    
    * document: run_eval.py --num_beams 10
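
    For context, run_eval.py ultimately drives generate(), and beam size 10
    is what the docs recommend for these models. A hedged Python equivalent
    of the documented invocation (model name per the final facebook/ org):

        from transformers import FSMTForConditionalGeneration, FSMTTokenizer

        mname = "facebook/wmt19-en-ru"
        tokenizer = FSMTTokenizer.from_pretrained(mname)
        model = FSMTForConditionalGeneration.from_pretrained(mname)

        batch = tokenizer(["Machine learning is great!"], return_tensors="pt")
        # num_beams=10 mirrors the documented run_eval.py setting
        out = model.generate(**batch, num_beams=10)
        print(tokenizer.decode(out[0], skip_special_tokens=True))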
    
    * remove unneeded constant
    
    * typo
    
    * re-use bart's Attention
    
    * re-use EncoderLayer, DecoderLayer from bart
    
    * refactor
    
    * send to cuda and fp16
    
    * cleanup
    
    * revert (moved to another PR)
    
    * better error message
    
    * document run_eval --num_beams
    
    * solve the problem of the tokenizer finding the right files when the model is local
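
    The fix hinges on declaring every vocab file FSMT needs, so that
    from_pretrained() can resolve them inside a local directory. A sketch
    of the mapping (treat the exact filenames as assumptions):

        # vocab_files_names tells the tokenizer which files to look for in a
        # local model dir; FSMT needs one vocab per language side plus merges
        VOCAB_FILES_NAMES = {
            "src_vocab_file": "vocab-src.json",
            "tgt_vocab_file": "vocab-tgt.json",
            "merges_file": "merges.txt",
        }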
    
    * polish, remove hardcoded config
    
    * add a note that the file is autogenerated to avoid losing changes
    
    * prep for org change, remove unneeded code
    
    * switch to model4.pt, update scores
    
    * s/python/bash/
    
    * missing init (but doesn't impact the finetuned model)
    
    * cleanup
    
    * major refactor (reuse-bart)
    
    * new model, new expected weights
    
    * cleanup
    
    * cleanup
    
    * full link
    
    * fix model type
    
    * merge porting notes
    
    * style
    
    * cleanup
    
    * have to create a DecoderConfig object to handle vocab_size properly
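
    FSMT carries two vocab sizes (source and target), while generation
    utilities expect a single config.vocab_size; a small nested config for
    the decoder side bridges that gap. A hedged sketch of the shape:

        from transformers import PretrainedConfig

        class DecoderConfig(PretrainedConfig):
            """Nested, non-public config: exposes the target vocab as
            vocab_size, which is what the generation code reads."""

            model_type = "fsmt_decoder"

            def __init__(self, vocab_size=0, bos_token_id=0):
                super().__init__()
                self.vocab_size = vocab_size
                self.bos_token_id = bos_token_id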
    
    * doc fix
    
    * add note (not a public class)
    
    * parametrize
    
    * add bleu scores integration tests
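
    The tests score generated translations against references with
    sacrebleu. A minimal sketch of the scoring step (corpus_bleu is the
    real sacrebleu API; the sentences and threshold are made up):

        import sacrebleu

        hypotheses = ["The cat sits on the mat."]
        references = [["The cat sat on the mat."]]  # one stream per reference set

        bleu = sacrebleu.corpus_bleu(hypotheses, references)
        assert bleu.score > 25, f"BLEU too low: {bleu.score}"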
    
    * skip test if sacrebleu is not installed
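
    One way to express the guard with stock pytest (the repo may wrap this
    in its own require_* decorator instead):

        import pytest

        # skips the test module cleanly when the optional dep is missing
        sacrebleu = pytest.importorskip("sacrebleu")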
    
    * cache heavy models/tokenizers
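
    Re-loading a wmt19 checkpoint per test is expensive, so one load is
    shared across tests. A hedged sketch using functools.lru_cache:

        from functools import lru_cache

        from transformers import FSMTForConditionalGeneration, FSMTTokenizer

        @lru_cache(maxsize=None)
        def get_model_and_tokenizer(mname):
            # each heavy checkpoint is loaded at most once per test session
            return (
                FSMTForConditionalGeneration.from_pretrained(mname),
                FSMTTokenizer.from_pretrained(mname),
            )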
    
    * some tweaks
    
    * remove tokens that aren't used
    
    * more purging
    
    * simplify code
    
    * switch to using decoder_start_token_id
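
    Rather than hardcoding which token seeds the decoder, the model now
    reads it from the config, which generate() consults automatically. A
    small sketch (the model name is per the final facebook/ org):

        from transformers import FSMTConfig

        config = FSMTConfig.from_pretrained("facebook/wmt19-en-ru")
        # generate() starts decoding from this id; no hardcoded constants
        print(config.decoder_start_token_id)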
    
    * add doc
    
    * Revert "major refactor (reuse-bart)"
    
    This reverts commit 226dad15ca6a9ef4e26178526e878e8fc5c85874.
    
    * decouple from bart
    
    * remove unused code #1
    
    * remove unused code #2
    
    * remove unused code #3
    
    * update instructions
    
    * clean up
    
    * move bleu eval to examples
    
    * check import only once
    
    * move data+gen script into files
    
    * reuse via import
    
    * take less space
    
    * add prepare_seq2seq_batch (auto-tested)
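
    prepare_seq2seq_batch builds model-ready inputs and labels in one call
    (this was the tokenizer API of that era; it has since been superseded).
    A hedged usage sketch:

        from transformers import FSMTTokenizer

        tokenizer = FSMTTokenizer.from_pretrained("facebook/wmt19-en-ru")

        # encodes source texts and, when given, target texts as labels
        batch = tokenizer.prepare_seq2seq_batch(
            src_texts=["Machine learning is great!"],
            tgt_texts=["Машинное обучение - это здорово!"],
            return_tensors="pt",
        )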
    
    * cleanup
    
    * recode test to use json instead of yaml
    
    * ignore keys not needed
    
    * use the new -y flag with transformers-cli upload
    
    * [xlm tok] config dict: fix str into int to match definition (#7034)
    
    * [s2s] --eval_max_generate_length (#7018)
    
    * Fix CI with change of name of nlp (#7054)
    
    * nlp -> datasets
    
    * More nlp -> datasets
    
    * Woopsie
    
    * More nlp -> datasets
    
    * One last
    
    * extending to support allen_nlp wmt models
    
    - allow a specific checkpoint file to be passed (see the sketch after this list)
    - more arg settings
    - scripts for allen_nlp models
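
    A hedged sketch of the conversion-script arguments this adds (flag
    names as in the eventual conversion script; treat the details as
    assumptions):

        import argparse

        parser = argparse.ArgumentParser()
        # point at one specific fairseq checkpoint file (e.g. model4.pt)
        # instead of assuming a fixed filename inside the dump directory
        parser.add_argument("--fsmt_checkpoint_path", type=str, required=True)
        parser.add_argument("--pytorch_dump_folder_path", type=str, required=True)
        args = parser.parse_args()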
    
    * sync with changes
    
    * s/fsmt-wmt/wmt/ in model names
    
    * s/fsmt-wmt/wmt/ in model names (p2)
    
    * s/fsmt-wmt/wmt/ in model names (p3)
    
    * switch to a better checkpoint
    
    * typo
    
    * make non-optional args truly non-optional - adjust tests where possible or skip when there is no other choice
    
    * consistency
    
    * style
    
    * adjust header
    
    * cards moved (model rename)
    
    * use best custom hparams
    
    * update info
    
    * remove old cards
    
    * cleanup
    
    * s/stas/facebook/
    
    * update scores
    
    * s/allen_nlp/allenai/
    
    * url maps aren't needed
    
    * typo
    
    * move all the doc / build /eval generators to their own scripts
    
    * cleanup
    
    * Apply suggestions from code review
    Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
    
    * Apply suggestions from code review
    Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
    
    * fix indent
    
    * duplicated line
    
    * style
    
    * use the correct add_start_docstrings
    
    * oops
    
    * resizing can't be done with the core approach, due to the 2 vocab dicts
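
    The core resize_token_embeddings() assumes one shared vocabulary, but
    FSMT keeps separate source and target vocabs (encoder embeddings vs.
    decoder embeddings/output projection). A sketch of the mismatch:

        from transformers import FSMTConfig

        config = FSMTConfig.from_pretrained("facebook/wmt19-en-ru")
        # two independent sizes - the single-vocab resize helper in the
        # base class has no way to know which one to grow
        print(config.src_vocab_size, config.tgt_vocab_size)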
    
    * check that the arg is a list
    
    * style
    
    * style
    Co-authored-by: default avatarSam Shleifer <sshleifer@gmail.com>
    Co-authored-by: default avatarSylvain Gugger <35901082+sgugger@users.noreply.github.com>
    Co-authored-by: default avatarLysandre Debut <lysandre@huggingface.co>