    Add Nemotron HF Support (#31699) · 6a03942d
    Ao Tang authored
    * Add nemotron support
    
    * fix inference
    
    * add unit test
    
    * add layernorm1p as a class to avoid meta device mismatch
    
    * test fixed
    
    * Add copied_from statements
    
    * remove pretraining_tp args
    
    * remove nemotronlayernorm
    
    * force LN computation done in FP32
    
    * remove nemotrontokenizer and use llamatokenizer
    
    * license update
    
    * add option for kv_channels for minitron8b
    
    * remove assert
    
    * o_proj fixed
    
    * o_proj reshape
    
    * add gated_proj option
    
    * typo
    
    * remove todos
    
    * fix broken test after merging latest main
    
    * remove nezha/nat after merging main
    
    * change default config to 15b model
    
    * add nemo conversion script
    
    * rename conversion script
    
    * remove gate_proj option
    
    * pr comment resolved
    
    * fix unit test
    
    * rename kv_channels to head_dim
    
    * resolve PR issue
    
    * add nemotron md
    
    * fix broken tests
    
    * refactor rope for nemotron
    
    * test fix
    
    * remove linearscaling
    
    * whitespace and import
    
    * fix some copied-from
    
    * code style fix
    
    * reformatted
    
    * add position_embedding to nemotronattention
    
    * rope refactor to only use config, copied-from fix
    
    * format
    
    * Run make fix-copies
    
    * nemotron md with autodoc
    
    * doc fix
    
    * fix order
    
    * pass check_config_docstrings.py
    
    * fix config_attributes
    
    * remove all llama BC related code
    
    * Use PreTrainedTokenizerFast
    
    * ruff check examples
    
    * conversion script update
    
    * add nemotron to toctree
_toctree.yml 26.4 KB