• Extend save_pretrained to offloaded models (#27412) · ff689f57
    Benjamin Badger authored
    
    
    * added hidden subset
    
    * debugged hidden subset contrastive search
    
    * added contrastive search compression
    
    * debugged compressed contrastive search
    
    * memory reduction for contrastive search
    
    * debugged mem red
    
    * added low memory option feature
    
    * debugged mem optimization output stack
    
    * debugged mem optimization output stack
    
    * debugged low mem
    
    * added low mem cache
    
    * fixed 2047 tensor view
    
    * debugged 2042 past key val inputs
    
    * reformatted tensors
    
    * changed low mem output
    
    * final clean
    
    * removed subset hidden csearch
    
    * fixed hidden device
    
    * fixed hidden device
    
    * changed compressor dtype
    
    * removed hstate compression
    
    * integrated csearch in generate
    
    * test csearch integration into generation
    
    exit()
    
    * fixed csearch kwarg integration with generation
    
    * final wrap and added doc
    
    * Update src/transformers/generation/utils.py
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * Update src/transformers/generation/utils.py
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * Update src/transformers/generation/utils.py
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * added debug print
    
    * direct hstate cat
    
    * direct hstate cat
    
    * direct hstate cat debug
    
    * direct hstate cat debug
    
    * expanded full hidden state stack
    
    * expanded full hidden state stack
    
    * matched dims for hstates
    
    * matched dims for hstates
    
    * logits fix
    
    * equality test
    
    * equality hidden debug
    
    * debug
    
    * added prints for debug
    
    * added prints for debug
    
    * equality check
    
    * switched squeeze dim
    
    * input format debug
    
    * tracing top_k_ids
    
    * removed trace
    
    * added test context
    
    * added jitter
    
    * added jitter
    
    * added jitter
    
    * returned state
    
    * rebuilt past key value reconstruction
    
    * debugged
    
    * cleaned traces
    
    * added selection for pkv
    
    * changed output to dict
    
    * cleaned
    
    * cleaned
    
    * cleaned up contrastive search test
    
    * moved low_memory kwarg
    
    * debugged
    
    * changed low mem test batch size to 1
    
    * removed output
    
    * debugged test input shape
    
    * reformatted csearch test
    
    * added trace
    
    * removed unsqueeze on final forward pass
    
    * replaced unsqueeze with view
    
    * removed traces
    
    * cleaned
    
    * debugged model kwargs
    
    * removed special models from test
    
    * ran make quality
    
    * Update src/transformers/generation/configuration_utils.py
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * Update src/transformers/generation/configuration_utils.py
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    
    * refactored
    
    * refactored
    
    * refactored
    
    * make fixup
    
    * renamed flag sequential
    
    * renamed flag sequential
    
    * iterative onloading
    
    * black style and test utils
    
    * added traces for integrated test
    
    * debugged
    
    * added traces
    
    * make style
    
    * removed traces, make style
    
    * included suggestions and added test
    
    * debugged test
    
    * added offload module check and make style
    
    * is_accelerate_available and make style
    
    * added test decorator
    
    * changed test model and config spec
    
    * added offload condition
    
    * added lazy loading for each shard
    
    * debugged
    
    * modified sharding
    
    * debugged
    
    * added traces
    
    * removed safe serialization
    
    * no index overload
    
    * trace on safe save ptrs
    
    * added ptr condition
    
    * debugged
    
    * debugged ptr
    
    * moved module map init
    
    * remake shard only for offloaded modules
    
    * refactored
    
    * debugged
    
    * refactored
    
    * debugged
    
    * cleaned and make style
    
    * cleaned and make style
    
    * added trace
    
    * sparse module map
    
    * debugged
    
    * removed module map conditional
    
    * refactored
    
    * debug
    
    * debugged
    
    * added traces
    
    * added shard mem trace
    
    * added shard mem trace
    
    * removed underlying storage check
    
    * refactored
    
    * memory leak removal and make style
    
    * cleaned
    
    * swapped test decs and make style
    
    * added mem checks and make style
    
    * added free mem warning
    
    * implemented some suggestions
    
    * moved onloading to accelerate
    
    * refactored for accelerate integration
    
    * cleaned test
    
    * make style
    
    * debugged offload map name
    
    * cleaned and make style
    
    * replaced meta device check for sharding
    
    * cleaned and make style
    
    * implemented some suggestions
    
    * more suggestions
    
    * update warning
    Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
    
    * more suggestions
    
    * make style
    
    * new make style
    
    * Update src/transformers/modeling_utils.py
    Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
    
    * Update src/transformers/modeling_utils.py
    Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
    
    * Update src/transformers/modeling_utils.py
    Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
    
    * Update src/transformers/modeling_utils.py
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
    
    ---------
    Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
    Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
    Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
test_modeling_utils.py 99.9 KB