    Improve softmax ONNX export tests (#370) · a0f44354
    Neta Zmora authored
    
    
    * Add dynamically shaped input mask in test_export_softmax
    * Fix test_softmax_mask_fn: use the environment variable `NVTE_ONNX_KVCACHE_MAX_SEQ_LEN` to control whether the test uses the default mask generation function or dynamic TRILU mask slicing.
    * Change the core_attention ONNX export test: use "no_mask" as the attn mask type when testing `te.attention.DotProductAttention` without masking.
    * Use ORT CUDA backend by default.
    Signed-off-by: Neta Zmora <nzmora@nvidia.com>
test_onnx_export.py 62.1 KB
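
For readers unfamiliar with the mask switch described in the commit message, the following is a minimal sketch of how an environment variable could select between a default mask function and slicing a precomputed triangular mask. Only the variable name `NVTE_ONNX_KVCACHE_MAX_SEQ_LEN` comes from the commit; the helper names (`causal_mask`, `sliced_trilu_mask`, `select_mask`), the list-of-lists mask representation, and the assumption that the variable holds a maximum sequence length are hypothetical and are not taken from test_onnx_export.py.

```python
import os

# Variable name taken from the commit message; everything else is illustrative.
ENV_VAR = "NVTE_ONNX_KVCACHE_MAX_SEQ_LEN"


def causal_mask(seq_q, seq_k):
    """Hypothetical default path: True marks key positions after the query position."""
    return [[k > q for k in range(seq_k)] for q in range(seq_q)]


def sliced_trilu_mask(max_seq_len, seq_q, seq_k):
    """Hypothetical dynamic path: build one max-size upper-triangular mask,
    then slice its top-left corner down to the actual input shape."""
    full = [[k > q for k in range(max_seq_len)] for q in range(max_seq_len)]
    return [row[:seq_k] for row in full[:seq_q]]


def select_mask(seq_q, seq_k, env=None):
    """Pick the mask path based on the environment variable (assumed semantics)."""
    env = os.environ if env is None else env
    raw = env.get(ENV_VAR)
    if raw is None:
        # Variable unset: fall back to the default mask generation function.
        return "default", causal_mask(seq_q, seq_k)
    max_seq_len = int(raw)
    if max(seq_q, seq_k) > max_seq_len:
        raise ValueError("input sequence length exceeds the configured maximum")
    return "trilu_slice", sliced_trilu_mask(max_seq_len, seq_q, seq_k)
```

Passing `env` explicitly keeps the sketch easy to exercise without touching the real process environment; a test could equally set and restore the variable in `os.environ`.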