    Standalone softmax kernel (#284) · 15c89e81
    Anthony Chang authored
    * initial stub for standalone softmax
    
    * start device_softmax_mk_to_mk as a wrapper to device_reduce_mk_to_m
    
    * host softmax validates (a host reference sketch follows this list)
    
    * compiles; beta scaling still to be implemented
    
    * use NaN trick to efficiently ignore OOB values during the sum of exponentials (see the hedged sketch after this list)
    
    * freeload device_reduce's utility functions
    
    * clean up interface
    
    * add prior value (beta scaling); see the formula sketch after this list
    
    * remove restriction related to perf considerations
    
    * apply clang-format
    
    * clean; disable diagnostics
    
    * resolve conflicts
    
    * add exp wrapper
    
    * honor HostTensorDesc interface; allow implicit cast from different vector<T> type
    
    * test softmax for fp16/fp32
    
    * update readme
    
    * amend the NaN trick commit
    
    * remove redundant param added during development
    
    * format
    
    * replace ScalarDataType with AccDataType
    
    * separate out test programs by precision type
    
    * move softmax sample code to its own folder
    
    * format
    
    * keep up with recent changes in reduction API
    
    * remove extra header
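
The host-side validation mentioned above is shown here only as a minimal sketch: a plain M x K row-major softmax over the last dimension with the usual max-subtraction for numerical stability. The function name, template parameters, and layout are illustrative assumptions, not the repository's host tensor utilities.

```cpp
// Hedged host reference for an M x K row-major softmax over the K dimension.
// Names and layout are assumptions for illustration only.
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

template <typename T, typename AccT = float>
void host_softmax_mk(const std::vector<T>& in, std::vector<T>& out,
                     std::size_t M, std::size_t K)
{
    for(std::size_t m = 0; m < M; ++m)
    {
        const T* row_in = in.data() + m * K;
        T* row_out      = out.data() + m * K;

        // Subtract the row max before exponentiating for numerical stability.
        AccT row_max = static_cast<AccT>(*std::max_element(row_in, row_in + K));

        AccT sum = 0;
        for(std::size_t k = 0; k < K; ++k)
            sum += std::exp(static_cast<AccT>(row_in[k]) - row_max);

        for(std::size_t k = 0; k < K; ++k)
            row_out[k] = static_cast<T>(
                std::exp(static_cast<AccT>(row_in[k]) - row_max) / sum);
    }
}
```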
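The commit message does not spell out the NaN trick. One plausible reading, sketched below purely as an assumption, is that out-of-bounds lanes are filled with NaN on load and then dropped via NaN self-inequality before accumulation, so padding lanes contribute exactly zero to the sum of exponentials.

```cpp
// Hedged sketch (assumption, not the kernel's actual code): OOB lanes hold
// NaN; since NaN compares unequal to itself, their exp() result is replaced
// by zero before accumulation. Note this relies on IEEE NaN semantics and
// would break under fast-math.
#include <cmath>

template <typename AccT>
AccT exp_sum_ignoring_oob(const AccT* vals, int n_loaded, AccT row_max)
{
    AccT sum = 0;
    for(int i = 0; i < n_loaded; ++i)
    {
        AccT e = std::exp(vals[i] - row_max); // NaN in -> NaN out for OOB lanes
        sum += (e == e) ? e : AccT(0);        // treat NaN lanes as contributing 0
    }
    return sum;
}
```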
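"Prior value (beta scaling)" is read here with the common BLAS-like convention y = alpha * softmax(x) + beta * y, i.e. the existing output is scaled and blended in rather than overwritten. The helper below is a sketch under that assumption, not the device op's interface.

```cpp
// Assumed convention for blending a prior output:
//   y = alpha * softmax(x) + beta * y_prior
#include <cstddef>

template <typename T, typename AccT = float>
void apply_beta_scaling(T* y, const AccT* softmax_x,
                        AccT alpha, AccT beta, std::size_t n)
{
    for(std::size_t i = 0; i < n; ++i)
        y[i] = static_cast<T>(alpha * softmax_x[i] +
                              beta * static_cast<AccT>(y[i]));
}
```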