1. 27 Nov, 2024 1 commit
  2. 31 May, 2023 1 commit
  3. 12 Aug, 2022 1 commit
  4. 11 Aug, 2022 1 commit
    • Po Yen Chen's avatar
      Add examples for GEMM + AddAddFastGelu (data type: int8, bf16, fp32) (#340) · 68b61504
      Po Yen Chen authored
      * Add always_false<> util to delay symbol resolution
      
      * Use always_false<> to prevent trying instantiate unwanted method
      
      * Add new specializations of AddAddFastGelu::operator() method
      
      * Add GEMM + AddAddFastGelu examples for data types: int8, bf16, fp32
      
      * Use floating point literal to simplify code
      
      * Remove unnecessary capture in lambda expressions
      
      * Extract fast GeLU calculation as standalone method
      
      * Mark methods as 'constexpr'
      
      * Add constraint for HostTensorDescriptor templated ctors
      
      * Simplify HostTensorDescriptor ctor calls
      
      * Add C++23 std::size_t literal suffix
      
      * Use _uz suffix to shorten example code
      
      * Remove unnecessary conversion to std::array<>
      
      * Re-order include directives
      
      * Remove C-style casting by literal suffix
      
      * Remove unnecessary statements in main()
      
      * Remove unused type parameter of always_false<>
      
      * Remove unused include directive
      
      * Exit main() by returning meaningful value
      
      * Use 'if constexpr' to switch example flow
      
      * Use std::is_same_v<> to shorten example code
      
      * Add 'inline' specifier to literal functions
      
      * Unify output methods in example
      
      * Move common codes into .inc file
      
      * Add type check in type_convert<>()
      
      * Add type_convert<float>() before computation
      
      * Merge AddAddFastGelu method specializations
      
      * Remove always_false<>
      
      * Add constraint to AddAddFastGelu::operator() parameter types
      68b61504