1. 14 Dec, 2020 1 commit
    • Paul Fultz II's avatar
      Use dnnl for cpu backend (#688) · 406afeb8
      Paul Fultz II authored
      
      
      * Add flag to enable cpu backend
      
      * Make buffers shared
      
      * Enable optimizations
      
      * Add onednn
      
      * Formatting
      
      * Formatting
      
      * Add dnnl header
      
      * Formatting
      
      * Rewrite rnn first
      
      * Formatting
      
      * Call reference implementation
      
      * Formatting
      
      * Make literal data shared
      
      * Formatting
      
      * Add convolution
      
      * Formatting
      
      * Compensate for dilation
      
      * Formatting
      
      * Use name/make_op instead
      
      * Formatting
      
      * Rename gemm header
      
      * Formatting
      
      * Add dnnl convolution/gemm operators
      
      * Formatting
      
      * Add eliminate_contiguous
      
      * Add faster pointwise operators
      
      * Formatting
      
      * Formatting
      
      * Formatting
      
      * Add dnnl op class
      
      * Formatting
      
      * Add add op
      
      * Formatting
      
      * Add concat operator
      
      * Formatting
      
      * Add more ops
      
      * Create descriptor during finalization
      
      * Formatting
      
      * Dont rewrite pooling
      
      * Enable memory coloring
      
      * Formatting
      
      * Add output aliases
      
      * Formatting
      
      * Fix errors
      
      * Formatting
      
      * Convert literals
      
      * Add missing file
      
      * Remove batch_norm
      
      * Formatting
      
      * Use strides
      
      * Formatting
      
      * Add some debug checks
      
      * Formatting
      
      * Fix big in adjusting shape for gemm
      
      * Formatting
      
      * Fix fallback dot operator
      
      * Zero initialize buffers
      
      * Add suport for group convolutions
      
      * Formatting
      
      * Make adjust allocation target independent
      
      * Formatting
      
      * Enable adjust_allocation for gpu/cpu
      
      * Formatting
      
      * Add copy to allocation model
      
      * Formatting
      
      * Add copy operator
      
      * Formatting
      
      * Better handling of output parameters in adjust_allocation
      
      * Formatting
      
      * Build with dnnl
      
      * Make dnnl required
      
      * Fix compile error
      
      * Tidy fixes
      
      * Formatting
      
      * Tidy fixes
      
      * Formatting
      
      * Fix more tidy issues
      
      * Formatting
      
      * Add mul op
      
      * Add mul op
      
      * Set c compiler to clang as well
      
      * Compensate for normalized compute shape
      
      * Formatting
      
      * Fix cppcheck errors
      
      * Formatting
      
      * Add onednn library to hcc
      
      * Guard clang pragmas
      
      * Disable cpu mode for gcc for now
      
      * Leave it enabled it for gcc 7
      
      * Fix cppcheck suppresion
      
      * Fix compile error on gcc 5
      
      * Remove unused code
      Co-authored-by: default avatarShucai Xiao <shucai.xiao@amd.com>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      406afeb8
  2. 16 Nov, 2020 1 commit
    • Shucai Xiao's avatar
      Normalize ops (#667) · 8443ecd1
      Shucai Xiao authored
      
      
      * add a pass to normalize ops
      
      * clang format
      
      * add unit tests
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * add support for slice in the normalize_op function
      
      * clang format
      
      * add operation method api for whether we need to call normalize_op
      
      * clang format
      
      * fix review comments
      
      * clang format
      
      * rename a function namejJ
      
      * clang format
      
      * change compute_shape to normalize_compute_shape for corresponding operators
      
      * clang format
      
      * remove unnecessary code
      
      * fix various issues
      
      * clang format
      
      * add attributes to operators having axis attributes
      
      * clang format
      
      * fixed jenkins build error
      
      * clang format
      
      * fix a bug related to slice
      
      * clang format
      
      * code backup
      
      * clang format
      
      * code backup
      
      * clang format
      
      * rename a file
      
      * fix cppcheck error
      
      * some code refinement
      
      * clang format
      
      * change attributes to enum
      
      * clang format
      
      * refine the enum
      
      * clang format
      
      * remove unnecessary code
      
      * add unit tests for more code coverage and fixed a bug
      
      * clang format
      
      * remove unnecessary changes
      
      * change normalize_axes to normalize
      
      * clang format
      
      * revert back the changes in broadcast.hpp
      
      * rename normalize_axes to normalize
      
      * fix review comments
      
      * clang format
      
      * Add flag to enable cpu backend
      
      * Make buffers shared
      
      * Enable optimizations
      
      * Formatting
      
      * Try to avoid ambiguous assign in value class
      
      * fixed a build error
      
      * clang format
      
      * add the normalize_ops pass to the ref target
      
      * refactor program to module to normalize_ops pass
      Co-authored-by: default avatarPaul <pfultz2@yahoo.com>
      Co-authored-by: default avatarmvermeulen <5479696+mvermeulen@users.noreply.github.com>
      8443ecd1
  3. 11 Nov, 2020 1 commit
  4. 04 Nov, 2020 1 commit
    • Paul Fultz II's avatar
      Split cpu and reference implementation (#671) · 500d9441
      Paul Fultz II authored
      
      
      * Add all_targets cmake target
      
      * Rename target
      
      * Add ref target
      
      * Rename tests
      
      * Refactor compiler target
      
      * Formatting
      
      * Verify for every target
      
      * Formatting
      
      * Add verify test suite
      
      * Formatting
      
      * Add initial test programs
      
      * Formatting
      
      * Add rnn tests
      
      * Formatting
      
      * Validate gpu
      
      * Formatting
      
      * Remove old gpu tests
      
      * Fix gpu tests
      
      * Fix ref error
      
      * Fix tidy issues
      
      * Formatting
      
      * Tidy fixes
      
      * Fix header in python api
      
      * Rename to ref
      
      * Use ref in verify_onnx
      
      * Fix tidy issue
      
      * Build with verbose on
      
      * Fix typo
      
      * Remove verbose
      
      * rename some cpu prefix to ref
      Co-authored-by: default avatarShucai Xiao <Shucai.Xiao@amd.com>
      500d9441