- 22 Jun, 2022 1 commit
-
-
Ted Themistokleous authored
Updated each source file in the repo with the existing license.
-
- 12 Apr, 2022 1 commit
-
-
Paul Fultz II authored
out-of-bounds access when generate uses nonpacked tensors and add some additional asserts for gpu memory.
-
- 05 Aug, 2021 1 commit
-
-
Paul Fultz II authored
* Add method to compile pointwise * Formatting * Add lambda * Add semicolon * Rename variable * Add driver to run jit kernels * Formatting * Add context * Formatting * Make seperate driver folder * Add more general gpu driver * Formatting * Print out wll time * Formatting * Run multiple times and skip first run * Formatting * Seperate time_op * Run an op for comparison * Formatting * Add debug asserts * Formatting * Change parameer name * Formatting * Fix argument order * Formatting * Add preloading * Formatting * Allow a different data type * Formatting * Pipeline transformations * Formatting * Add vectorization * Formatting * Reduce dims * Formatting * Compile with launch params as constant * Formatting * Make sure buffer can be vecotrized * Formatting * Enable vectorization and preloading * Formatting * Add print header * Formatting * Avoid allocating to large of LDS * Formatting * Add some vec functions to a seperate header * Formatting * Add stride loops * Formatting * Improve the transform pipeline * Formatting * Add const * Fix shape check * Formatting * Just check stride axis is zero * Remove extra finc_vector_axis overload * Simplify some mroe functions * Formatting * Remove some more extra functions * Formatting * Simplify more decltypes * Add another const * Fix test * Get buffer pointer different for older compilers Co-authored-by:
Shucai Xiao <shucai@gmail.com> Co-authored-by:
Chris Austen <causten@users.noreply.github.com>
-
- 05 Apr, 2021 1 commit
-
-
Shucai Xiao authored
* code cleanup * clang format * backup code * clang format * remove unnecessary code * clang format * add module print function * code backup * refine the module::print function * refine the module:to_value() function * code backup * backup code changes * code backup * remove to_value and from_value function from the module class * rename a function * rename the if operator * refine the if operator * refine the print function of module and program * code backup * code backup * fix a build warning * fix overload of compute_shape function * code backup * fix unit test error * fix cppcheck error * fix the issue related to the overload of compute_shape * fix review comments * fix cppcheck error * change the return name of if_op to be if * clang format * fix two unit tests * clang format * rename variables * clang format * remove the unused compute_op function * clang format * add lowering of if operator and compute_op function * clang format * add parsing if operator in onnx file * clang format * fix clang tidy format * clang format * add the gpu implementation of the if operator * enhance the validate function and uncomment a unit test * clang format * remove unnecessary code * add sub_module processing in ref passes * clang format * clang format * fix a hang issue related to the valid function * fix an issue in replace_refs * clang format * fix review comments * clang format * fix cppcheck error * clang format * add a unit test for more code coverage * clang format * fix review comments and add test for more code coverage * clang format * fix cppcheck error * clang format * fix cppcheck error * fix a cppcheck error * clang format * backup code * clang format * fix cppcheck error * clang format * some code refinement * clang format * code backup to handle submodules in module compilation * clang format * code backup * clang format * code backup * clang format * fix a bug related to literal id * fix a bug in gpu execution * change the way of compiling a graph * clang format * backup more changes * clang format * refine pass log information * remove unnecessary code * clang format * temp changes backup * clang format * add module name prefix to scratch memory id in hip_memory_allocation * clang format * change to copy the cond input by inserting a copy instruction * clang format * change to use the if output argument as the submodule output so can remove a gpu_copy * clang format * consider submodule in some compile passes * clang format * fix review comments * clang format * fix issues related to scratch memory * clang format * remove unnecessary code * fix cppcheck error * clang format * reslove the implicit dependencies issue related to submodule * clang format * fix cppcheck error * clang format * backup temp changes * clang format * fixed an bug in the has_instruction function * clang format * fix the return value of the gpu implementation of the if operator * fix a bug in the compute_shape function in the gpu implementation * add an if onnx unit test * clang format * add more unit tests * clang format * tmp code backup * clang format * fix a sync problem related to copy cond argument from gpu to cpu * clang format * change the compile offload copy flag setting * clang format * enable copy from cpu to be able to do synchronous copy * clang format * add more unit tests * add more unit tests * add more ref unit tests * clang format * fixed a bug error * tmp code backup * clang format * fixed an onnx verify unit test * add more unit tests * clang format * reverse a change * fix cppcheck error * fix cppcheck error * fix to print all instructions in program execution * clang format * fix bugs related to memory coloring and offload copy to be true * clang format * remove unnecessary include header file * sort test cases in ref_cpu_ops alphabetically * clang format * add a flag to disable cpu target in verification test * change the way to disable some tests * clang format * disable verify unit test of the if operators * add a function call to have more code coverage * fix a build error * fix review comments * fix review comments * clang format * add a api gpu unit test for more code coverage * clang format * change to use instruction.size() as node index * move the calc_implicit_deps function to module class as a member function * clang format * move the offload_copy flag setting to lowering * clang format * assign the module_eval lambda function to a variable to simplify code * clang format * move the compute function from ref/gpu implementation to the main if operator * clang format * fix cpp check error * add a unit test for more code coverage * clang format * add unit test to calculate implicit deps * add a python unit test * clang format * refine a unit test to have more code coverage * clang format * chang the way of wrap up arguments for sub modules * clang format * fix some build errors * code cleanup * refine unit tests to have more code coverage * clang format * refine unit test to have more code coverage * code backup * clang format * add memory coloring test * refine memory coloring unit test * clang format * remove an unnecessary line * remove an unused line * remove an unnecessary parameter in the lambda function * clang format * refine a unit test * remove an unnecessary line * refine unit tests to have more code coverage * clang format * combine two lines * add one more unit test for more code coverage * clang format * add one more unit test * clang format * fix review comments * refine a print out information * fix review comments * clang format * change the sync copy to using a gpu device sync * clang format * remove unnecessary code Co-authored-by:mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 08 Feb, 2021 1 commit
-
-
Paul Fultz II authored
* Add eliminate_data_type pass * Formatting * Auto convert quant ops * Formatting * Flip the order of decompose * Compute max size differently * Formatting * Clamp values in convert * Formatting * Fix loss of precision in reduce * Formatting * Fix bugs in reduction * Fix accumulator type in reference softmax implementation * Formatting * Update convert test * Remove unused variables * Remove unnecessary quant_dot check * Formatting * Add tests * Formatting * Remove unused code * Remove duplicate ops * Remove blaze dependency * Use set since shape::type_t is no hashable on gcc 5 * Formatting Co-authored-by:
Shucai Xiao <shucai@gmail.com> Co-authored-by:
mvermeulen <5479696+mvermeulen@users.noreply.github.com>
-
- 27 Aug, 2020 1 commit
-
-
Shucai Xiao authored
* add bool type * code backup * code backup * clang format * fix build warnings * clang format * add the equal operator * add the equal operator * clang format * remove unnecessary code * refine unit tests * clang format * fix review comments and a bug * clang format * additional changes * clang format * fix cppcheck error * add bool type in c api * fix cppcheck error * fix review comments * fix cppcheck error * fix a build error related to gcc * fix cppcheck error * fix cppcheck error * added the equal operator to register list * add parsing boolean type * clang format * fix bool type issue for python output * clang format * add support for automatic multibroadcast of the equal operator * additional unit tests for more code coverage * clang format * missing an onnx file Co-authored-by:Paul Fultz II <pfultz2@yahoo.com>
-
- 15 Nov, 2019 1 commit
-
-
Paul Fultz II authored
* Add compiler options * Add copy operators * Formatting * Use run_passes in tests * Formatting * Use run_pass in schedule test * Formatting * Add compile_options to get_passes in target * Formatting * Offload copy option * Formatting * Copy using pinned memory * Formatting * Improve performance of gpu copying * Formatting * Dont copy * Formatting * Always make an extra copy * Formatting * Remove unused write op * Add missing include * Remove copy_to_gpu function in python api * Make offload copy disabled by default on C++ * Formatting * Fix tidy issues * Formatting * Fix namespace * Fix python tests * Turn clang format off since its broken * Fix compile error on gcc 5 * Remove commented code
-
- 19 Aug, 2019 2 commits
- 06 Aug, 2019 1 commit
-
-
Shucai Xiao authored
-
- 02 Aug, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 17 Jul, 2019 2 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 09 Jul, 2019 1 commit
-
-
Shucai Xiao authored
-
- 29 Jun, 2019 1 commit
-
-
Shucai Xiao authored
-
- 23 May, 2019 1 commit
-
-
Shucai Xiao authored
-
- 02 May, 2019 1 commit
-
-
Paul authored
-
- 27 Nov, 2018 1 commit
-
-
Paul authored
-
- 14 Nov, 2018 1 commit
-
-
Paul authored
-
- 06 Nov, 2018 4 commits
-
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
Shucai Xiao authored
-
- 31 Oct, 2018 1 commit
-
-
Paul authored
-
- 17 Sep, 2018 1 commit
-
-
Paul authored
-
- 30 Aug, 2018 1 commit
-
-
Paul authored
-
- 28 Aug, 2018 4 commits
- 22 Aug, 2018 2 commits
- 21 Aug, 2018 3 commits
- 20 Aug, 2018 4 commits