"git@developer.sourcefind.cn:gaoqiong/migraphx.git" did not exist on "c3e02b18af53125fb36ff11a7802d850590e7cee"
Commit 41c0487b authored by Shucai Xiao, committed by GitHub

Module build exec (#765)



* code cleanup

* clang format

* backup code

* clang format

* remove unnecessary code

* clang format

* add module print function

* code backup

* refine the module::print function

* refine the module:to_value() function

* code backup

* backup code changes

* code backup

* remove to_value and from_value function from the module class

* rename a function

* rename the if operator

* refine the if operator

* refine the print function of module and program

* code backup

* code backup

* fix a build warning

* fix overload of compute_shape function

* code backup

* fix unit test error

* fix cppcheck error

* fix the issue related to the overload of compute_shape

* fix review comments

* fix cppcheck error

* change the return name of if_op to be if

* clang format

* fix two unit tests

* clang format

* rename variables

* clang format

* remove the unused compute_op function

* clang format

* add lowering of if operator and compute_op function

* clang format

* add parsing of the if operator in onnx files

* clang format

* fix clang tidy format

* clang format

* add the gpu implementation of the if operator

* enhance the validate function and uncomment a unit test

* clang format

* remove unnecessary code

* add sub_module processing in ref passes

* clang format

* clang format

* fix a hang issue related to the valid function

* fix an issue in replace_refs

* clang format

* fix review comments

* clang format

* fix cppcheck error

* clang format

* add a unit test for more code coverage

* clang format

* fix review comments and add test for more code coverage

* clang format

* fix cppcheck error

* clang format

* fix cppcheck error

* fix a cppcheck error

* clang format

* backup code

* clang format

* fix cppcheck error

* clang format

* some code refinement

* clang format

* code backup to handle submodules in module compilation

* clang format

* code backup

* clang format

* code backup

* clang format

* fix a bug related to literal id

* fix a bug in gpu execution

* change the way of compiling a graph

* clang format

* backup more changes

* clang format

* refine pass log information

* remove unnecessary code

* clang format

* temp changes backup

* clang format

* add module name prefix to scratch memory id in hip_memory_allocation

* clang format

* change to copy the cond input by inserting a copy instruction

* clang format

* change to use the if output argument as the submodule output so a gpu_copy can be removed

* clang format

* consider submodule in some compile passes

* clang format

* fix review comments

* clang format

* fix issues related to scratch memory

* clang format

* remove unnecessary code

* fix cppcheck error

* clang format

* resolve the implicit dependencies issue related to submodules

* clang format

* fix cppcheck error

* clang format

* backup temp changes

* clang format

* fix a bug in the has_instruction function

* clang format

* fix the return value of the gpu implementation of the if operator

* fix a bug in the compute_shape function in the gpu implementation

* add an if onnx unit test

* clang format

* add more unit tests

* clang format

* tmp code backup

* clang format

* fix a sync problem related to copying the cond argument from gpu to cpu

* clang format

* change the compile offload copy flag setting

* clang format

* enable the copy from cpu to do a synchronous copy

* clang format

* add more unit tests

* add more unit tests

* add more ref unit tests

* clang format

* fix a bug

* tmp code backup

* clang format

* fix an onnx verify unit test

* add more unit tests

* clang format

* revert a change

* fix cppcheck error

* fix cppcheck error

* fix program execution to print all instructions

* clang format

* fix bugs related to memory coloring when offload copy is true

* clang format

* remove unnecessary include header file

* sort test cases in ref_cpu_ops alphabetically

* clang format

* add a flag to disable cpu target in verification test

* change the way to disable some tests

* clang format

* disable verify unit test of the if operators

* add a function call to have more code coverage

* fix a build error

* fix review comments

* fix review comments

* clang format

* add an api gpu unit test for more code coverage

* clang format

* change to use instruction.size() as node index

* move the calc_implicit_deps function to module class as a member function

* clang format

* move the offload_copy flag setting to lowering

* clang format

* assign the module_eval lambda function to a variable to simplify code

* clang format

* move the compute function from ref/gpu implementation to the main if operator

* clang format

* fix cppcheck error

* add a unit test for more code coverage

* clang format

* add unit test to calculate implicit deps

* add a python unit test

* clang format

* refine a unit test to have more code coverage

* clang format

* change the way of wrapping up arguments for sub modules

* clang format

* fix some build errors

* code cleanup

* refine unit tests to have more code coverage

* clang format

* refine unit test to have more code coverage

* code backup

* clang format

* add memory coloring test

* refine memory coloring unit test

* clang format

* remove an unnecessary line

* remove an unused line

* remove an unnecessary parameter in the lambda function

* clang format

* refine a unit test

* remove an unnecessary line

* refine unit tests to have more code coverage

* clang format

* combine two lines

* add one more unit test for more code coverage

* clang format

* add one more unit test

* clang format

* fix review comments

* refine a printout message

* fix review comments

* clang format

* change the sync copy to use a gpu device sync

* clang format

* remove unnecessary code
Co-authored-by: mvermeulen <5479696+mvermeulen@users.noreply.github.com>
parent 5d601ad1
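
For reference, here is a minimal sketch of what this commit enables, written against the API visible in the diff below (create_module, add_instruction with module arguments, add_return); the module names, shapes, and operators are illustrative, not taken from the commit:

#include <migraphx/instruction.hpp>
#include <migraphx/make_op.hpp>
#include <migraphx/program.hpp>
#include <migraphx/shape.hpp>

int main()
{
    migraphx::program p;
    auto* mm = p.get_main_module();

    migraphx::shape cond_s{migraphx::shape::bool_type, {1}};
    migraphx::shape data_s{migraphx::shape::float_type, {2}};
    auto cond = mm->add_parameter("cond", cond_s);
    auto x    = mm->add_parameter("x", data_s);

    // Each branch lives in its own submodule; both branches must return
    // identically shaped outputs (enforced by if_op::compute_shape).
    auto* then_mod = p.create_module("if_0_if");
    auto tx        = then_mod->add_parameter("x", data_s);
    then_mod->add_return({then_mod->add_instruction(migraphx::make_op("add"), tx, tx)});

    auto* else_mod = p.create_module("if_0_else");
    auto ex        = else_mod->add_parameter("x", data_s);
    else_mod->add_return({else_mod->add_instruction(migraphx::make_op("mul"), ex, ex)});

    // The condition comes first; the remaining inputs are bound to the
    // submodules' parameter names (sorted) when the instruction is evaluated.
    auto ret = mm->add_instruction(migraphx::make_op("if"), {cond, x}, {then_mod, else_mod});
    mm->add_return({ret});
    return 0;
}
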
......@@ -48,7 +48,9 @@ void dead_code_elimination::apply(module& p) const
continue;
assert(bidistance(p, i, last) > 0);
fix([&](auto self, auto leaf) {
assert(p.has_instruction(leaf));
if(not p.has_instruction(leaf))
return;
if(leaf->outputs().empty())
{
std::unordered_set<instruction_ref> args(leaf->inputs().begin(),
......
......@@ -18,9 +18,19 @@ argument generate_argument(shape s, unsigned long seed)
{
argument result;
s.visit_type([&](auto as) {
using type = typename decltype(as)::type;
auto v = generate_tensor_data<type>(s, seed);
result = {s, v};
// we use char type to store bool type internally, so bool_type
// needs special processing to generate data
if(s.type() == shape::bool_type)
{
auto v = generate_tensor_data<bool>(s, seed);
result = {s, v};
}
else
{
using type = typename decltype(as)::type;
auto v = generate_tensor_data<type>(s, seed);
result = {s, v};
}
});
return result;
}
......
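
The special case above is needed because migraphx stores bool tensors as char internally (per the new comment), while std::vector<bool> is bit-packed. A standalone illustration of the mismatch, in plain C++ rather than MIGraphX code:

#include <iostream>
#include <vector>

int main()
{
    std::vector<bool> bits(8, true); // bit-packed specialization: ~1 byte of storage
    std::vector<char> bytes(8, 1);   // byte-backed: 8 addressable elements

    // bytes.data() yields a real char* usable as a raw tensor buffer;
    // std::vector<bool> has no such data() member because its elements
    // are proxy objects rather than addressable bools.
    std::cout << "char-backed: " << bytes.size() << " bytes, "
              << "packed bools: ~" << (bits.size() + 7) / 8 << " byte\n";
    return 0;
}
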
......@@ -30,7 +30,9 @@ constexpr T normalize(unsigned long z)
return half_max - (z % max);
}
template <class T, MIGRAPHX_REQUIRES(not is_signed<T>{} and std::is_integral<T>{})>
template <class T,
MIGRAPHX_REQUIRES(not is_signed<T>{} and std::is_integral<T>{} and
not std::is_same<T, bool>{})>
constexpr T normalize(unsigned long z)
{
const auto max = 1UL << (sizeof(T) * 5);
......
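
The added "not std::is_same<T, bool>{}" clause keeps bool out of this overload: bool passes both of the original constraints, so without the exclusion the shift/modulo arithmetic (meaningless for a 0/1 type) would be selected for bool_type data. A two-line check of that premise, using the std:: equivalents of the traits:

#include <type_traits>

int main()
{
    // bool satisfies both constraints of the unsigned-integral overload,
    // which is why it has to be excluded explicitly.
    static_assert(std::is_integral<bool>{}, "bool is integral");
    static_assert(not std::is_signed<bool>{}, "bool is not signed");
    return 0;
}
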
......@@ -4,6 +4,7 @@
#include <migraphx/functional.hpp>
#include <migraphx/ranges.hpp>
#include <migraphx/instruction.hpp>
#include <migraphx/module.hpp>
#include <migraphx/program.hpp>
#include <migraphx/iterator_for.hpp>
#include <migraphx/type_name.hpp>
......
......@@ -24,6 +24,7 @@ const operation& get_operation(instruction_ref ins);
struct module_impl;
using parameter_map = std::unordered_map<std::string, argument>;
using ins_dep_map = std::unordered_map<instruction_ref, std::unordered_set<instruction_ref>>;
/**
* @brief Stores the instruction stream
......@@ -129,7 +130,7 @@ struct module
void debug_print() const;
void debug_print(instruction_ref ins) const;
void debug_print(instruction_ref ins,
const std::unordered_map<instruction_ref, std::string>& names) const;
std::unordered_map<instruction_ref, std::string>& names) const;
void debug_print(const std::vector<instruction_ref>& inss) const;
std::unordered_map<instruction_ref, std::string> print(
......@@ -149,8 +150,8 @@ struct module
void annotate(std::ostream& os, std::function<void(instruction_ref)> a) const;
std::vector<module_ref> get_sub_modules() const;
module& sort();
ins_dep_map calc_implicit_deps() const;
friend std::ostream& operator<<(std::ostream& os, const module& m);
friend bool operator==(const module& x, const module& y);
......@@ -158,6 +159,10 @@ struct module
private:
void assign(const module& m);
void calc_implicit_deps(const module& smod,
const module& pmod,
instruction_ref ins,
ins_dep_map& deps) const;
std::unique_ptr<module_impl> impl;
};
......
......@@ -9,6 +9,7 @@
#include <migraphx/module.hpp>
#include <cmath>
#include <utility>
#include <set>
namespace migraphx {
inline namespace MIGRAPHX_INLINE_NS {
......@@ -20,7 +21,7 @@ struct if_op
shape compute_shape(const std::vector<shape>& inputs, std::vector<module_ref> mods) const
{
check_shapes{inputs, *this}.has(1).standard();
check_shapes{inputs, *this}.standard();
if(mods.size() != 2)
{
MIGRAPHX_THROW("IF: operator should have two submodules.");
......@@ -36,6 +37,34 @@ struct if_op
return out_shapes0.front();
}
argument compute(
const std::vector<argument>& args,
const std::vector<module_ref>& mods,
const std::function<std::vector<argument>(
module_ref& mdl, const std::unordered_map<std::string, argument>& inputs)>& run) const
{
auto cond = args.front().at<bool>();
module_ref mod = cond ? mods[0] : mods[1];
std::unordered_map<std::string, argument> params;
std::set<std::string> pnames;
for(const auto& smod : mods)
{
auto names = smod->get_parameter_names();
pnames.insert(names.begin(), names.end());
}
assert(pnames.size() < args.size());
std::transform(pnames.begin(),
pnames.end(),
args.begin() + 1,
std::inserter(params, params.end()),
[](auto&& name, auto&& arg) { return std::make_pair(name, arg); });
auto results = run(mod, params);
return results[0];
}
};
} // namespace op
......
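
Inside compute above, the parameter map is built by zipping the sorted parameter names with the arguments that follow the condition. The same std::transform/inserter pattern in a self-contained form (int stands in for migraphx::argument):

#include <algorithm>
#include <iostream>
#include <iterator>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>

int main()
{
    // std::set keeps the names sorted, giving a deterministic pairing
    // with the trailing arguments.
    std::set<std::string> pnames{"x", "y"};
    std::vector<int> args{1 /*cond*/, 10 /*x*/, 20 /*y*/};

    std::unordered_map<std::string, int> params;
    std::transform(pnames.begin(),
                   pnames.end(),
                   args.begin() + 1, // skip the condition argument
                   std::inserter(params, params.end()),
                   [](const std::string& name, int arg) { return std::make_pair(name, arg); });

    std::cout << params.at("x") << " " << params.at("y") << "\n"; // prints: 10 20
    return 0;
}
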
......@@ -7,6 +7,7 @@
#include <memory>
#include <type_traits>
#include <utility>
#include <unordered_map>
#include <migraphx/reflect.hpp>
#include <migraphx/streamutils.hpp>
#include <migraphx/normalize_attributes.hpp>
......@@ -237,6 +238,33 @@ argument compute_op(const T& x, const shape& output_shape, const std::vector<arg
return compute_op(rank<2>{}, x, output_shape, input);
}
template <class T, class F>
auto compute_op(rank<1>,
const T& x,
const std::vector<argument>& inputs,
const std::vector<module_ref>& module_args,
F f) -> decltype(x.compute(inputs, module_args, f))
{
return x.compute(inputs, module_args, f);
}
template <class T, class F>
argument
compute_op(rank<0>, const T& x, const std::vector<argument>&, const std::vector<module_ref>&, F)
{
std::string name = x.name();
MIGRAPHX_THROW("Not computable: " + name);
}
template <class T, class F>
argument compute_op(const T& x,
const std::vector<argument>& inputs,
const std::vector<module_ref>& module_args,
F f)
{
return compute_op(rank<1>{}, x, inputs, module_args, f);
}
template <class T>
auto is_context_free_op(rank<1>,
const T& x,
......@@ -350,9 +378,12 @@ void from_value_op(T& x, const value& v)
* shape compute_shape(const std::vector<shape>& inputs,const std::vector<module_ref>&
* mod_args) const; argument compute(context& ctx,const shape& output,const std::vector<argument>&
* input) const; argument compute(const shape& output,const std::vector<argument>& input)
* const; value to_value() const; void from_value(const value& v) ; value attributes() const;
* friend std::ostream & operator<<(std::ostream & os,const operation & op) ;
* friend bool operator==(const operation & x,const operation & y) ;
* const; argument compute(const std::vector<argument>& input,const std::vector<module_ref>&
* module_args,std::function<std::vector<argument>(module_ref& mdl, const
* std::unordered_map<std::string, argument>& inputs)> run) const; value to_value() const; void
* from_value(const value& v) ; value attributes() const; friend std::ostream &
* operator<<(std::ostream & os,const operation & op) ; friend bool operator==(const operation &
* x,const operation & y) ;
* };
*
*/
......@@ -481,6 +512,16 @@ struct operation
return (*this).private_detail_te_get_handle().compute(output, input);
}
argument compute(
const std::vector<argument>& input,
const std::vector<module_ref>& module_args,
std::function<std::vector<argument>(
module_ref& mdl, const std::unordered_map<std::string, argument>& inputs)> run) const
{
assert((*this).private_detail_te_handle_mem_var);
return (*this).private_detail_te_get_handle().compute(input, module_args, std::move(run));
}
value to_value() const
{
assert((*this).private_detail_te_handle_mem_var);
......@@ -537,11 +578,17 @@ struct operation
virtual argument
compute(context& ctx, const shape& output, const std::vector<argument>& input) const = 0;
virtual argument compute(const shape& output, const std::vector<argument>& input) const = 0;
virtual value to_value() const = 0;
virtual void from_value(const value& v) = 0;
virtual value attributes() const = 0;
virtual std::ostream& operator_shift_left(std::ostream& os) const = 0;
virtual bool operator==(const operation& y) const = 0;
virtual argument
compute(const std::vector<argument>& input,
const std::vector<module_ref>& module_args,
std::function<std::vector<argument>(
module_ref& mdl, const std::unordered_map<std::string, argument>& inputs)> run)
const = 0;
virtual value to_value() const = 0;
virtual void from_value(const value& v) = 0;
virtual value attributes() const = 0;
virtual std::ostream& operator_shift_left(std::ostream& os) const = 0;
virtual bool operator==(const operation& y) const = 0;
};
template <class T>
......@@ -697,6 +744,31 @@ struct operation
return detail::compute_op(private_detail_te_self, output, input);
}
template <class T>
static auto private_detail_te_default_compute(
char,
T&& private_detail_te_self,
const std::vector<argument>& input,
const std::vector<module_ref>& module_args,
std::function<std::vector<argument>(
module_ref& mdl, const std::unordered_map<std::string, argument>& inputs)> run)
-> decltype(private_detail_te_self.compute(input, module_args, std::move(run)))
{
return private_detail_te_self.compute(input, module_args, std::move(run));
}
template <class T>
static argument private_detail_te_default_compute(
float,
T&& private_detail_te_self,
const std::vector<argument>& input,
const std::vector<module_ref>& module_args,
std::function<std::vector<argument>(
module_ref& mdl, const std::unordered_map<std::string, argument>& inputs)> run)
{
return detail::compute_op(private_detail_te_self, input, module_args, std::move(run));
}
template <class T>
static auto private_detail_te_default_to_value(char, T&& private_detail_te_self)
-> decltype(private_detail_te_self.to_value())
......@@ -829,6 +901,18 @@ struct operation
char(0), private_detail_te_value, output, input);
}
argument
compute(const std::vector<argument>& input,
const std::vector<module_ref>& module_args,
std::function<std::vector<argument>(
module_ref& mdl, const std::unordered_map<std::string, argument>& inputs)> run)
const override
{
return private_detail_te_default_compute(
char(0), private_detail_te_value, input, module_args, std::move(run));
}
value to_value() const override
{
......
......@@ -99,6 +99,7 @@ struct program
const module* get_main_module() const;
std::vector<const module*> get_modules() const;
std::vector<module*> get_modules();
private:
void assign(const program& p);
......
......@@ -133,7 +133,8 @@ const std::vector<instruction_ref>& instruction::outputs() const { return output
bool operator==(const instruction& x, const instruction& y)
{
if(std::tie(x.result, x.op, x.arguments) != std::tie(y.result, y.op, y.arguments))
if(std::tie(x.result, x.op, x.arguments, x.module_args) !=
std::tie(y.result, y.op, y.arguments, y.module_args))
return false;
if(x.name() == "@literal")
return x.lit == y.lit;
......
......@@ -273,10 +273,7 @@ instruction_ref module::add_parameter(std::string name, shape s)
instruction_ref module::add_return(std::vector<instruction_ref> args)
{
assert(std::all_of(
args.begin(), args.end(), [&](instruction_ref x) { return has_instruction(x); }) &&
"Argument is not an exisiting instruction");
impl->instructions.push_back({builtin::returns{}, {}, args});
impl->instructions.push_back({builtin::returns{}, {}, std::move(args)});
auto result = std::prev(impl->instructions.end());
instruction::backreference(result);
assert(result->valid(begin()));
......@@ -298,6 +295,7 @@ shape module::get_parameter_shape(std::string name) const
}
});
if(ins != this->end())
return ins->get_shape();
else
return {};
......@@ -354,18 +352,10 @@ std::unordered_map<std::string, shape> module::get_parameter_shapes() const
bool module::has_instruction(instruction_ref ins) const
{
if(std::find_if(
impl->instructions.begin(), impl->instructions.end(), [&](const instruction& x) {
return std::addressof(*ins) == std::addressof(x);
}) != impl->instructions.end())
{
return true;
}
auto parent_modules = get_sub_modules();
return std::any_of(parent_modules.begin(), parent_modules.end(), [&](auto mod) {
return mod->has_instruction(ins);
});
return std::find_if(
impl->instructions.begin(), impl->instructions.end(), [&](const instruction& x) {
return std::addressof(*ins) == std::addressof(x);
}) != impl->instructions.end();
}
std::size_t module::size() const { return impl->instructions.size(); }
......@@ -427,7 +417,7 @@ void module::finalize(context& ctx)
void module::debug_print() const { std::cout << *this << std::endl; }
void module::debug_print(instruction_ref ins,
const std::unordered_map<instruction_ref, std::string>& names) const
std::unordered_map<instruction_ref, std::string>& names) const
{
if(ins == this->end())
{
......@@ -440,7 +430,7 @@ void module::debug_print(instruction_ref ins,
return;
}
std::stringstream ss;
this->print(
names = this->print(
[&](auto x, auto ins_names) {
if(x == ins)
{
......@@ -479,7 +469,9 @@ std::unordered_map<instruction_ref, std::string> module::print(
}
else
{
var_name = this->name() + ":@" + std::to_string(count);
var_name = this->name();
var_name.append((this->name().empty() ? "@" : ":@"));
var_name.append(std::to_string(count));
count++;
}
names.emplace(ins, var_name);
......@@ -676,6 +668,55 @@ module& module::sort()
return *this;
}
void module::calc_implicit_deps(const module& smod,
const module& pmod,
instruction_ref ins,
ins_dep_map& deps) const
{
const auto& ins_inputs = ins->inputs();
for(auto ii : iterator_for(smod))
{
const auto& ii_inputs = ii->inputs();
for(auto iii : ii_inputs)
{
if(pmod.has_instruction(iii))
{
if(not contains(ins_inputs, iii))
deps[ins].insert(iii);
}
}
const auto& mod_args = ii->module_inputs();
if(not mod_args.empty())
{
for(const auto* ssmod : mod_args)
{
calc_implicit_deps(*ssmod, pmod, ins, deps);
}
}
}
}
ins_dep_map module::calc_implicit_deps() const
{
ins_dep_map mod_implicit_deps;
for(auto ins : iterator_for(*this))
{
const auto& mod_args = ins->module_inputs();
if(mod_args.empty())
{
continue;
}
for(const auto* mod : mod_args)
{
calc_implicit_deps(*mod, *this, ins, mod_implicit_deps);
}
}
return mod_implicit_deps;
}
bool operator==(const module& x, const module& y) { return to_string(x) == to_string(y); }
std::ostream& operator<<(std::ostream& os, const module& m)
......
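
A sketch of how a pass consumes the new calc_implicit_deps; this mirrors the schedule and memory-coloring changes later in the diff, and effective_inputs is an illustrative helper, not part of the commit:

#include <vector>

#include <migraphx/instruction.hpp>
#include <migraphx/module.hpp>

// Widen an instruction's input list with the hidden dependencies its
// submodules have on instructions of the parent module.
std::vector<migraphx::instruction_ref>
effective_inputs(const migraphx::module& m, migraphx::instruction_ref ins)
{
    auto deps = m.calc_implicit_deps(); // a real pass computes this once per module
    std::vector<migraphx::instruction_ref> inputs = ins->inputs();
    if(deps.count(ins) > 0)
    {
        const auto& extra = deps.at(ins);
        inputs.insert(inputs.end(), extra.begin(), extra.end());
    }
    return inputs;
}
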
......@@ -38,6 +38,10 @@ struct onnx_parser
instruction_ref add_instruction(const operation& op,
const std::vector<instruction_ref>& args) const;
instruction_ref add_instruction(const operation& op,
const std::vector<instruction_ref>& args,
const std::vector<module_ref>& mods) const;
template <class... Ts>
instruction_ref add_instruction(const operation& op, Ts... xs) const
{
......
......@@ -143,6 +143,13 @@ onnx_parser::node_info::add_instruction(const operation& op,
return mod->add_instruction(op, args);
}
instruction_ref onnx_parser::node_info::add_instruction(const operation& op,
const std::vector<instruction_ref>& args,
const std::vector<module_ref>& mods) const
{
return mod->add_instruction(op, args, mods);
}
instruction_ref onnx_parser::node_info::add_literal(literal l) const
{
return mod->add_literal(std::move(l));
......@@ -283,8 +290,9 @@ void onnx_parser::parse_graph(module* mod, const onnx::GraphProto& graph)
}
else
{
result = ops[node.op_type()](
*this, {get_attributes(node), output_num, node.op_type(), mod}, args);
std::string node_name = node.op_type() + "_" + std::to_string(mod->size());
result = ops[node.op_type()](
*this, {get_attributes(node), output_num, node_name, mod}, args);
}
output_num = std::min<std::size_t>(output_num, result.size());
......
......@@ -18,41 +18,67 @@ struct parse_if : op_parser<parse_if>
const onnx_parser::node_info& info,
std::vector<instruction_ref> args) const
{
migraphx::argument cond_arg = args.front()->eval();
// cond is not constant, need to create sub_modules
if(cond_arg.empty())
{
MIGRAPHX_THROW(
"PARSE_IF: current implementation requires condition input to be constant!");
}
const auto& then_graph = info.attributes.at("then_branch").g();
const auto& else_graph = info.attributes.at("else_branch").g();
if(cond_arg.get_shape().elements() != 1)
if(args.front()->get_shape().elements() != 1)
{
MIGRAPHX_THROW("PARSE_IF: condition input can have only one element!");
}
auto* mod = info.mod;
// then branch
if(cond_arg.at<bool>())
migraphx::argument cond_arg = args.front()->eval();
// cond is not constant, need to create sub_modules
if(cond_arg.empty())
{
const auto& then_graph = info.attributes.at("then_branch").g();
parser.parse_graph(mod, then_graph);
std::string then_name = info.name + "_if";
module_ref then_mdl = parser.prog.create_module(then_name);
std::string else_name = info.name + "_else";
module_ref else_mdl = parser.prog.create_module(else_name);
// parse the then sub_graph
parser.parse_graph(then_mdl, then_graph);
// parse_the else sub_graph
parser.parse_graph(else_mdl, else_graph);
auto then_out_shapes = then_mdl->get_output_shapes();
auto else_out_shapes = else_mdl->get_output_shapes();
if(not std::equal(then_out_shapes.begin(),
then_out_shapes.end(),
else_out_shapes.begin(),
else_out_shapes.end()))
{
MIGRAPHX_THROW("PARSE_IF: then and else sub_grahps must have same output shapes!");
}
auto ret = info.add_instruction(make_op("if"), args, {then_mdl, else_mdl});
return {ret};
}
// else branch
else
{
const auto& else_graph = info.attributes.at("else_branch").g();
parser.parse_graph(mod, else_graph);
}
auto* mod = info.mod;
// then branch
if(cond_arg.at<bool>())
{
parser.parse_graph(mod, then_graph);
}
// else branch
else
{
parser.parse_graph(mod, else_graph);
}
// inputs of the return instruction are that of the output of the
// if instruction
instruction_ref ret_ins = std::prev(mod->end());
auto outputs = ret_ins->inputs();
assert(ret_ins->name() == "@return");
mod->remove_instruction(ret_ins);
// inputs of the return instruction are that of the output of the
// if instruction
instruction_ref ret_ins = std::prev(mod->end());
auto outputs = ret_ins->inputs();
assert(ret_ins->name() == "@return");
mod->remove_instruction(ret_ins);
return outputs;
return outputs;
}
}
};
......
......@@ -9,8 +9,11 @@ inline namespace MIGRAPHX_INLINE_NS {
void memory_coloring_impl::run()
{
// calc implicit dependencies
mod_implicit_deps = p_mod->calc_implicit_deps();
MIGRAPHX_DEBUG(dump("---Before memory coloring---"));
MIGRAPHX_DEBUG(dump_program());
MIGRAPHX_DEBUG(dump_module());
build();
if(num_of_lives != 0)
{
......@@ -22,7 +25,10 @@ void memory_coloring_impl::run()
allocate(interval);
alloc_queue.pop();
}
// rewrite happens after all modules are processed
rewrite();
if(enable_verify)
verify();
}
......@@ -99,13 +105,13 @@ bool memory_coloring_impl::allocate(interval_ptr interval)
void memory_coloring_impl::build()
{
std::size_t num_of_instrs = p_program->size();
std::size_t num_of_instrs = p_mod->size();
if(num_of_instrs == 0)
return;
auto cur_points = num_of_instrs * 2;
instruction_ref iter = p_program->end();
instruction_ref begin = p_program->begin();
instruction_ref iter = p_mod->end();
instruction_ref begin = p_mod->begin();
std::vector<instruction_ref> dead_instrs;
std::set<int> live_set;
// Build live intervals.
......@@ -137,8 +143,19 @@ void memory_coloring_impl::build()
{
is_dead = true;
}
for(auto&& arg : iter->inputs())
auto inputs = iter->inputs();
if(contains(mod_implicit_deps, iter))
{
const auto& impl_deps = mod_implicit_deps.at(iter);
inputs.insert(inputs.end(), impl_deps.begin(), impl_deps.end());
}
for(auto&& arg : inputs)
{
if(not p_mod->has_instruction(arg))
continue;
if(is_param(arg) || is_outline(arg))
{
if(is_output_param(arg))
......@@ -185,8 +202,8 @@ void memory_coloring_impl::rewrite()
std::vector<std::size_t> dims;
dims.push_back((required_bytes + sizeof(float) - 1) / sizeof(float));
shape s = {shape::float_type, dims};
instruction_ref scratch_param = p_program->add_parameter("scratch", s);
for(auto ins : iterator_for(*p_program))
instruction_ref scratch_param = p_mod->add_parameter("scratch", s);
for(auto ins : iterator_for(*p_mod))
{
const instruction* p_iter = &(*ins);
if(instr2_live.find(p_iter) != instr2_live.end())
......@@ -210,7 +227,7 @@ void memory_coloring_impl::rewrite()
if(is_allocate(ins))
{
p_program->replace_instruction(
p_mod->replace_instruction(
ins,
make_op("load", {{"shape", to_value(ins->get_shape())}, {"offset", offset}}),
scratch_param);
......@@ -218,7 +235,7 @@ void memory_coloring_impl::rewrite()
}
}
MIGRAPHX_DEBUG(dump("---After rewrite---"));
MIGRAPHX_DEBUG(dump_program());
MIGRAPHX_DEBUG(dump_module());
}
void memory_coloring_impl::verify()
......@@ -262,7 +279,7 @@ void memory_coloring_impl::verify()
void memory_coloring_impl::dump(const std::string& str) { std::cout << str << std::endl; }
void memory_coloring_impl::dump_program() { std::cout << *p_program << std::endl; }
void memory_coloring_impl::dump_module() { std::cout << *p_mod << std::endl; }
void memory_coloring_impl::dump_intervals()
{
......
......@@ -5,6 +5,7 @@
#include <migraphx/instruction.hpp>
#include <migraphx/iterator_for.hpp>
#include <migraphx/pass_config.hpp>
#include <migraphx/ranges.hpp>
#include <migraphx/config.hpp>
#include <set>
......@@ -68,7 +69,7 @@ using interval_ptr = live_interval*;
struct memory_coloring_impl
{
memory_coloring_impl(module* p, std::string alloc_op, bool p_verify)
: p_program(p), allocation_op(std::move(alloc_op)), enable_verify(p_verify)
: p_mod(p), allocation_op(std::move(alloc_op)), enable_verify(p_verify)
{
instr2_live.clear();
live_ranges.clear();
......@@ -80,6 +81,7 @@ struct memory_coloring_impl
latest_end_point = -1;
unify_literals = false;
}
bool allocate(interval_ptr);
void add_conflicts(const std::set<int>& live_set, int val)
{
......@@ -97,7 +99,11 @@ struct memory_coloring_impl
static bool is_param(const instruction_ref ins) { return ins->name() == "@param"; }
static bool is_output_param(const instruction_ref ins)
{
return is_param(ins) && any_cast<builtin::param>(ins->get_operator()).parameter == "output";
if(not is_param(ins))
return false;
auto param_name = any_cast<builtin::param>(ins->get_operator()).parameter;
return contains(param_name, "#output_");
}
bool is_allocate(const instruction_ref ins) const { return ins->name() == allocation_op; }
static bool is_outline(const instruction_ref ins) { return ins->name() == "@outline"; }
......@@ -118,7 +124,7 @@ struct memory_coloring_impl
void verify();
#ifdef MIGRAPHX_DEBUG_OPT
void dump(const std::string&);
void dump_program();
void dump_module();
void dump_intervals();
#endif
struct ordering
......@@ -145,7 +151,8 @@ struct memory_coloring_impl
return (i1->offset > i2->offset);
}
};
module* p_program;
module* p_mod;
std::unordered_map<const instruction*, interval_ptr> instr2_live;
// universe of live intervals.
std::vector<live_interval> live_intervals;
......@@ -167,6 +174,8 @@ struct memory_coloring_impl
bool unify_literals;
std::string allocation_op{};
bool enable_verify;
ins_dep_map mod_implicit_deps;
};
} // namespace MIGRAPHX_INLINE_NS
......
......@@ -19,7 +19,7 @@ void run_passes(module& modl, const std::vector<pass>& passes, tracer trace)
{
for(const auto& p : passes)
{
trace("Pass: ", p.name());
trace("Module: ", modl.name(), ", Pass: ", p.name());
p.apply(modl);
trace(modl);
......
......@@ -143,19 +143,23 @@ void program::compile(const target& t, compile_options options)
options.trace(*this);
options.trace();
auto mods = this->get_modules();
std::reverse(mods.begin(), mods.end());
auto&& passes = t.get_passes(this->impl->ctx, options);
auto* modl = get_main_module();
assert(modl->validate() == modl->end());
run_passes(*modl, passes, options.trace);
auto invalid = this->validate();
if(invalid != modl->end())
for(const auto& mod : mods)
{
auto index = std::distance(modl->begin(), invalid);
MIGRAPHX_THROW("Invalid module " + modl->name() + " from compilation at instruction " +
std::to_string(index));
assert(mod->validate() == mod->end());
run_passes(*mod, passes, options.trace);
auto invalid = mod->validate();
if(invalid != mod->end())
{
MIGRAPHX_THROW("Invalid module " + mod->name() + " from compilation at instruction " +
std::to_string(std::distance(mod->begin(), invalid)));
}
mod->finalize(this->impl->ctx);
}
modl->finalize(this->impl->ctx);
}
void program::finalize()
......@@ -165,17 +169,17 @@ void program::finalize()
}
template <class F>
std::vector<argument> generic_eval(const module& p,
std::vector<argument> generic_eval(const module* mod,
context& ctx,
std::unordered_map<std::string, argument> params,
std::unordered_map<instruction_ref, argument> results,
F trace)
{
assert(p.validate() == p.end());
std::unordered_map<instruction_ref, argument> results;
results.reserve(p.size() * 2);
assert(mod->validate() == mod->end());
results.reserve(mod->size() * 2);
std::vector<argument> values;
values.reserve(16);
for(auto ins : iterator_for(p))
for(auto ins : iterator_for(*mod))
{
const auto& name = ins->name();
if(name == "@literal")
......@@ -221,15 +225,32 @@ std::vector<argument> generic_eval(const module& p,
assert(results.find(i) != results.end());
return results[i];
});
results.emplace(ins, trace(ins, [&] {
return ins->normalized_operator().compute(
ctx, ins->get_shape(), values);
}));
const auto& mod_args = ins->module_inputs();
auto module_eval = [&](module_ref smod,
const std::unordered_map<std::string, argument>& inputs) {
return generic_eval(smod, ctx, inputs, results, trace);
};
if(not mod_args.empty())
{
results.emplace(ins, trace(ins, [&] {
return ins->normalized_operator().compute(
values, mod_args, module_eval);
}));
}
else
{
results.emplace(ins, trace(ins, [&] {
return ins->normalized_operator().compute(
ctx, ins->get_shape(), values);
}));
}
}
assert(results.find(ins) != results.end());
}
return {results.at(std::prev(p.end()))};
return {results.at(std::prev(mod->end()))};
}
template <class F>
......@@ -238,8 +259,8 @@ std::vector<argument> generic_eval(const program& p,
std::unordered_map<std::string, argument> params,
F trace)
{
const auto* mm = p.get_main_module();
return generic_eval(*mm, ctx, params, trace);
const module* mm = p.get_main_module();
return generic_eval(mm, ctx, params, {}, trace);
}
std::vector<argument> program::eval(parameter_map params) const
......@@ -590,8 +611,7 @@ void program::print(
{
for(const auto& mod : this->impl->modules)
{
std::cout << mod.name() << ":" << std::endl;
mod.print(print_func, names);
names = mod.print(print_func, names);
}
}
......@@ -664,7 +684,7 @@ const module* program::get_main_module() const { return get_module("main"); }
std::vector<const module*> program::get_modules() const
{
const module* mm = get_main_module();
const module* mm = this->get_main_module();
std::vector<const module*> vec_modules;
vec_modules.push_back(mm);
auto sub_modules = mm->get_sub_modules();
......@@ -673,6 +693,17 @@ std::vector<const module*> program::get_modules() const
return vec_modules;
}
std::vector<module*> program::get_modules()
{
module* mm = this->get_main_module();
std::vector<module*> vec_modules;
vec_modules.push_back(mm);
auto sub_modules = mm->get_sub_modules();
vec_modules.insert(vec_modules.end(), sub_modules.begin(), sub_modules.end());
return vec_modules;
}
program& program::sort()
{
for(auto& mod : this->impl->modules)
......
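
The structural change to generic_eval above is that an instruction carrying module arguments re-enters the evaluator with a copy of the current result table, so values already computed in the parent module stay visible inside the submodule. A toy evaluator showing just that shape (illustrative names and types, not the MIGraphX ones):

#include <iostream>
#include <map>
#include <string>

using env = std::map<std::string, int>;

// Toy counterpart of generic_eval: the child call receives a copy of the
// parent's results, mirroring how the module_eval lambda recurses above.
int eval_module(const std::string& name, env results)
{
    if(name == "child")
        return results.at("from_parent") * 2; // reads a parent-module result

    results["from_parent"] = 21; // computed in the parent module
    return eval_module("child", results);
}

int main()
{
    std::cout << eval_module("main", {}) << "\n"; // prints 42
    return 0;
}
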
......@@ -37,6 +37,9 @@ struct stream_info
std::unordered_map<instruction_ref, std::size_t> ins2stream;
std::unordered_map<instruction_ref, std::size_t> weights;
std::unordered_map<instruction_ref, std::size_t> iweights;
ins_dep_map mod_implicit_deps;
void calc_implicit_deps(const module& p) { mod_implicit_deps = p.calc_implicit_deps(); }
void accumulate_weights(instruction_ref last, const schedule_model& model)
{
......@@ -51,11 +54,17 @@ struct stream_info
if(op.name() == "@return")
weight = 1;
iweights[ins] = weight;
weights[ins] =
std::accumulate(ins->inputs().begin(),
ins->inputs().end(),
weight,
[&](std::size_t w, instruction_ref i) { return w + self(i); });
auto inputs = ins->inputs();
if(contains(mod_implicit_deps, ins))
{
const auto& impl_deps = mod_implicit_deps.at(ins);
inputs.insert(inputs.end(), impl_deps.begin(), impl_deps.end());
}
weights[ins] = std::accumulate(
inputs.begin(), inputs.end(), weight, [&](std::size_t w, instruction_ref i) {
return w + self(i);
});
}
return weights[ins];
})(last);
......@@ -114,7 +123,9 @@ struct stream_info
assert(ins != p.end());
if(contains(partitions, ins))
return;
assert(p.has_instruction(ins));
if(not p.has_instruction(ins))
return;
// Add an entry so we know the instruction was visited
partitions[ins];
part.add(ins, this->iweights[ins]);
......@@ -208,12 +219,22 @@ struct stream_info
// Pop the first element
auto top = children.begin()->second;
children.erase(children.begin());
p.move_instruction(top, p.begin());
for(auto ins : top->inputs())
{
if(not p.has_instruction(ins))
continue;
add_child(ins);
}
if(contains(mod_implicit_deps, top))
{
for(auto ins : mod_implicit_deps.at(top))
{
assert(p.has_instruction(ins));
add_child(ins);
}
}
}
}
......@@ -346,6 +367,8 @@ struct stream_info
{
for(auto&& arg : ins->outputs())
{
if(not p.has_instruction(arg))
continue;
if(is_merge_point(arg))
merge_from[ins].insert(arg);
merge_from[ins].insert(merge_from[arg].begin(), merge_from[arg].end());
......@@ -469,7 +492,9 @@ void schedule::apply(module& p) const
{
if(not enable)
return;
stream_info si;
si.calc_implicit_deps(p);
auto last = std::prev(p.end());
si.accumulate_weights(last, model);
auto nstreams = si.assign_streams(p, model.concurrency());
......
......@@ -307,7 +307,7 @@ struct cpu_apply
std::size_t index = 0;
for(auto ins : outputs_alias)
{
prog_output_names[ins] = "#output_" + std::to_string(index++);
prog_output_names[ins] = modl->name() + ":#output_" + std::to_string(index++);
}
}
}
......
......@@ -38,6 +38,7 @@ add_library(migraphx_device
device/cos.cpp
device/cosh.cpp
device/div.cpp
device/equal.cpp
device/erf.cpp
device/exp.cpp
device/floor.cpp
......@@ -67,6 +68,7 @@ add_library(migraphx_device
device/reduce_sum.cpp
device/reduce_prod.cpp
device/relu.cpp
device/rnn_variable_seq_lens.cpp
device/round.cpp
device/rsqrt.cpp
device/sigmoid.cpp
......@@ -79,9 +81,7 @@ add_library(migraphx_device
device/sub.cpp
device/tan.cpp
device/tanh.cpp
device/rnn_variable_seq_lens.cpp
device/unary_not.cpp
device/equal.cpp
)
set_target_properties(migraphx_device PROPERTIES EXPORT_NAME device)
rocm_set_soversion(migraphx_device ${MIGRAPHX_SO_VERSION})
......@@ -106,46 +106,46 @@ target_include_directories(migraphx_device PUBLIC $<BUILD_INTERFACE:${CMAKE_CURR
target_include_directories(migraphx_device PRIVATE $<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/device/include>)
add_library(migraphx_gpu
abs.cpp
analyze_streams.cpp
allocation_model.cpp
argmax.cpp
argmin.cpp
batch_norm_inference.cpp
clip.cpp
code_object_op.cpp
compile_hip.cpp
compile_hip_code_object.cpp
concat.cpp
convert.cpp
convolution.cpp
deconvolution.cpp
eliminate_workspace.cpp
elu.cpp
fuse_ops.cpp
gather.cpp
gemm_impl.cpp
hip.cpp
target.cpp
int8_conv_pack.cpp
int8_gemm_pack.cpp
kernel.cpp
lowering.cpp
pooling.cpp
convolution.cpp
deconvolution.cpp
quant_convolution.cpp
softmax.cpp
logsoftmax.cpp
concat.cpp
leaky_relu.cpp
batch_norm_inference.cpp
kernel.cpp
write_literals.cpp
rocblas.cpp
abs.cpp
elu.cpp
pad.cpp
gather.cpp
convert.cpp
lrn.cpp
schedule_model.cpp
leaky_relu.cpp
pack_args.cpp
pack_int8_args.cpp
clip.cpp
int8_gemm_pack.cpp
int8_conv_pack.cpp
gemm_impl.cpp
pad.cpp
pooling.cpp
preallocate_param.cpp
quant_convolution.cpp
rnn_variable_seq_lens.cpp
rocblas.cpp
softmax.cpp
schedule_model.cpp
sync_device.cpp
pack_args.cpp
compile_hip.cpp
target.cpp
write_literals.cpp
)
set_target_properties(migraphx_gpu PROPERTIES EXPORT_NAME gpu)
function(register_migraphx_gpu_ops PREFIX)
......