Haodong Tian authored
* [Bugfix] Configure autotuner-specific logger for correct level handling
  - Previously, logging relied on `basicConfig`, which configured the root logger. This caused the named autotuner logger to ignore DEBUG messages.
  - This commit sets up a dedicated logger for the autotuner, correctly routing DEBUG messages to 'autotuner.log' and INFO+ messages to the console.
* [Bugfix] Fix tensor_supply for boolean type
  - Previously, `get_tensor_supply` used `torch.randint(-2, 3)` as a fallback, which caused an error when the dtype was `torch.bool`.
  - This commit adds an `is_boolean` check in `KernelParam` and updates `get_tensor_supply` to use `torch.randint(0, 2)` specifically for boolean dtypes.
* [Bugfix] Always regenerate JIT inputs during tuning
  - Removes the caching of `self.jit_input_tensors` within `AutoTuner`. Because different autotuning configurations can alter the required input tensor shapes or other properties, reusing cached inputs from a previous configuration can lead to errors or incorrect assessments.
  - This change ensures that `profiler._get_inputs()` is called unconditionally for each configuration evaluation. Since `_get_inputs` is assumed to be relatively inexpensive, the potential overhead is considered acceptable.
* [Example] Update example_blocksparse_gemm for autotuner
* Run code formatter
* [Feature] Enable custom tensor supply and input caching control in Autotuner
  - Previously, tensor generation was tied to `supply_type`, and input caching behavior across configurations was less explicit and less controlled.
  - This commit introduces a `supply_prog` parameter that accepts a custom function for generating input tensors, overriding the default mechanism.
  - Adds a `cache_input_tensors` flag (default True) to control input tensor caching:
    - If True, tensors are generated once per configuration and reused for repetitions, with a check for potential shape mismatches between configurations.
    - If False, tensors are regenerated for every configuration trial.
  - Refactors internal input tensor handling using supplier functions for clarity.
  - Adds a `check_tensor_list_compatibility` utility for shape comparison.
* [Example] Update example_blocksparse_gemm for autotuner
* Run code formatter
* [Example] Small fix in example_blocksparse_gemm
* [Fix] Raise error if autotuning yields no valid configuration

92e8d5f4
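
The dedicated-logger setup described in the first bugfix can be sketched with the standard `logging` module alone. This is a minimal illustration, not the code from this commit: the function name `setup_autotuner_logger`, the format string, and the file mode are assumptions.

```python
import logging
import sys


def setup_autotuner_logger(log_path="autotuner.log", name="autotuner"):
    """Return a named logger: DEBUG+ goes to a file, INFO+ goes to the console.

    Hypothetical helper for illustration; names and format are assumptions.
    """
    logger = logging.getLogger(name)
    # The logger itself must accept DEBUG, otherwise handlers never see it.
    logger.setLevel(logging.DEBUG)
    # Do not forward records to the root logger, so its level is irrelevant.
    logger.propagate = False

    # Avoid attaching duplicate handlers if this is called more than once.
    if logger.handlers:
        return logger

    formatter = logging.Formatter("%(asctime)s %(levelname)s:%(message)s")

    # File handler records everything from DEBUG upward.
    file_handler = logging.FileHandler(log_path, mode="w")
    file_handler.setLevel(logging.DEBUG)
    file_handler.setFormatter(formatter)

    # Console handler only shows INFO and above.
    console_handler = logging.StreamHandler(sys.stdout)
    console_handler.setLevel(logging.INFO)
    console_handler.setFormatter(formatter)

    logger.addHandler(file_handler)
    logger.addHandler(console_handler)
    return logger
```

Setting the level on each handler rather than through `basicConfig` keeps the filtering local to the named logger, which is why DEBUG records reach the file even when the root logger is left at its default level.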