- [📚 Introduction](#-introduction)
- [📌 Design and Acknowledgement](#-design-and-acknowledgement)
## 📚 Introduction
This module offers a layer of abstraction for ColossalAI. With this module, users can easily switch between different accelerator backends, such as Nvidia GPUs and Huawei NPUs. It is an attempt to make user code portable across different hardware platforms with a simple `auto_set_accelerator()` API.
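For example, here is a minimal usage sketch, assuming the package exposes `auto_set_accelerator()` and a matching `get_accelerator()` accessor:

```python
from colossalai.accelerator import auto_set_accelerator, get_accelerator

# Probe the available hardware (Nvidia GPU, Huawei NPU, or CPU fallback)
# and register the matching accelerator as the global one.
auto_set_accelerator()

# From here on the code is backend-agnostic: the same calls work on
# whichever accelerator was selected above.
accelerator = get_accelerator()
accelerator.manual_seed_all(1024)
```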
## 📌 Design and Acknowledgement
Our `accelerator` module is heavily inspired by [`deepspeed/accelerator`](https://www.deepspeed.ai/tutorials/accelerator-abstraction-interface/). We found it to be a well-designed and well-structured module that could be easily integrated into our project, and we would like to thank the DeepSpeed team for their great work.

We implemented this accelerator module from scratch, with our own modifications:

1. We aligned the accelerator API names with PyTorch's native API names (see the sketch after this list).
2. We did not include the `op builder` in the `accelerator`. Instead, we restructured our `kernel` module to automatically match an accelerator with its corresponding kernel implementations, so as to keep the modules less entangled.
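To illustrate point 1, here is a hedged sketch of the naming alignment: each accelerator method mirrors its PyTorch counterpart (shown here against `torch.cuda`, assuming a CUDA backend is active):

```python
from colossalai.accelerator import get_accelerator

accelerator = get_accelerator()

# The method names match torch.cuda one-to-one, so porting existing
# CUDA-specific code is mostly a mechanical substitution.
accelerator.manual_seed(42)            # cf. torch.cuda.manual_seed(42)
accelerator.empty_cache()              # cf. torch.cuda.empty_cache()
stream = accelerator.current_stream()  # cf. torch.cuda.current_stream()
```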
The excerpts below sketch the resulting accelerator API. First, the abstract interface that every backend implements (the excerpt assumes `import torch`, `from abc import abstractmethod`, and `from typing import Any, Dict, List`):

```python
@abstractmethod
def utilization(self, device=None) -> int:
    """
    Returns the percent of time over the past sample period during which one or more kernels was executing on the device, as given by nvidia-smi or npu-smi, etc.
    """

@abstractmethod
def set_rng_state_all(self, new_states: List[torch.Tensor]) -> None:
    """
    Sets the random number generator state of all devices.
    """

@abstractmethod
def manual_seed(self, seed: int) -> None:
    """
    Sets the seed for generating random numbers for the current device.
    """

@abstractmethod
def manual_seed_all(self, seed: int) -> None:
    """
    Sets the seed for generating random numbers on all devices.
    """

@abstractmethod
def seed(self) -> None:
    """
    Sets the seed for generating random numbers to a random number for the current device.
    """

@abstractmethod
def seed_all(self) -> None:
    """
    Sets the seed for generating random numbers to a random number on all devices.
    """

@abstractmethod
def initial_seed(self) -> int:
    """
    Returns the current random seed of the current device.
    """

# =======================
# memory management APIs
# =======================
@abstractmethod
def empty_cache(self) -> None:
    """
    Releases all unoccupied cached memory currently held by the caching allocator so that it can be used by other device applications and becomes visible in nvidia-smi.
    """

@abstractmethod
def memory_stats(self, device=None) -> Dict[str, Any]:
    """
    Returns a dictionary of device memory allocator statistics for a given device.
    """

# =======================
# streaming and event APIs
# =======================
@abstractmethod
def Event(self, enable_timing: bool = False, blocking: bool = False, interprocess: bool = False):
    # signature mirrors torch.cuda.Event
    """
    Device events are synchronization markers that can be used to monitor the device's progress, to accurately measure timing, and to synchronize streams.
    """

@abstractmethod
def current_stream(self, device=None):
    """
    Returns the currently selected Stream for a given device.
    """

@abstractmethod
def default_stream(self, device=None):
    """
    Returns the default Stream for a given device.
    """

@abstractmethod
def set_stream(self, stream_):
    """
    Sets the current stream. This is a wrapper API to set the stream.
    """

@abstractmethod
def stream(self, stream_):
    """
    Wrapper around the context-manager StreamContext that selects a given stream.
    """
```
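For instance, a training script can seed every device through this interface without knowing which backend is active (a sketch, assuming `get_accelerator()` returns an instance of this abstract class):

```python
from colossalai.accelerator import get_accelerator

accelerator = get_accelerator()

# Seed all devices for reproducibility, then record the seed in use.
accelerator.manual_seed_all(3407)
print(f"initial seed on the current device: {accelerator.initial_seed()}")
```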
The CPU accelerator implements the same interface but raises a `RuntimeError` for operations that only make sense on an accelerator device:

```python
def set_rng_state_all(self, new_states: List[torch.Tensor]) -> None:
    """
    Sets the random number generator state of all devices.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")

def manual_seed(self, seed: int) -> None:
    """
    Sets the seed for generating random numbers for the current device.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")

def manual_seed_all(self, seed: int) -> None:
    """
    Sets the seed for generating random numbers on all devices.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")

def seed(self) -> None:
    """
    Sets the seed for generating random numbers to a random number for the current device.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")

def seed_all(self) -> None:
    """
    Sets the seed for generating random numbers to a random number on all devices.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")

def initial_seed(self) -> int:
    """
    Returns the current random seed of the current device.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")

# =======================
# memory management APIs
# =======================
def empty_cache(self) -> None:
    """
    Releases all unoccupied cached memory currently held by the caching allocator so that it can be used by other device applications and becomes visible in nvidia-smi.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")

def memory_stats(self, device=None) -> Dict[str, Any]:
    """
    Returns a dictionary of device memory allocator statistics for a given device.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")

# =======================
# streaming and event APIs
# =======================
def Event(self, enable_timing: bool = False, blocking: bool = False, interprocess: bool = False):
    """
    Device events are synchronization markers that can be used to monitor the device's progress, to accurately measure timing, and to synchronize streams.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")

def current_stream(self, device=None):
    """
    Returns the currently selected Stream for a given device.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")

def default_stream(self, device=None):
    """
    Returns the default Stream for a given device.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")

def set_stream(self, stream_):
    """
    Sets the current stream. This is a wrapper API to set the stream.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")

def stream(self, stream_):
    """
    Wrapper around the context-manager StreamContext that selects a given stream.
    """
    raise RuntimeError("this method is not supported for cpu accelerator")
```
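Because the CPU backend raises `RuntimeError` for these device-only operations, portable code that might run on CPU can guard such calls, e.g.:

```python
from colossalai.accelerator import get_accelerator

accelerator = get_accelerator()

try:
    accelerator.empty_cache()
except RuntimeError:
    # Running on the CPU accelerator: there is no caching allocator to clear.
    pass
```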
The CUDA accelerator delegates each call to the corresponding `torch.cuda` API:

```python
def set_rng_state_all(self, new_states: List[torch.Tensor]) -> None:
    """
    Sets the random number generator state of all devices.
    """
    torch.cuda.set_rng_state_all(new_states)

def manual_seed(self, seed: int) -> None:
    """
    Sets the seed for generating random numbers for the current GPU.
    """
    torch.cuda.manual_seed(seed)

def manual_seed_all(self, seed: int) -> None:
    """
    Sets the seed for generating random numbers on all GPUs.
    """
    torch.cuda.manual_seed_all(seed)

def seed(self) -> None:
    """
    Sets the seed for generating random numbers to a random number for the current GPU.
    """
    torch.cuda.seed()

def seed_all(self) -> None:
    """
    Sets the seed for generating random numbers to a random number on all GPUs.
    """
    torch.cuda.seed_all()

def initial_seed(self) -> int:
    """
    Returns the current random seed of the current GPU.
    """
    return torch.cuda.initial_seed()

# =======================
# memory management APIs
# =======================
def empty_cache(self) -> None:
    """
    Releases all unoccupied cached memory currently held by the caching allocator so that it can be used by other GPU applications and becomes visible in nvidia-smi.
    """
    torch.cuda.empty_cache()

def memory_stats(self, device=None) -> Dict[str, Any]:
    """
    Returns a dictionary of CUDA memory allocator statistics for a given device.
    """
    return torch.cuda.memory_stats(device=device)

def Event(self, enable_timing: bool = False, blocking: bool = False, interprocess: bool = False):
    """
    CUDA events are synchronization markers that can be used to monitor the device's progress, to accurately measure timing, and to synchronize CUDA streams.
    """
    return torch.cuda.Event(enable_timing=enable_timing, blocking=blocking, interprocess=interprocess)
```
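A short sketch of how the memory APIs above might be used (the statistics dictionary has the same keys as `torch.cuda.memory_stats`, e.g. `allocated_bytes.all.current`):

```python
from colossalai.accelerator import get_accelerator

accelerator = get_accelerator()

# Release cached blocks so other processes (and nvidia-smi) see the memory,
# then inspect allocator statistics on the current device.
accelerator.empty_cache()
stats = accelerator.memory_stats()
print(stats.get("allocated_bytes.all.current", 0))
```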
Likewise, the NPU accelerator delegates to `torch.npu`, provided by Huawei's `torch_npu` extension, which mirrors the `torch.cuda` API:

```python
def set_rng_state_all(self, new_states: List[torch.Tensor]) -> None:
    """
    Sets the random number generator state of all devices.
    """
    torch.npu.set_rng_state_all(new_states)

def manual_seed(self, seed: int) -> None:
    """
    Sets the seed for generating random numbers for the current NPU.
    """
    torch.npu.manual_seed(seed)

def manual_seed_all(self, seed: int) -> None:
    """
    Sets the seed for generating random numbers on all NPUs.
    """
    torch.npu.manual_seed_all(seed)

def seed(self) -> None:
    """
    Sets the seed for generating random numbers to a random number for the current NPU.
    """
    torch.npu.seed()

def seed_all(self) -> None:
    """
    Sets the seed for generating random numbers to a random number on all NPUs.
    """
    torch.npu.seed_all()

def initial_seed(self) -> int:
    """
    Returns the current random seed of the current NPU.
    """
    return torch.npu.initial_seed()

# =======================
# memory management APIs
# =======================
def empty_cache(self) -> None:
    """
    Releases all unoccupied cached memory currently held by the caching allocator so that it can be used by other NPU applications and becomes visible in npu-smi.
    """
    torch.npu.empty_cache()

def memory_stats(self, device=None) -> Dict[str, Any]:
    """
    Returns a dictionary of NPU memory allocator statistics for a given device.
    """
    return torch.npu.memory_stats(device=device)

def Event(self, enable_timing: bool = False, blocking: bool = False, interprocess: bool = False):
    """
    NPU events are synchronization markers that can be used to monitor the device's progress, to accurately measure timing, and to synchronize NPU streams.
    """
    return torch.npu.Event(enable_timing=enable_timing, blocking=blocking, interprocess=interprocess)
```