    customized topk · 1feaaf0c
    Jiezhong Qiu authored
    * When k=1, the customized topk reduces to torch.max, and it is not
    surprising that torch.max is faster than torch.topk.
    * However, when k=2, the customized topk is even slower than torch.topk.
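
    To illustrate the trade-off described above, here is a minimal sketch of
    one way a top-k could be built from repeated torch.max calls. It is an
    assumption for illustration only, not the code from this commit: the
    helper name topk_by_repeated_max is hypothetical, and it assumes a
    floating-point input with no -inf entries and k no larger than the size
    of the reduced dimension.

```python
import torch


def topk_by_repeated_max(x: torch.Tensor, k: int, dim: int = -1):
    """Hypothetical helper: pick the k largest entries along `dim`
    using k successive torch.max calls (assumes a floating-point tensor)."""
    work = x.clone()  # work on a copy so the caller's tensor is not modified
    values, indices = [], []
    for _ in range(k):
        v, i = work.max(dim=dim, keepdim=True)
        values.append(v)
        indices.append(i)
        # Mask the entry just selected so the next pass finds the runner-up.
        work.scatter_(dim, i, float("-inf"))
    # Results come out in descending order, like torch.topk(sorted=True).
    return torch.cat(values, dim=dim), torch.cat(indices, dim=dim)


if __name__ == "__main__":
    x = torch.randn(4, 10)
    vals, idx = topk_by_repeated_max(x, k=2)
    print(vals.shape, idx.shape)
```

    With k=1 the loop body runs once, so the cost is essentially a single
    torch.max plus the copy and mask. Each additional k adds another full
    reduction and a scatter, which is one plausible reason such an approach
    could lose to torch.topk as soon as k=2.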
mem_transformer.py 43.7 KB