[Pipeline Middleware] Reduce comm redundancy by getting accurate output (#2232)
* move to CPU to avoid deadlock
* get output by offsets
Co-authored-by: Ziyue Jiang <ziyue.jiang@gmail.com>