fix(batching): Avoid theoretical hang in batcher loop (#5)
- Avoid theoretical hang in batcher loop
- Avoid a couple of clones in the router generate method
- Keep attention mask tensors as integers
- Remove num_heads attribute
Co-authored-by: OlivierDehaene <Olivier.dehaene@gmail.com>