- 29 Oct, 2024 1 commit
xuxzh1 authored
- 03 Jul, 2023 1 commit
Nicolas Patry authored
# What does this PR do?

This adds a non-flash version of MPT. Flash is harder because we need to create a bias-ready CUDA kernel for flash attention.

Fixes https://github.com/huggingface/text-generation-inference/issues/361
Fixes https://github.com/huggingface/text-generation-inference/issues/491
Fixes https://github.com/huggingface/text-generation-inference/issues/290
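For context, a minimal sketch of what "non-flash" attention with an additive bias looks like: MPT uses an ALiBi-style positional bias added to the attention scores, which stock flash-attention kernels do not accept without a custom CUDA kernel. The function names (`attention_with_bias`, `alibi_bias`) and shapes below are illustrative assumptions, not the PR's actual code.

```python
import math
import torch

def attention_with_bias(q, k, v, bias):
    # q, k, v: [batch, heads, seq, head_dim]; bias: [1, heads, seq, seq]
    # Standard ("non-flash") scaled dot-product attention: the full score
    # matrix is materialized, so an arbitrary additive bias is trivial.
    scores = torch.matmul(q, k.transpose(-1, -2)) / math.sqrt(q.size(-1))
    scores = scores + bias                     # additive (e.g. ALiBi) bias
    probs = torch.softmax(scores, dim=-1)
    return torch.matmul(probs, v)

def alibi_bias(num_heads, seq_len):
    # ALiBi-style bias (illustrative): a per-head linear penalty on the
    # distance to past positions, zero on and above the diagonal.
    slopes = torch.tensor(
        [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]
    )
    distance = torch.arange(seq_len)[None, :] - torch.arange(seq_len)[:, None]
    bias = slopes[:, None, None] * distance[None, :, :].clamp(max=0)
    return bias.unsqueeze(0)  # [1, heads, seq, seq]
```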