feat: adds phi model (#1442)
This PR adds basic modeling for phi-2.
run
```bash
text-generation-server \
serve \
microsoft/phi-2 \
--revision 834565c23f9b28b96ccbeabe614dd906b6db551a
```
test
```bash
curl -s localhost:3000/generate \
-X POST \
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
-H 'Content-Type: application/json' | jq .
# {
# "generated_text": "\nDeep learning is a subset of machine learning that uses artificial neural networks to learn from data. These"
# }
```
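To check only the generated text rather than the whole JSON body, the response can be filtered with `jq`. The sketch below assumes the sample response shown above (stored in a local variable instead of fetched from a live server, which may return different text) and strips the leading newline the model emits.

```bash
# Sample response copied from the output above; this is an assumption,
# a running server may return different text.
response='{"generated_text":"\nDeep learning is a subset of machine learning that uses artificial neural networks to learn from data. These"}'

# -r prints the raw string instead of a JSON-quoted one;
# ltrimstr drops the leading newline if it is present.
generated_text=$(printf '%s' "$response" | jq -r '.generated_text | ltrimstr("\n")')

printf '%s\n' "$generated_text"
```

Against a running server, the same filter can replace `jq .` at the end of the `curl` pipeline.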
notes
- Recently (~1 day ago) the Phi weights and model were updated to
accommodate adding [GQA/MQA attention to the
model](https://github.com/huggingface/transformers/pull/28163). This
implementation expects the original model format, so a fixed revision is
required at the moment.
- This PR only includes a basic implementation of the model; it can
later be extended to support Flash and Sharded versions and to make
use of better optimizations.
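Because a fixed revision is required, a launch script may want to refuse a moving ref such as a branch name. The following is only a sketch: `is_full_sha` is a hypothetical helper, not part of `text-generation-server`, and the command is echoed rather than executed so the snippet runs without a server installed.

```bash
MODEL_ID="microsoft/phi-2"
REVISION="834565c23f9b28b96ccbeabe614dd906b6db551a"

# Hypothetical helper: succeeds only for a full 40-character
# lowercase hex commit hash, and fails for branch names like "main".
is_full_sha() {
  printf '%s' "$1" | grep -Eq '^[0-9a-f]{40}$'
}

if is_full_sha "$REVISION"; then
  # Echo the command instead of running it; drop the echo to launch.
  echo "text-generation-server serve $MODEL_ID --revision $REVISION"
else
  echo "refusing to launch: '$REVISION' is not a pinned commit hash" >&2
fi
```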