    feat: adds phi model (#1442) · 7e2a7433
    drbh authored
    This PR adds basic modeling for phi-2 
    
    run
    ```bash
    text-generation-server \
        serve \
        microsoft/phi-2 \
        --revision 834565c23f9b28b96ccbeabe614dd906b6db551a
    ```
    
    
    test
    ```bash
    curl -s localhost:3000/generate \
       -X POST \
       -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":20}}' \
       -H 'Content-Type: application/json' | jq .
    # {
    #   "generated_text": "\nDeep learning is a subset of machine learning that uses artificial neural networks to learn from data. These"
    # }
    ```
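The same check can be scripted. Below is a minimal Python sketch, not part of this PR, that sends the same request as the curl call above using only the standard library. It assumes the server is reachable at `localhost:3000` and that the reply carries a `generated_text` field, as in the sample output; the helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumption: the server listens on localhost:3000, as in the curl call above.
GENERATE_URL = "http://localhost:3000/generate"


def build_generate_request(prompt, max_new_tokens=20, url=GENERATE_URL):
    """Build the same POST request that the curl command sends (hypothetical helper)."""
    payload = {"inputs": prompt, "parameters": {"max_new_tokens": max_new_tokens}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, max_new_tokens=20, url=GENERATE_URL):
    """Send the request and return the "generated_text" field of the JSON reply."""
    request = build_generate_request(prompt, max_new_tokens, url)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["generated_text"]
```

With the server from the run step up, `print(generate("What is Deep Learning?"))` should print a completion similar to the one shown above.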
    
    
    
    notes 
    - recently (~1 day ago) the Phi weights and model were updated to
    add [GQA/MQA attention to the
    model.](https://github.com/huggingface/transformers/pull/28163) This
    implementation expects the original model format, so a fixed revision
    is required at the moment.
    - this PR only includes a basic implementation of the model; it can
    later be extended to support Flash and Sharded versions and to make
    use of better optimizations
flash_phi.py 3.55 KB