    feat: Python bring-your-own-engine with our tokenizer (#47) · 12714d90
    Graham King authored
    Instead of using `out=pystr:<my.py>` we can now do this:
    ```
    dynemo-run out=pytok:/home/graham/my_python_engine.py --model-path <hf-repo-checkout>
    ```
    
    That engine receives tokens and responds with tokens, so tokenization and detokenization happen on our side. Here's an example engine file:
    ```
    import asyncio
    
    # The request argument is unused in this example.
    # Each yielded dict carries the next token ID; the sleeps simulate generation latency.
    async def generate(request):
        yield {"token_ids": [791]}
        await asyncio.sleep(0.1)
        yield {"token_ids": [6864]}
        await asyncio.sleep(0.1)
        yield {"token_ids": [315]}
        await asyncio.sleep(0.1)
        yield {"token_ids": [9822]}
        await asyncio.sleep(0.1)
        yield {"token_ids": [374]}
        await asyncio.sleep(0.1)
        yield {"token_ids": [12366]}
        await asyncio.sleep(0.1)
        yield {"token_ids": [13]}
    ```
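
To sanity-check an engine file before pointing `dynemo-run` at it, the `generate` async generator can be driven directly from plain Python. This is a minimal sketch, not part of this change: the `collect` helper is hypothetical, the `generate` here is a shortened stand-in for the example above, and because the shape of `request` is not described in this commit, a placeholder `None` is passed.

```python
import asyncio


async def generate(request):
    # Shortened stand-in for the example engine: stream two token IDs.
    yield {"token_ids": [791]}
    await asyncio.sleep(0.01)
    yield {"token_ids": [6864]}


async def collect(engine, request):
    """Hypothetical helper: drain the engine and flatten its token IDs."""
    token_ids = []
    async for chunk in engine(request):
        token_ids.extend(chunk["token_ids"])
    return token_ids


if __name__ == "__main__":
    # The real request shape is an assumption here, so pass a placeholder.
    print(asyncio.run(collect(generate, None)))
```

Running the file on its own prints the flattened list of token IDs, which makes it easy to confirm that every chunk is a dict with a `token_ids` list before wiring the engine into `out=pytok:`.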
    
    Also reduce duplication by making the bindings engine use the llm lib engine.
opt.rs 5.52 KB