Commit ec46ed52 authored by Graham King, committed by GitHub

fix(dynamo-run): Text input doesn't need a name (#80)

For the `echo` and `pystr` engines we previously required the user to pass `--model-name <x>` so that we would have a name for the model. If the input is HTTP we do need it, to match the model name in the user's JSON request.

If the input is Text we don't need a name, so in that case, if we don't already have a name for the model, give it a default one.
parent c8b70289
@@ -170,12 +170,12 @@ Build: `cargo build --release --features python`
If the Python engine wants to receive and return strings - it will do the prompt templating and tokenization itself - run it like this:
```
dynamo-run out=pystr:/home/user/my_python_engine.py --name <model-name>
dynamo-run out=pystr:/home/user/my_python_engine.py
```
- The `request` parameter is a map, an OpenAI compatible create chat completion request: https://platform.openai.com/docs/api-reference/chat/create
- The function must `yield` a series of maps conforming to the create chat completion stream response format (example below).
- The `--name` flag is the name we serve the model under, if using an HTTP front end.
- If using an HTTP front-end add the `--model-name` flag. This is the name we serve the model under.
The file is loaded once at startup and kept in memory.
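To make the request and response shapes concrete, here is a minimal sketch of such a string engine. The entry-point name `generate` and its exact signature are assumptions for illustration only, not taken from this commit; the yielded maps follow the OpenAI create chat completion stream response shape.
```
# Minimal sketch of a pystr-style engine. The entry-point name and signature
# are illustrative assumptions; adapt them to what dynamo-run expects.
async def generate(request):
    # `request` is an OpenAI compatible create chat completion request (a map).
    prompt = request["messages"][-1]["content"]
    # Stream the reply back as create chat completion stream response chunks.
    for word in prompt.split():
        yield {"choices": [{"index": 0, "delta": {"content": word + " "}, "finish_reason": None}]}
    # Final chunk signals completion.
    yield {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}
```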
@@ -218,7 +218,7 @@ dynamo-run out=pytok:/home/user/my_python_engine.py --model-path <hf-repo-checko
{"token_ids":[791],"tokens":None,"text":None,"cum_log_probs":None,"log_probs":None,"finish_reason":None}
```
- Command-line flag `--model-path` which must point to a Hugging Face repo checkout containing the `tokenizer.json`. The `--name` flag is optional. If not provided we use the HF repo name (directory name) as the model name.
- Command-line flag `--model-path` which must point to a Hugging Face repo checkout containing the `tokenizer.json`. The `--model-name` flag is optional. If not provided we use the HF repo name (directory name) as the model name.
Example engine:
```
......
@@ -36,6 +36,10 @@ pub use opt::{Input, Output};
/// concatenations.
const ENDPOINT_SCHEME: &str = "dyn://";
/// When `in=text` the user doesn't need to know the model name, and doesn't need to provide it on
/// the command line. Hence it's optional, and defaults to this.
const INVISIBLE_MODEL_NAME: &str = "dynamo-run";
/// How we identify a python string endpoint
#[cfg(feature = "python")]
const PYTHON_STR_SCHEME: &str = "pystr:";
@@ -81,12 +85,21 @@ pub async fn run(
.or(flags.model_path_flag)
.and_then(|p| p.canonicalize().ok());
// Serve the model under the name provided, or the name of the GGUF file or HF repo.
let model_name = flags.model_name.or_else(|| {
model_path
.as_ref()
.and_then(|p| p.iter().last())
.map(|n| n.to_string_lossy().into_owned())
});
let model_name = flags
.model_name
.or_else(|| {
model_path
.as_ref()
.and_then(|p| p.iter().last())
.map(|n| n.to_string_lossy().into_owned())
})
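// No name from the flag or the model path; for plain Text input fall back to a default name.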
.or_else(|| {
if in_opt == Input::Text {
Some(INVISIBLE_MODEL_NAME.to_string())
} else {
None
}
});
// Load the model deployment card, if any
// Only used by some engines, so without those feature flags it's unused.
#[allow(unused_variables)]
......
@@ -17,6 +17,7 @@ use std::fmt;
use crate::ENDPOINT_SCHEME;
#[derive(PartialEq)]
pub enum Input {
/// Run an OpenAI compatible HTTP server
Http,
......