Commit ec46ed52 authored by Graham King, committed by GitHub

fix(dynamo-run): Text input doesn't need a name (#80)

For the `echo` and `pystr` engines we previously required the user to pass `--model-name <x>` so that we would have a name for the model. If the input is HTTP we do need this name, to match it against the user's JSON request.

If the input is Text we don't need a name from the user. So if the input is Text and we don't already have a name for the model, we give it a default one.
parent c8b70289
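In practice, an invocation along these lines should now work for text input without naming the model (a sketch: the engine path is the README's example and `my-model` is a placeholder; the default name `dynamo-run` comes from the new constant in the diff below):

```
# Previously a name was required even for plain text input:
dynamo-run in=text out=pystr:/home/user/my_python_engine.py --model-name my-model
# Now, with in=text, the name is optional and defaults to "dynamo-run":
dynamo-run in=text out=pystr:/home/user/my_python_engine.py
```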
@@ -170,12 +170,12 @@ Build: `cargo build --release --features python`
 If the Python engine wants to receive and returns strings - it will do the prompt templating and tokenization itself - run it like this:
 ```
-dynamo-run out=pystr:/home/user/my_python_engine.py --name <model-name>
+dynamo-run out=pystr:/home/user/my_python_engine.py
 ```
 - The `request` parameter is a map, an OpenAI compatible create chat completion request: https://platform.openai.com/docs/api-reference/chat/create
 - The function must `yield` a series of maps conforming to create chat completion stream response (example below).
-- The `--name` flag is the name we serve the model under, if using an HTTP front end.
+- If using an HTTP front-end add the `--model-name` flag. This is the name we serve the model under.
 The file is loaded once at startup and kept in memory.
@@ -218,7 +218,7 @@ dynamo-run out=pytok:/home/user/my_python_engine.py --model-path <hf-repo-checko
 {"token_ids":[791],"tokens":None,"text":None,"cum_log_probs":None,"log_probs":None,"finish_reason":None}
 ```
-- Command like flag `--model-path` which must point to a Hugging Face repo checkout containing the `tokenizer.json`. The `--name` flag is optional. If not provided we use the HF repo name (directory name) as the model name.
+- Command like flag `--model-path` which must point to a Hugging Face repo checkout containing the `tokenizer.json`. The `--model-name` flag is optional. If not provided we use the HF repo name (directory name) as the model name.
 Example engine:
 ```
...
@@ -36,6 +36,10 @@ pub use opt::{Input, Output};
 /// concatenations.
 const ENDPOINT_SCHEME: &str = "dyn://";
+/// When `in=text` the user doesn't need to know the model name, and doesn't need to provide it on
+/// the command line. Hence it's optional, and defaults to this.
+const INVISIBLE_MODEL_NAME: &str = "dynamo-run";
 /// How we identify a python string endpoint
 #[cfg(feature = "python")]
 const PYTHON_STR_SCHEME: &str = "pystr:";
@@ -81,12 +85,21 @@ pub async fn run(
         .or(flags.model_path_flag)
         .and_then(|p| p.canonicalize().ok());
     // Serve the model under the name provided, or the name of the GGUF file or HF repo.
-    let model_name = flags.model_name.or_else(|| {
-        model_path
-            .as_ref()
-            .and_then(|p| p.iter().last())
-            .map(|n| n.to_string_lossy().into_owned())
-    });
+    let model_name = flags
+        .model_name
+        .or_else(|| {
+            model_path
+                .as_ref()
+                .and_then(|p| p.iter().last())
+                .map(|n| n.to_string_lossy().into_owned())
+        })
+        .or_else(|| {
+            if in_opt == Input::Text {
+                Some(INVISIBLE_MODEL_NAME.to_string())
+            } else {
+                None
+            }
+        });
     // Load the model deployment card, if any
     // Only used by some engines, so without those feature flags it's unused.
     #[allow(unused_variables)]
...
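Read flat, the resulting fallback order is: explicit `--model-name`, then the last component of the model path (GGUF file or HF repo directory), then, new in this commit, a default name when the input is Text. Below is a minimal standalone sketch of that chain; `Input`, `INVISIBLE_MODEL_NAME`, and the closure bodies mirror the diff above, while `resolve_model_name` and `main` are illustrative stand-ins rather than the crate's real `run` internals.

```
// A standalone sketch of the name-resolution chain introduced by this commit.
// `Input` and INVISIBLE_MODEL_NAME mirror the diff; the function signature and
// `main` are illustrative stand-ins, not the crate's real API.
use std::path::PathBuf;

#[derive(PartialEq)]
enum Input {
    Text,
    Http,
}

/// Default served name when the input is Text and nothing better is available.
const INVISIBLE_MODEL_NAME: &str = "dynamo-run";

fn resolve_model_name(
    explicit_name: Option<String>, // the --model-name flag
    model_path: Option<PathBuf>,   // GGUF file or HF repo checkout
    in_opt: Input,
) -> Option<String> {
    explicit_name
        // Fall back to the last path component: the GGUF file name or HF repo directory.
        .or_else(|| {
            model_path
                .as_ref()
                .and_then(|p| p.iter().last())
                .map(|n| n.to_string_lossy().into_owned())
        })
        // New in this commit: for Text input any name will do, so pick a default.
        .or_else(|| {
            if in_opt == Input::Text {
                Some(INVISIBLE_MODEL_NAME.to_string())
            } else {
                None
            }
        })
}

fn main() {
    // Text input with no name and no path now gets the default name.
    assert_eq!(
        resolve_model_name(None, None, Input::Text),
        Some("dynamo-run".to_string())
    );
    // HTTP input still needs a real name, from the flag or the model path.
    assert_eq!(resolve_model_name(None, None, Input::Http), None);
    // An explicit name always wins.
    assert_eq!(
        resolve_model_name(Some("my-model".into()), None, Input::Text),
        Some("my-model".to_string())
    );
}
```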
@@ -17,6 +17,7 @@ use std::fmt;
 use crate::ENDPOINT_SCHEME;
+#[derive(PartialEq)]
 pub enum Input {
     /// Run an OpenAI compatible HTTP server
     Http,
...