OpenDAS / text-generation-inference
Commit 97f7a22f (Unverified)
Authored Nov 07, 2024 by Wang, Yi
Committed by GitHub, Nov 07, 2024
add trust_remote_code in tokenizer to fix baichuan issue (#2725)
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
parent b1f9044d
Showing 3 changed files with 12 additions and 3 deletions
router/src/lib.rs +8 -2
router/src/server.rs +1 -0
router/src/validation.rs +3 -1
router/src/lib.rs
...
@@ -27,6 +27,7 @@ pub enum Tokenizer {
     Python {
         tokenizer_name: String,
         revision: Option<String>,
+        trust_remote_code: bool,
     },
     Rust(tokenizers::Tokenizer),
 }
...
@@ -38,15 +39,20 @@ impl<'a> PyTokenizer<'a> {
         py: Python<'a>,
         tokenizer_name: String,
         revision: Option<String>,
+        trust_remote_code: bool,
     ) -> PyResult<PyTokenizer<'a>> {
         let transformers = py.import_bound("transformers")?;
         let auto = transformers.getattr("AutoTokenizer")?;
         let from_pretrained = auto.getattr("from_pretrained")?;
         let args = (tokenizer_name,);
         let kwargs = if let Some(rev) = &revision {
-            [("revision", rev.to_string())].into_py_dict_bound(py)
+            [
+                ("revision", rev.to_string().into_py(py)),
+                ("trust_remote_code", trust_remote_code.into_py(py)),
+            ]
+            .into_py_dict_bound(py)
         } else {
-            pyo3::types::PyDict::new_bound(py)
+            [("trust_remote_code", trust_remote_code.into_py(py))].into_py_dict_bound(py)
         };
         let tokenizer = from_pretrained.call(args, Some(&kwargs))?;
         tracing::info!("Loaded a python tokenizer");
...
router/src/server.rs
...
@@ -1829,6 +1829,7 @@ pub async fn run(
                 Tokenizer::Python {
                     tokenizer_name: tokenizer_name.clone(),
                     revision: revision.clone(),
+                    trust_remote_code,
                 }
             }
         };
...
router/src/validation.rs
...
@@ -439,9 +439,11 @@ fn tokenizer_worker(
         Tokenizer::Python {
             tokenizer_name,
             revision,
+            trust_remote_code,
         } => {
             pyo3::Python::with_gil(|py| -> pyo3::PyResult<()> {
-                let tokenizer = PyTokenizer::from_py(py, tokenizer_name, revision)?;
+                let tokenizer =
+                    PyTokenizer::from_py(py, tokenizer_name, revision, trust_remote_code)?;
                 // Loop over requests
                 while let Some(((inputs, add_special_tokens, truncate), response_tx, parent_span)) =
                     receiver.blocking_recv()
...
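
The keyword-argument logic changed in `PyTokenizer::from_py` can be sketched without pyo3 as follows. This is only an illustration, not the router's code: `KwargValue` and `build_kwargs` are hypothetical names, and a `BTreeMap` stands in for the Python dict that the real code builds with `into_py_dict_bound`. It shows that `trust_remote_code` is now always passed, while `revision` is passed only when one was supplied.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a Python keyword-argument value.
/// The real code converts each value with pyo3's `into_py(py)`.
#[derive(Debug, Clone, PartialEq)]
enum KwargValue {
    Str(String),
    Bool(bool),
}

/// Hypothetical helper mirroring the branch structure in the diff:
/// `revision` is included only when present, `trust_remote_code` always.
fn build_kwargs(
    revision: Option<&str>,
    trust_remote_code: bool,
) -> BTreeMap<&'static str, KwargValue> {
    let mut kwargs = BTreeMap::new();
    if let Some(rev) = revision {
        kwargs.insert("revision", KwargValue::Str(rev.to_string()));
    }
    kwargs.insert("trust_remote_code", KwargValue::Bool(trust_remote_code));
    kwargs
}
```

Before this commit, the no-revision branch produced an empty dict, so `trust_remote_code` never reached `AutoTokenizer.from_pretrained`. In the sketch, the same branch yields a one-entry map.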