OpenDAS / ColossalAI · Commits

Commit 32cb7449 (Unverified)
Authored Jan 18, 2024 by Michelle; committed by GitHub on Jan 18, 2024

fix auto loading gpt2 tokenizer (#5279)
Parent: 5d9a0ae7
1 changed file with 13 additions and 0 deletions (+13, -0)
applications/ColossalQA/colossalqa/local/llm.py (+13, -0)
@@ -136,6 +136,19 @@ class ColossalLLM(LLM):
         """Get the identifying parameters."""
         return {"n": self.n}

+    def get_token_ids(self, text: str) -> List[int]:
+        """Return the ordered ids of the tokens in a text.
+
+        Args:
+            text: The string input to tokenize.
+
+        Returns:
+            A list of ids corresponding to the tokens in the text, in order they occur
+            in the text.
+        """
+        # use the colossal llm's tokenizer instead of langchain's cached GPT2 tokenizer
+        return self.api.tokenizer.encode(text)
+
 class VllmLLM(LLM):
     """
...
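Why this fixes the "auto loading gpt2 tokenizer" issue: LangChain's base `LLM` class counts tokens by calling `get_token_ids`, whose default implementation falls back to a cached GPT-2 tokenizer that may not match the served model's vocabulary. Overriding `get_token_ids` to delegate to the wrapped model's own tokenizer makes token counting consistent. The sketch below illustrates the delegation pattern only; `FakeTokenizer`, `FakeAPI`, and `ColossalLLMSketch` are hypothetical stand-ins, not classes from ColossalQA or LangChain.

```python
from typing import List


class FakeTokenizer:
    """Hypothetical stand-in for the model's real tokenizer (assumed to expose encode())."""

    def encode(self, text: str) -> List[int]:
        # Toy scheme for illustration: one id per whitespace-separated word.
        return [hash(word) % 50257 for word in text.split()]


class FakeAPI:
    """Hypothetical stand-in for the inference API wrapper that holds the tokenizer."""

    def __init__(self) -> None:
        self.tokenizer = FakeTokenizer()


class ColossalLLMSketch:
    """Minimal sketch of the patched behavior: token ids come from the model's
    own tokenizer instead of a cached GPT-2 tokenizer."""

    def __init__(self, api: FakeAPI) -> None:
        self.api = api

    def get_token_ids(self, text: str) -> List[int]:
        # Mirrors the commit: delegate to self.api.tokenizer.encode(text).
        return self.api.tokenizer.encode(text)

    def get_num_tokens(self, text: str) -> int:
        # LangChain's base LLM derives token counts from get_token_ids,
        # so overriding that one method also fixes token counting.
        return len(self.get_token_ids(text))


llm = ColossalLLMSketch(FakeAPI())
print(llm.get_num_tokens("hello world"))  # → 2
```

With this override in place, any LangChain utility that budgets prompts by token count sees lengths computed by the model's actual tokenizer rather than GPT-2's.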