    cmd: defer stating model info until necessary (#5248) · 2aa91a93
    Blake Mizerany authored
    This commit changes the 'ollama run' command to defer fetching model
    information until it is actually needed, that is, in interactive mode.
    
    It also removes one case where the model information is fetched in
    duplicate: once just before calling generateInteractive and then again,
    first thing, in generateInteractive.
    
    This cuts the wall-clock time of a one-shot run from about 1.2s to
    roughly 0.5s:
    
        ; time ./before run llama3 'hi'
        Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
    
        ./before run llama3 'hi'  0.02s user 0.01s system 2% cpu 1.168 total
        ; time ./before run llama3 'hi'
        Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
    
        ./before run llama3 'hi'  0.02s user 0.01s system 2% cpu 1.220 total
        ; time ./before run llama3 'hi'
        Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
    
        ./before run llama3 'hi'  0.02s user 0.01s system 2% cpu 1.217 total
        ; time ./after run llama3 'hi'
        Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
    
        ./after run llama3 'hi'  0.02s user 0.01s system 4% cpu 0.652 total
        ; time ./after run llama3 'hi'
        Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
    
        ./after run llama3 'hi'  0.01s user 0.01s system 5% cpu 0.498 total
        ; time ./after run llama3 'hi'
        Hi! It's nice to meet you. Is there something I can help you with or would you like to chat?
    
        ./after run llama3 'hi'  0.01s user 0.01s system 3% cpu 0.479 total
        ; time ./after run llama3 'hi'
        Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
    
        ./after run llama3 'hi'  0.02s user 0.01s system 5% cpu 0.507 total
        ; time ./after run llama3 'hi'
        Hi! It's nice to meet you. Is there something I can help you with, or would you like to chat?
    
        ./after run llama3 'hi'  0.02s user 0.01s system 5% cpu 0.507 total
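
The deferral described above can be sketched roughly as follows. This is a minimal illustration, not the code changed by this commit: `modelInfo`, `infoFetcher`, `runOptions`, `run`, and `generate` are hypothetical names, and the signature given to `generateInteractive` is an assumption. The point is only that the one-shot path never asks for model information, and the interactive path asks for it exactly once.

```go
package main

import "fmt"

// modelInfo is a hypothetical stand-in for the metadata a server returns
// about a model.
type modelInfo struct {
	Name       string
	MultiModal bool
}

// infoFetcher is a hypothetical stand-in for the client call that fetches
// model information. Passing it as a value lets callers defer the call.
type infoFetcher func(name string) (modelInfo, error)

// runOptions is a hypothetical, trimmed-down set of options for a run.
type runOptions struct {
	Model  string
	Prompt string // non-empty means one-shot (non-interactive) mode
}

// run dispatches to the one-shot or interactive path. It never calls fetch
// itself, so the one-shot path pays no cost for model information.
func run(opts runOptions, fetch infoFetcher) (string, error) {
	if opts.Prompt != "" {
		return generate(opts), nil
	}
	return generateInteractive(opts, fetch)
}

// generate is a placeholder for the one-shot path.
func generate(opts runOptions) string {
	return "reply to: " + opts.Prompt
}

// generateInteractive is the only place that fetches model information,
// and it does so once, at the point where it is first needed.
func generateInteractive(opts runOptions, fetch infoFetcher) (string, error) {
	info, err := fetch(opts.Model)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("interactive session for %s (multimodal=%t)",
		info.Name, info.MultiModal), nil
}

func main() {
	calls := 0
	// A counting fake in place of a real network call.
	fetch := func(name string) (modelInfo, error) {
		calls++
		return modelInfo{Name: name}, nil
	}

	out, _ := run(runOptions{Model: "llama3", Prompt: "hi"}, fetch)
	fmt.Println(out, "| fetches so far:", calls)

	out, _ = run(runOptions{Model: "llama3"}, fetch)
	fmt.Println(out, "| fetches so far:", calls)
}
```

Passing the fetcher into `run` instead of calling it up front is one way to keep the lookup off the one-shot path; the actual commit may structure this differently.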
interactive.go 19 KB