    sched: fix race leading to orphaned runners (#10599) · 5e380c3b
    Daniel Hiltgen authored
    If a client closes the connection while a model is loading, the request
    context is canceled mid-load.  If another request for the same model then
    arrives with a different configuration (context size, etc.) and so
    requires a reload, two unload events can be in flight.  The first shuts
    down the original model load, but the second caused the scheduler to lose
    its reference to the newly reloading runner, leaking that runner.
    
    The primary fix is detecting the duplicate unload and ignoring the second
    instance.  The load routine is also hardened to ensure we detect
    clobbering an already present runner and unload it with a warning.
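
A minimal Go sketch of the two safeguards described above, for illustration only. It is not the code from this commit: the names `runnerRef`, `scheduler`, `load`, and `unload` are hypothetical, and the real scheduler is event-driven rather than a simple locked map. The idea shown is that an unload is applied only if the runner it refers to is still the one being tracked, and that a load which would overwrite a different tracked runner shuts the old one down with a warning instead of dropping it silently.

```go
package main

import (
	"fmt"
	"sync"
)

// runnerRef is a hypothetical stand-in for the scheduler's handle on a runner.
type runnerRef struct {
	model  string
	numCtx int
	closed bool
}

// scheduler tracks at most one runner per model name (an assumption made
// to keep the sketch small).
type scheduler struct {
	mu     sync.Mutex
	loaded map[string]*runnerRef
}

func newScheduler() *scheduler {
	return &scheduler{loaded: make(map[string]*runnerRef)}
}

// load registers r. If a different runner is already tracked for the same
// model, it is shut down with a warning rather than silently overwritten.
// It reports whether an existing runner was clobbered.
func (s *scheduler) load(r *runnerRef) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	clobbered := false
	if old, ok := s.loaded[r.model]; ok && old != r {
		fmt.Printf("warning: replacing existing runner for %q, unloading it\n", r.model)
		old.closed = true
		clobbered = true
	}
	s.loaded[r.model] = r
	return clobbered
}

// unload applies an unload event for r. A duplicate or stale event, where r
// is no longer the tracked runner, is ignored and false is returned.
func (s *scheduler) unload(r *runnerRef) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	cur, ok := s.loaded[r.model]
	if !ok || cur != r {
		return false // duplicate or stale: must not touch the current runner
	}
	cur.closed = true
	delete(s.loaded, r.model)
	return true
}

func main() {
	s := newScheduler()
	orig := &runnerRef{model: "m", numCtx: 2048}
	s.load(orig)

	// The first unload event shuts down the original load.
	fmt.Println("first unload applied:", s.unload(orig))

	// A request with a different context size triggers a reload.
	reload := &runnerRef{model: "m", numCtx: 8192}
	s.load(reload)

	// The second, duplicate unload event for the original is ignored.
	fmt.Println("duplicate unload applied:", s.unload(orig))
	fmt.Println("reloading runner still tracked:", s.loaded["m"] == reload)
}
```

Comparing by pointer identity rather than by model name is the design choice that matters here: a late unload for the original runner can no longer remove the entry that now belongs to the reloading runner.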
server.go 30 KB