In router mode, when --sleep-idle-seconds triggers, the child subprocess unloads the model from VRAM, but the process itself stays alive and attached to the GPU, so each idle subprocess still holds ~600 MiB of VRAM: # ...
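A rough way to check the residual per-process footprint described above is to read per-PID GPU memory from nvidia-smi. The sketch below assumes an NVIDIA GPU with nvidia-smi on PATH. The helper names (`parse_compute_apps`, `query_gpu_process_memory`, `idle_overhead_mib`) are hypothetical and not part of the router, and the PIDs of the idle subprocesses have to be supplied by hand.

```python
import subprocess


def parse_compute_apps(csv_text):
    """Parse 'pid, used_memory' CSV rows (MiB, no header, no units) into {pid: MiB}."""
    usage = {}
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        # Skip malformed rows and rows where the driver reports "[N/A]".
        if len(parts) != 2 or not parts[0].isdigit() or not parts[1].isdigit():
            continue
        usage[int(parts[0])] = int(parts[1])
    return usage


def query_gpu_process_memory():
    """Ask nvidia-smi for the GPU memory held by each compute process."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-compute-apps=pid,used_memory",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compute_apps(result.stdout)


def idle_overhead_mib(usage, idle_pids):
    """Sum the GPU memory (MiB) still held by the given idle subprocess PIDs."""
    return sum(usage.get(pid, 0) for pid in idle_pids)
```

Calling `idle_overhead_mib(query_gpu_process_memory(), idle_pids)` after the idle timeout has fired gives the total VRAM still pinned by sleeping subprocesses. PIDs that no longer appear in the nvidia-smi output count as 0.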