start agent when visiting the simulations list #7319
Comments
I'll handle this.
@moellep I want to confirm that you are only seeing this after the agent has been idle for too long and was killed. If I restart the server and visit the sim list, then I see an agent is started. If I do that, wait for the agent to be killed after being idle for too long, and then refresh the sim list, then no agent is started.
Looking at the code, I realize now that a new agent isn't started because of the interval for sending beginSession requests. In the default config that interval is 5m and idle_check_secs is 1800s. So, assuming a default config (which we are running in prod), those two numbers should (I think) play well together. For testing I reduced idle_check_secs to 10s and _REFRESH_SESSION to 5s, started an agent by visiting the sim list, waited for the agent to be killed, refreshed, and a new agent was started. @moellep can you help with steps to reproduce the problem?
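To make the timing interaction above concrete, here is a minimal sketch of how a beginSession throttle and an idle timeout could interact. It is not the project's actual code: the function names and the exact comparison rules are assumptions for illustration, and only the two default values (5m and 1800s) come from the comment above.

```python
# Defaults quoted in the comment above; everything else here is hypothetical.
REFRESH_SESSION_SECS = 5 * 60  # interval between beginSession requests
IDLE_CHECK_SECS = 1800  # idle time after which the agent is killed


def should_send_begin_session(now, last_sent, refresh_secs=REFRESH_SESSION_SECS):
    """Assumed throttle: send only once the refresh interval has elapsed."""
    return last_sent is None or now - last_sent >= refresh_secs


def agent_is_alive(now, last_activity, idle_secs=IDLE_CHECK_SECS):
    """Assumed idle check: the agent is killed once idle for idle_secs."""
    return now - last_activity < idle_secs


def refresh_restarts_agent(now, last_sent, last_activity, refresh_secs, idle_secs):
    """True if refreshing the sim list at `now` would start a new agent."""
    dead = not agent_is_alive(now, last_activity, idle_secs)
    return dead and should_send_begin_session(now, last_sent, refresh_secs)
```

Under these assumptions, the default values restart the agent on refresh, because the 5m interval has always elapsed by the time the 1800s idle timeout fires. A config where the idle timeout is shorter than the refresh interval (for example 10s versus 5m) would leave a window in which a refresh does not start a new agent.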
Ah - I think I'm seeing something different. Yes, I see log messages that the agent has started, and when I visit an SRW simulation it is immediately available, so this is not a regression. I'm seeing an initial delay in apps like openmc, where the initial page (geometry in this case) waits for the serverStatus reply before it renders. The initial call to serverStatus (after a dev restart, or after the agent has been stopped on prod) takes around 10 seconds, but only on the first visit. If I visit a different sim, it replies very quickly.
Ok, that makes sense. On the first visit we have no option but to wait for the agent to start again if it has been killed (idle timeout or server restart). I think this can be closed, but lmk if there is still something to be solved.
I was hoping we could somehow preload the 10-second delay for the serverStatus. Something is getting initialized there which isn't related to starting the agent (maybe the fastCGI?). I only see the delay when visiting the first sim, so maybe that initialization could be included with beginSession.
I have a fix for this in 6784-fastcgi-improve. There are multiple problems.
Thanks Rob. Let's talk about this when you're back and I can look into finishing the work on your branch. From my read, the problem is, as @moellep expected, waiting on fastcgi. openmc does a statefulCompute before initSimulation. That statefulCompute depends on fastcgi, which must be started and must reply before we can get the status back. All of the round trips and starting fastcgi take time. I think we could prestart fastcgi as part of beginSession, but I need to think about that a little harder before I'm convinced. It probably makes sense to fold it into Rob's work.
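The prestart idea floated above could look roughly like the following toy model. This is a sketch only: `FakeFastCgi`, `begin_session`, and `stateful_compute` are hypothetical stand-ins written for this illustration and do not reflect the project's real classes or the work on the 6784-fastcgi-improve branch.

```python
class FakeFastCgi:
    """Hypothetical stand-in for a lazily started fastcgi worker."""

    def __init__(self):
        self.started = False

    def start(self):
        # The real startup is where the multi-second cost would be paid.
        self.started = True


def begin_session(worker, prestart):
    """Assumed hook: optionally warm the worker when the sim list is visited."""
    if prestart:
        worker.start()


def stateful_compute(worker):
    """Report whether this call had to pay the startup cost itself."""
    if not worker.started:
        worker.start()
        return "cold"
    return "warm"


# Without prestart, the first compute pays the startup cost.
lazy = FakeFastCgi()
begin_session(lazy, prestart=False)
first_without_prestart = stateful_compute(lazy)
second_without_prestart = stateful_compute(lazy)

# With prestart, the cost is moved into beginSession.
eager = FakeFastCgi()
begin_session(eager, prestart=True)
first_with_prestart = stateful_compute(eager)
```

In this model the startup cost does not disappear. It moves from the first statefulCompute to beginSession, which is only a win if beginSession can run it without blocking the page the user is looking at.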
This is a regression of #4230: the agent is no longer getting prestarted when someone visits the simulation list. This causes a 10-second delay when opening a simulation while waiting for the agent to initialize. See also #7318.