Scheduler dies when trying to start instance with invalid state #126

Open
Metallion opened this issue Feb 20, 2017 · 0 comments

Problem

Steps to reproduce:

  1. Start an LXC instance and wait until running.
  2. Reboot executor machine.

Now the LXC container is stopped but OpenVDC thinks it's running.

  3. Start the instance.

Result:

Feb 20 14:38:24 ci openvdc-scheduler[2806]: 2017-02-20 14:38:24 [FATAL] github.com/axsh/openvdc/api/instance_service.go:86 BUGON: Detected un-handled state instance_id=i-0000000000 state=state:RUNNING created_at:<seconds:1487314564 nanos:237858284 >
Feb 20 14:38:24 ci systemd[1]: openvdc-scheduler.service: main process exited, code=exited, status=1/FAILURE
Feb 20 14:38:24 ci systemd[1]: Unit openvdc-scheduler.service entered failed state.
Feb 20 14:38:24 ci systemd[1]: openvdc-scheduler.service failed.

The openvdc-scheduler service dies.

# systemctl status openvdc-scheduler
● openvdc-scheduler.service - OpenVDC scheduler
   Loaded: loaded (/usr/lib/systemd/system/openvdc-scheduler.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Mon 2017-02-20 14:38:24 JST; 6min ago
  Process: 2806 ExecStart=/opt/axsh/openvdc/bin/openvdc-scheduler (code=exited, status=1/FAILURE)
 Main PID: 2806 (code=exited, status=1/FAILURE)

Suggested solution

  • On executor start, OpenVDC should check that all instances are in their expected state. Any that are not should be brought to the state OpenVDC expects them to be in.

  • When start is called on a container OpenVDC thinks is "RUNNING", first check which state the instance is actually in. Then update the recorded state to match and run the start command against that corrected state.

  • Make sure that the scheduler never dies, no matter which state the instance is in when start is called.

Other suggestions welcome. ^_^
