synchronization target usage in services which are not always run as a part of the boot or shutdown #21

Open
geissonator opened this issue May 27, 2022 · 2 comments


Ran into an architecture-type problem recently. There is a service that is part of the power off but is also part of the host quiesce. This service utilizes a power off synchronization target:

Wants=obmc-host-stop-pre@%i.target
Before=obmc-host-stop-pre@%i.target

So say you power on your system and just run this service on its own. You actually end up starting all of the synchronization targets!

May 27 19:55:24 p10bmc systemd[1]: Reached target Stop Host0 (Pre).
May 27 19:55:24 p10bmc systemd[1]: Reached target Host0 (Stopping).
May 27 19:55:24 p10bmc systemd[1]: Reached target Host0 (Stopped).

And then, if you issue an actual power off, all of the synchronization targets are already active, so systemd has nothing left to start for them and you get no synchronization!

The real world example we (IBM) hit this on was with the [email protected]. This service is called when the host goes to the Quiesce state. After Quiesce, the host-state-manager initiates a reboot of the host firmware, which starts the power off process. That power off process no longer has the synchronization targets to coordinate the power down.

I see a few potential solutions:

  • Don't allow a service that is not run in the power on or power off targets to use synchronization targets (make them explicitly set their service dependencies) (probably the best short term solution)
  • Ensure that when the power on or off targets are started, they stop all of the synchronization targets (not really sure how we'd do this, maybe some service that runs first?)
  • Get rid of synchronization targets and just do explicit service to service dependencies
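
A minimal sketch of the first option, assuming a hypothetical template unit named example-quiesce-helper@.service (the unit name, description, and ExecStart path are illustrative only and do not come from any OpenBMC repository). The unit keeps only the ordering clause, so starting it on its own should not pull the synchronization target into the transaction:

```ini
# example-quiesce-helper@.service (hypothetical name, for illustration only)
[Unit]
Description=Example helper for host %i used in both the power off and quiesce paths
# No Wants= or Requires= on the synchronization target, so starting this
# service by itself does not also start obmc-host-stop-pre@%i.target.
# Before= is ordering only: it takes effect when both units are queued
# in the same transaction, for example during a real power off.
Before=obmc-host-stop-pre@%i.target

[Service]
Type=oneshot
# Hypothetical binary path
ExecStart=/usr/bin/example-quiesce-helper %i
```

This only works if something else, such as the primary power off target, is responsible for starting the synchronization targets.
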
@geissonator

Services that cause the above issue (on IBM systems):

[email protected]
[email protected]
pldmSoftPowerOff.service

@geissonator

https://gerrit.openbmc.org/c/openbmc/phosphor-state-manager/+/40026 was a commit to ensure the synchronization targets are started as a part of the primary power on and off targets. So with that change in, removing the Wants/Requires from the above services should be fine.
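
As a rough illustration of what "started as a part of the primary power off target" could look like, here is a sketch that assumes the primary target is named obmc-host-stop@.target and that the other two synchronization targets are named after the log lines above. Only obmc-host-stop-pre@%i.target appears in this issue; the remaining names are assumptions, and the real unit in phosphor-state-manager may differ:

```ini
# obmc-host-stop@.target (assumed name; sketch only)
[Unit]
Description=Stop Host%i
# The primary target pulls in the synchronization targets itself, so
# individual services only need Before=/After= ordering against them.
Wants=obmc-host-stop-pre@%i.target
# The two names below are assumptions inferred from the journal output
# "Host0 (Stopping)" and "Host0 (Stopped)".
Wants=obmc-host-stopping@%i.target
Wants=obmc-host-stopped@%i.target
```
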

geissonator added a commit to openbmc/openbmc that referenced this issue Jul 18, 2022
openbmc/phosphor-state-manager#21 highlights an architecture issue with
OpenBMC's use of synchronization targets. When a service, such as
[email protected], runs both in a standard power off target,
as well as in other paths (like the host quiesce path), there is an
issue.

The service starts the synchronization targets in the quiesce path and
this causes them to already be running on the power off, resulting in
the synchronization targets not actually coordinating the power off.

The direction this commit takes OpenBMC is that if a service needs to
run outside of the standard power on or off path, then they can not
have a Wants or Requires clause in the service file.

The following commit was done a while back to address this issue:
  https://gerrit.openbmc.org/c/openbmc/phosphor-state-manager/+/40026

That is that we ensure the primary power on and off targets start the
synchronization targets so services requiring them can just use a
Before or After clause.

The piece that was never done was to go and fix the services which fell
into this bucket.

Add an explicit dependency on the stop-instructions service to ensure
that this service is always run before it when they are both started
at the same time. This just provides an extra level of protection to
ensure we never stop host instructions before disabling occ
monitoring.

Tested:
- Did multiple boots, reboots, and host crash tests and saw no issues

Signed-off-by: Andrew Geissler <[email protected]>
Change-Id: I6c8c32a605216c0c3dc2065f7c09236d2c216720
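
The explicit ordering described in the commit message above could be sketched as the following unit fragment. The stop-instructions unit name used here is hypothetical, since the real service names are not legible in this thread:

```ini
# Fragment of the service that disables OCC monitoring (sketch only)
[Unit]
# Ordering against the synchronization target, with no Wants=/Requires=.
Before=obmc-host-stop-pre@%i.target
# Explicit service-to-service ordering: if both services are started in
# the same transaction, this one runs before host instructions are stopped.
# example-stop-instructions@%i.service is a hypothetical unit name.
Before=example-stop-instructions@%i.service
```

Because Before= is ordering only, it does not start the stop-instructions service when this unit is run on its own.
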
geissonator added a commit to openbmc/openpower-proc-control that referenced this issue Jul 19, 2022
openbmc/phosphor-state-manager#21 highlights an architecture issue with
OpenBMC's use of synchronization targets. When a service, such as
[email protected], runs both in a standard power off
target, as well as in other paths (like the host quiesce path), there
is an issue.

The service starts the synchronization targets in the quiesce path and
this causes them to already be running on the power off, resulting in
the synchronization targets not actually coordinating the power off.

The direction this commit takes OpenBMC is that if a service needs to
run outside of the standard power on or off path, then they can not
have a Wants or Requires clause in the service file.

The following commit was done a while back to address this issue:
  https://gerrit.openbmc.org/c/openbmc/phosphor-state-manager/+/40026

That is that we ensure the primary power on and off targets start the
synchronization targets so services requiring them can just use a
Before or After clause.

The piece that was never done was to go and fix the services which fell
into this bucket.

Tested:
- Did multiple boots, reboots, and host crash tests and saw no issues

Signed-off-by: Andrew Geissler <[email protected]>
Change-Id: Ida68d83e2c2c18484eb4f28bc55c91fa5ff94930
bradbishop pushed a commit to openbmc/pldm that referenced this issue Jul 20, 2022
openbmc/phosphor-state-manager#21 highlights an architecture issue with
OpenBMC's use of synchronization targets. When a service, such as
pldmSoftPowerOff.service, runs both in a standard power off
target, as well as in other paths (like the host graceful quiesce
path), there is an issue.

The service starts the synchronization targets in the quiesce path and
this causes them to already be running on the power off, resulting in
the synchronization targets not actually coordinating the power off.

The direction this commit takes OpenBMC is that if a service needs to
run outside of the standard power on or off path, then they can not
have a Wants or Requires clause in the service file.

The following commit was done a while back to address this issue:
  https://gerrit.openbmc.org/c/openbmc/phosphor-state-manager/+/40026

That is that we ensure the primary power on and off targets start the
synchronization targets so services requiring them can just use a
Before or After clause.

The piece that was never done was to go and fix the services which fell
into this bucket.

Tested:
- Did multiple boots, reboots, and host crash tests and saw no issues

Signed-off-by: Andrew Geissler <[email protected]>
Change-Id: I7260f4aad666acf127f9766cf27dd54f4a18ebe4
geissonator added a commit to geissonator/pldm that referenced this issue Jul 20, 2022
geissonator added a commit to geissonator/openpower-proc-control that referenced this issue Jul 20, 2022
rfrandse pushed a commit to ibm-openbmc/openpower-proc-control that referenced this issue Jul 21, 2022
rfrandse pushed a commit to ibm-openbmc/pldm that referenced this issue Jul 22, 2022
rfrandse pushed a commit to ibm-openbmc/openbmc that referenced this issue Jul 25, 2022
rfrandse pushed a commit to ibm-openbmc/pldm that referenced this issue Sep 12, 2022
rfrandse pushed a commit to ibm-openbmc/openbmc that referenced this issue Oct 25, 2022
rfrandse pushed a commit to ibm-openbmc/openbmc that referenced this issue Oct 26, 2022
spinler pushed a commit to spinler/openbmc that referenced this issue Nov 15, 2022