diff --git a/docs/liquids/ironic.md b/docs/liquids/ironic.md index c12da0b8..8410a9ed 100644 --- a/docs/liquids/ironic.md +++ b/docs/liquids/ironic.md @@ -40,7 +40,7 @@ If `with_subresources` is set, each `instances_$FLAVOR` resource will have one s | `attributes.tags` | array of strings | User-supplied tags on this instance according to Nova. | | `attributes.os_type` | string | The OS type, as inferred from the image that was used to boot this instance. | -TODO: `os_type` inference is shared with Nova. When the Nova subresource scraping is moved to LIQUID, the method shall be documented over there, and a backreference shall be added here. +The logic for `os_type` inference is shared with liquid-nova, and is explained [in the documentation for liquid-nova](./nova.md#os-type-inference). ### Considerations for cloud operators @@ -49,7 +49,7 @@ You need to make sure that the extra specs on your Ironic flavors are all set up Furthermore, Nova needs to be patched to ignore the usual quotas for instances of Ironic flavors. Instead, Nova must accept quotas with the same naming pattern (`instances_$FLAVOR`), and only enforce these quotas when accepting new instances using Ironic flavors, without counting those instances towards the usual quotas. -In SAP Converged Cloud, Nova carries a custom patch set that triggers this behavior on presence of the `quota:instance_only` and `quota:separate` extra specs. +In SAP Cloud Infrastructure, Nova carries a custom patch set that triggers this behavior on presence of the `quota:instance_only` and `quota:separate` extra specs. ## Capacity calculation diff --git a/docs/liquids/nova.md b/docs/liquids/nova.md index 9824872e..fac5ab10 100644 --- a/docs/liquids/nova.md +++ b/docs/liquids/nova.md @@ -7,18 +7,131 @@ This liquid provides support for the compute service Nova. ## Service-specific configuration -TODO +| Field | Type | Description | +| ----- | ---- | ----------- | +| `binpack_behavior.score_ignores_cores`
`binpack_behavior.score_ignores_disk`
`binpack_behavior.score_ignores_ram` | boolean | If true, when ranking nodes during placement, do not include the respective dimension in the score. (This only affects instances of split flavors. [See below](#split-flavors) for details.) |
+| `flavor_selection.excluded_extra_specs` | map[string]string | Exclude flavors that have any of these extra specs. |
+| `flavor_selection.required_extra_specs` | map[string]string | Only match flavors that have all of these extra specs. |
+| `hypervisor_selection.aggregate_name_pattern` | regexp | Only match hypervisors that reside in an aggregate matching this pattern. If a hypervisor resides in multiple matching aggregates, an error is raised. It is recommended to use AZ-wide aggregates here. |
+| `hypervisor_selection.hypervisor_type_pattern` | regexp | Only match hypervisors with a `hypervisor_type` attribute matching this pattern. |
+| `hypervisor_selection.required_traits` | []string | Only match hypervisors whose resource providers have all of the traits listed without a `!` prefix and none of the traits listed with a `!` prefix. |
+| `hypervisor_selection.shadowing_traits` | []string | If a hypervisor matches any of the rules in this configuration field (using the same logic as above for `required_traits`), the hypervisor will be considered shadowed. Its capacity will not be counted. (This affects capacity calculation for split flavors. [See below](#split-flavors) for details.) |
+| `ignored_traits` | []string | Traits that are ignored when verifying that all pooled flavors agree on which trait-match extra specs they use. |
+| `with_subcapacities` | boolean | If true, subcapacities are reported. |
+| `with_subresources` | boolean | If true, subresources are reported. |
 
 ## Resources
 
-TODO: Description
+The standard roster of Nova quotas is supported:
 
 | Resource | Unit | Capabilities |
 | --- | --- | --- |
 | `cores` | None | HasCapacity = true, HasQuota = true |
-| `ram` | MiB | HasCapacity = true, HasQuota = true |
 | `instances` | None | HasCapacity = true, HasQuota = true |
+| `ram` | MiB | HasCapacity = true, HasQuota = true |
 | `server_groups` | None | HasCapacity = false, HasQuota = true |
 | `server_group_members` | None | HasCapacity = false, HasQuota = true |
-| `instances_$FLAVOR_NAME` | None | HasCapacity = true, HasQuota = true |
+
+Additionally, there is one resource for each flavor that carries the `quota:separate = "true"` extra spec:
+
+| Resource | Unit | Capabilities |
+| --- | --- | --- |
+| `instances_${FLAVOR_NAME}` | None | HasCapacity = true, HasQuota = true |
+
+These flavors are called **split flavors** in this documentation (since their quota is split from the usual quotas).
+All other flavors are called **pooled flavors** (since they draw from the default quota pool for `cores`, `instances` and `ram`).
+Resources for split flavors are not generated for Ironic flavors (those with the extra spec `capabilities:hypervisor_type = "ironic"`).
+
+If `with_subresources` is set, each `instances` or `instances_${FLAVOR_NAME}` resource will have one subresource for each instance of the respective flavor(s), with the following fields:
+
+| Field | Type | Description |
+| ----- | ---- | ----------- |
+| `id` | string | The UUID of the Nova instance. |
+| `name` | string | The human-readable name of the Nova instance. |
+| `attributes.status` | string | The status of the instance according to Nova. |
+| `attributes.metadata` | object of strings | User-supplied key-value data on this instance according to Nova. |
+| `attributes.tags` | array of strings | User-supplied tags on this instance according to Nova. |
+| `attributes.flavor.name` | string | The name of this instance's flavor. |
+| `attributes.flavor.vcpu` | integer | The number of virtual cores available to this instance. |
+| `attributes.flavor.ram_mib` | integer | The amount of RAM available to this instance, in MiB. |
+| `attributes.flavor.disk_gib` | integer | The amount of local disk available to this instance, in GiB. |
+| `attributes.flavor.video_ram_mib` | integer | The amount of video RAM available to this instance, in MiB. |
+| `attributes.os_type` | string | The OS type, as inferred from the image that was used to boot this instance. [See below](#os-type-inference) for details. |
+
+**TODO:** There is incomplete pre-alpha-level support for `hw_version`-separated pooled quotas, which will not be documented here until the implementation is complete.
+
+### Considerations for cloud operators
+
+If split flavors are used, Nova needs to be patched to ignore the usual quotas for instances of flavors with the `quota:separate = "true"` extra spec.
+Instead, Nova must accept quotas with the same naming pattern (`instances_${FLAVOR_NAME}`), and only enforce these quotas when accepting new instances using these split flavors, without counting those instances towards the usual quotas.
+In SAP Cloud Infrastructure, Nova carries a custom patch set that implements this behavior.
+
+### OS type inference
+
+On instance subresources, the `os_type` indicates which OS is likely running on the instance.
+This is intended to be used for billing of OS licenses.
+
+For instances booted from an image, the image metadata is inspected in Glance.
+The `os_type` is (in order of priority):
+
+- `image-unknown`, if no valid image reference exists in the instance metadata
+- `image-deleted`, if the image has been deleted since the instance was booted
+- the value in the `vmware_ostype` attribute on the image metadata, if that field exists and the value is valid
+- `$TYPE`, if the image metadata contains a tag of the form `ostype:$TYPE`
+- `unknown`, if no other rule matches
+
+For instances booted using a Cinder volume as root disk, the volume metadata is inspected in Cinder by looking for the volume attached at `/dev/sda` or `/dev/vda`.
+The `os_type` is (in order of priority):
+
+- `rootdisk-missing`, if the boot volume has an empty ID
+- `rootdisk-inspect-error`, if the boot volume cannot be located or if its metadata cannot be inspected in Cinder
+- the value in the `volume_image_metadata.vmware_ostype` attribute on the volume metadata, if that field exists and the value is valid
+- `unknown`, if no other rule matches
+
+## Capacity calculation
+
+On the most basic level, pooled capacity is calculated by enumerating Nova hypervisors matching the configured `hypervisor_selection` and taking a sum over their total capacity:
+
+| Resource | Method |
+| --- | --- |
+| `compute/cores` | The sum of the reported CPUs for all matching hypervisors. Note that the hypervisor statistics reported by Nova do not take overcommit into account, so you may have to configure the overcommitment again in Limes for accurate capacity reporting. |
+| `compute/instances` | Estimated per AZ as `10000 * count(matchingAggregates)`, but never more than `sumLocalDisk / maxDisk`, where `sumLocalDisk` is the sum of the local disk size for all matching hypervisors, and `maxDisk` is the largest disk requirement of all pooled flavors matching the configured `flavor_selection`. |
+| `compute/ram` | The sum of the reported RAM for all matching hypervisors. |
+
+### Split flavors
+
+If there are split flavors (as defined above), the capacity for `compute/instances_${FLAVOR_NAME}` eats into the pooled capacity.
+For example, if a split flavor named `foo` with 32 vCPUs is reported with a capacity of 5 instances, then the `compute/cores` capacity is reduced by `5 * 32 = 160` (and analogously for `compute/instances` and `compute/ram`).
+
+Capacity calculation is not as straightforward as for pooled resources:
+Nova and Placement only tell us about the size of the hypervisors in terms of CPU, RAM and local disk.
+But the capacity needs to be reported in terms of the number of instances that can be deployed per flavor.
+There is no single correct answer to this because different flavors use different amounts of resources.
+
+Our calculation algorithm takes existing usage and confirmed commitments, as well as commitments that are waiting to be confirmed,
+for all split-flavor resources and simulates placing those existing and requested instances onto the matching hypervisors.
+Afterwards, any remaining space is filled up by following the existing distribution of flavors as closely as possible.
+The resulting capacity is equal to how many instances could be placed in this simulation.
+The placement simulation strives for an optimal result by using a binpacking algorithm.
+
+When pooled flavors and split flavors can be placed on the same hypervisor,
+pooled-flavor demand of equal or higher priority blocks capacity while demand for split flavors is being placed.
+When filling up the remaining space with extra split-flavor instances (as described above),
+extra instances are only placed into the "fair share" of the split flavors when compared with the pooled flavors.
+For example, if current demand of CPU, RAM and disk is 10% in split flavors and 30% in pooled flavors, that's a ratio of 1:3.
+Therefore, split-flavor instances will be filled up to 25% of the total space to leave 75% for pooled flavors, thus matching this ratio.
+
+#### A visual metaphor for capacity calculation
+
+It can be helpful to think of the hypervisors as vessels that can contain solids or liquids.
+The split-flavor instances have fixed sizes (according to their flavor configuration), so they behave like solid blocks.
+The pooled resources do not have a fixed size, so they behave like a liquid that can fill the gaps between the solid blocks.
+When simulating placement:
+
+- We pour liquid into the hypervisor according to the usage for pooled resources.
+  (Usage comes first because it is the highest-priority form of demand.)
+- We then put solid blocks into the hypervisor according to the usage for split flavors,
+  but only as long as the liquid does not overflow.
+- All of this is then repeated for the other forms of demand, in decreasing order of priority:
+  first unused commitments, then pending commitments.
+- Finally, if any hypervisors are not full, we fill up with solid blocks and liquid at the same time until nothing more fits.
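+
+To make the order of operations concrete, the sketch below condenses the simulation into plain arithmetic.
+It is not the code from `capacity.go`: the scalar "space" value, the phase slices and all numbers are made up for illustration, whereas the real implementation packs vectors of CPU, RAM and local disk per hypervisor and uses a fair-share scoring scheme for the final fill.
+
+```go
+package main
+
+import "fmt"
+
+func main() {
+    totalSpace := 1000.0 // hypothetical total space of all matching hypervisors
+
+    // demand per phase (usage, unused commitments, pending commitments),
+    // in decreasing order of priority
+    pooledDemand := []float64{300, 50, 20} // the "liquid"
+    splitDemand := []float64{100, 30, 10}  // the "solid blocks"
+
+    blocked, placed := 0.0, 0.0
+    for phase := range pooledDemand {
+        blocked += pooledDemand[phase] // pooled demand only blocks space ...
+        if placed+splitDemand[phase] <= totalSpace-blocked {
+            placed += splitDemand[phase] // ... while split-flavor demand is actually placed
+        }
+    }
+
+    // final fill: split flavors may only grow into their fair share of the
+    // total space, derived from the current demand distribution
+    fairShare := totalSpace * (placed / (placed + blocked))
+    if fairShare > placed {
+        placed = fairShare
+    }
+
+    fmt.Printf("capacity reported for split flavors: %.0f of %.0f units\n", placed, totalSpace)
+}
+```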
diff --git a/internal/api/translation_test.go b/internal/api/translation_test.go index 23ce6953..fc00d2bd 100644 --- a/internal/api/translation_test.go +++ b/internal/api/translation_test.go @@ -150,6 +150,50 @@ func TestTranslateIronicSubcapacities(t *testing.T) { testSubcapacityTranslation(t, "ironic-flavors", extraSetup, subcapacitiesInLiquidFormat, subcapacitiesInLegacyFormat) } +func TestTranslateNovaSubcapacities(t *testing.T) { + subcapacitiesInLiquidFormat := []assert.JSONObject{ + { + "name": "nova-compute-bb91", + "capacity": 448, + "usage": 1101, + "attributes": assert.JSONObject{ + "aggregate_name": "vc-a-0", + "traits": []string{"COMPUTE_IMAGE_TYPE_ISO", "COMPUTE_IMAGE_TYPE_VMDK", "COMPUTE_NET_ATTACH_INTERFACE", "COMPUTE_NODE", "COMPUTE_RESCUE_BFV", "COMPUTE_SAME_HOST_COLD_MIGRATE", "CUSTOM_BIGVM_DISABLED"}, + }, + }, + { + "name": "nova-compute-bb274", + "capacity": 104, + "usage": 315, + "attributes": assert.JSONObject{ + "aggregate_name": "vc-a-1", + "traits": []string{"COMPUTE_IMAGE_TYPE_ISO", "COMPUTE_IMAGE_TYPE_VMDK", "COMPUTE_NET_ATTACH_INTERFACE", "COMPUTE_NODE", "COMPUTE_RESCUE_BFV", "COMPUTE_SAME_HOST_COLD_MIGRATE"}, + }, + }, + } + + subcapacitiesInLegacyFormat := []assert.JSONObject{ + { + "service_host": "nova-compute-bb91", + "az": "az-one", + "aggregate": "vc-a-0", + "capacity": 448, + "usage": 1101, + "traits": []string{"COMPUTE_IMAGE_TYPE_ISO", "COMPUTE_IMAGE_TYPE_VMDK", "COMPUTE_NET_ATTACH_INTERFACE", "COMPUTE_NODE", "COMPUTE_RESCUE_BFV", "COMPUTE_SAME_HOST_COLD_MIGRATE", "CUSTOM_BIGVM_DISABLED"}, + }, + { + "service_host": "nova-compute-bb274", + "az": "az-one", + "aggregate": "vc-a-1", + "capacity": 104, + "usage": 315, + "traits": []string{"COMPUTE_IMAGE_TYPE_ISO", "COMPUTE_IMAGE_TYPE_VMDK", "COMPUTE_NET_ATTACH_INTERFACE", "COMPUTE_NODE", "COMPUTE_RESCUE_BFV", "COMPUTE_SAME_HOST_COLD_MIGRATE"}, + }, + } + + testSubcapacityTranslation(t, "nova-flavors", nil, subcapacitiesInLiquidFormat, subcapacitiesInLegacyFormat) +} + func testSubcapacityTranslation(t *testing.T, ruleID string, extraSetup func(s *test.Setup), subcapacitiesInLiquidFormat, subcapacitiesInLegacyFormat []assert.JSONObject) { s := test.NewSetup(t, test.WithDBFixtureFile("fixtures/start-data-small.sql"), @@ -346,6 +390,106 @@ func TestTranslateIronicSubresources(t *testing.T) { testSubresourceTranslation(t, "ironic-flavors", extraSetup, subresourcesInLiquidFormat, subresourcesInLegacyFormat) } +func TestTranslateNovaSubresources(t *testing.T) { + subresourcesInLiquidFormat := []assert.JSONObject{ + { + "id": "c655dfeb-18fa-479d-b0bf-36cd63c2e901", + "name": "d042639-test-server", + "attributes": assert.JSONObject{ + "status": "ACTIVE", + "metadata": assert.JSONObject{ + "image_buildnumber": "", + "image_name": "SAP-compliant-ubuntu-24-04", + }, + "tags": []string{}, + "availability_zone": "az-one", + "flavor": assert.JSONObject{ + "name": "g_c1_m2_v2", + "vcpu": 1, + "ram_mib": 2032, + "disk_gib": 64, + "video_ram_mib": 16, + }, + "os_type": "image-deleted", + }, + }, + { + "id": "7cd0f695-75b5-4514-82a2-953237e4c7d6", + "name": "nsxt-e2e-test-vm-1", + "attributes": assert.JSONObject{ + "status": "ACTIVE", + "metadata": assert.JSONObject{}, + "tags": []string{}, + "availability_zone": "az-one", + "flavor": assert.JSONObject{ + "name": "g_c8_m16", + "vcpu": 8, + "ram_mib": 16368, + "disk_gib": 64, + "video_ram_mib": 16, + }, + "os_type": "image-deleted", + }, + }, + } + + subresourcesInLegacyFormat := []assert.JSONObject{ + { + "id": "c655dfeb-18fa-479d-b0bf-36cd63c2e901", + "name": 
"d042639-test-server", + "status": "ACTIVE", + "metadata": assert.JSONObject{ + "image_buildnumber": "", + "image_name": "SAP-compliant-ubuntu-24-04", + }, + "tags": []string{}, + "availability_zone": "az-one", + "hypervisor": "vmware", + "flavor": "g_c1_m2_v2", + "vcpu": 1, + "ram": assert.JSONObject{ + "value": 2032, + "unit": "MiB", + }, + "disk": assert.JSONObject{ + "value": 64, + "unit": "GiB", + }, + "video_ram": assert.JSONObject{ + "value": 16, + "unit": "MiB", + }, + "os_type": "image-deleted", + }, + { + "id": "7cd0f695-75b5-4514-82a2-953237e4c7d6", + "name": "nsxt-e2e-test-vm-1", + "status": "ACTIVE", + "metadata": assert.JSONObject{}, + "tags": []string{}, + "availability_zone": "az-one", + "hypervisor": "vmware", + "flavor": "g_c8_m16", + "vcpu": 8, + "ram": assert.JSONObject{ + "value": 16368, + "unit": "MiB", + }, + "disk": assert.JSONObject{ + "value": 64, + "unit": "GiB", + }, + "video_ram": assert.JSONObject{ + "value": 16, + "unit": "MiB", + }, + "os_type": "image-deleted", + }, + } + + testSubresourceTranslation(t, "nova-flavors", nil, subresourcesInLiquidFormat, subresourcesInLegacyFormat) +} + func testSubresourceTranslation(t *testing.T, ruleID string, extraSetup func(s *test.Setup), subresourcesInLiquidFormat, subresourcesInLegacyFormat []assert.JSONObject) { s := test.NewSetup(t, test.WithDBFixtureFile("fixtures/start-data-small.sql"), diff --git a/internal/core/translation_rule.go b/internal/core/translation_rule.go index 2aa836a3..975ed4e9 100644 --- a/internal/core/translation_rule.go +++ b/internal/core/translation_rule.go @@ -54,6 +54,8 @@ func NewTranslationRule(id string) (TranslationRule, error) { return TranslationRule{translateCinderOrManilaSubcapacities, nil}, nil case "ironic-flavors": return TranslationRule{translateIronicSubcapacities, translateIronicSubresources}, nil + case "nova-flavors": + return TranslationRule{translateNovaSubcapacities, translateNovaSubresources}, nil default: return TranslationRule{}, fmt.Errorf("no such TranslationRule: %q", id) } @@ -276,10 +278,6 @@ func translateIronicSubresources(input string, az limes.AvailabilityZone, resNam return input, nil } - if input == "" || input == "[]" { - return input, nil - } - var resAttrs struct { Cores uint64 `json:"cores"` MemoryMiB uint64 `json:"ram_mib"` @@ -341,3 +339,118 @@ func translateIronicSubresources(input string, az limes.AvailabilityZone, resNam buf, err := json.Marshal(outputs) return string(buf), err } + +func translateNovaSubcapacities(input string, az limes.AvailabilityZone, _ liquid.ResourceName, resInfo liquid.ResourceInfo) (string, error) { + if input == "" || input == "[]" { + return input, nil + } + + type newFormat struct { + ID string `json:"id"` + Name string `json:"name"` + Capacity uint64 `json:"capacity"` + Usage *uint64 `json:"usage"` + Attributes struct { + AggregateName string `json:"aggregate_name"` + Traits []string `json:"traits"` + } `json:"attributes"` + } + + var inputs []newFormat + err := json.Unmarshal([]byte(input), &inputs) + if err != nil { + return "", err + } + + type oldFormat struct { + ServiceHost string `json:"service_host"` + AvailabilityZone limes.AvailabilityZone `json:"az"` + AggregateName string `json:"aggregate"` + Capacity *uint64 `json:"capacity,omitempty"` + Usage *uint64 `json:"usage,omitempty"` + Traits []string `json:"traits"` + } + outputs := make([]oldFormat, len(inputs)) + for idx, in := range inputs { + out := oldFormat{ + ServiceHost: in.Name, + AvailabilityZone: az, + AggregateName: in.Attributes.AggregateName, + Capacity: 
&in.Capacity, + Usage: in.Usage, + Traits: in.Attributes.Traits, + } + outputs[idx] = out + } + buf, err := json.Marshal(outputs) + return string(buf), err +} + +func translateNovaSubresources(input string, az limes.AvailabilityZone, resName liquid.ResourceName, resInfo liquid.ResourceInfo) (string, error) { + if input == "" || input == "[]" { + return input, nil + } + + type newFormat struct { + ID string `json:"id"` + Name string `json:"name"` + Attributes struct { + Status string `json:"status"` + Metadata map[string]string `json:"metadata"` + Tags []string `json:"tags"` + AZ liquid.AvailabilityZone `json:"availability_zone"` + Flavor struct { + Name string `json:"name"` + VCPUs uint64 `json:"vcpu"` + MemoryMiB uint64 `json:"ram_mib"` + DiskGiB uint64 `json:"disk_gib"` + VideoMemoryMiB *uint64 `json:"video_ram_mib"` + } `json:"flavor"` + OSType string `json:"os_type"` + } + } + var inputs []newFormat + err := json.Unmarshal([]byte(input), &inputs) + if err != nil { + return "", err + } + + type oldFormat struct { + ID string `json:"id"` + Name string `json:"name"` + Status string `json:"status"` + Metadata map[string]string `json:"metadata"` + Tags []string `json:"tags"` + AZ limes.AvailabilityZone `json:"availability_zone"` + HypervisorType string `json:"hypervisor,omitempty"` + FlavorName string `json:"flavor"` + VCPUs uint64 `json:"vcpu"` + MemoryMiB limes.ValueWithUnit `json:"ram"` + DiskGiB limes.ValueWithUnit `json:"disk"` + VideoMemoryMiB *limes.ValueWithUnit `json:"video_ram,omitempty"` + OSType string `json:"os_type"` + } + outputs := make([]oldFormat, len(inputs)) + for idx, in := range inputs { + out := oldFormat{ + ID: in.ID, + Name: in.Name, + Status: in.Attributes.Status, + Metadata: in.Attributes.Metadata, + Tags: in.Attributes.Tags, + AZ: in.Attributes.AZ, + HypervisorType: "vmware", + FlavorName: in.Attributes.Flavor.Name, + VCPUs: in.Attributes.Flavor.VCPUs, + MemoryMiB: limes.ValueWithUnit{Unit: limes.UnitMebibytes, Value: in.Attributes.Flavor.MemoryMiB}, + DiskGiB: limes.ValueWithUnit{Unit: limes.UnitGibibytes, Value: in.Attributes.Flavor.DiskGiB}, + OSType: in.Attributes.OSType, + } + if in.Attributes.Flavor.VideoMemoryMiB != nil { + out.VideoMemoryMiB = &limes.ValueWithUnit{Unit: limes.UnitMebibytes, Value: *in.Attributes.Flavor.VideoMemoryMiB} + } + outputs[idx] = out + } + buf, err := json.Marshal(outputs) + return string(buf), err +} diff --git a/internal/liquids/nova/binpack_simulation.go b/internal/liquids/nova/binpack_simulation.go index ea53b1b1..aef36b97 100644 --- a/internal/liquids/nova/binpack_simulation.go +++ b/internal/liquids/nova/binpack_simulation.go @@ -26,20 +26,20 @@ import ( "github.com/gophercloud/gophercloud/v2/openstack/compute/v2/flavors" "github.com/gophercloud/gophercloud/v2/openstack/placement/v1/resourceproviders" - "github.com/sapcc/go-api-declarations/limes" "github.com/sapcc/go-api-declarations/liquid" "github.com/sapcc/go-bits/liquidapi" "github.com/sapcc/go-bits/logg" ) +// TODO: Remove yaml tags when switching to liquid-nova // BinpackBehavior contains configuration parameters for the binpack simulation. type BinpackBehavior struct { // When ranking nodes during placement, do not include the VCPU count dimension in the score. - ScoreIgnoresCores bool `yaml:"score_ignores_cores"` + ScoreIgnoresCores bool `yaml:"score_ignores_cores" json:"score_ignores_cores"` // When ranking nodes during placement, do not include the disk size dimension in the score. 
- ScoreIgnoresDisk bool `yaml:"score_ignores_disk"` + ScoreIgnoresDisk bool `yaml:"score_ignores_disk" json:"score_ignores_disk"` // When ranking nodes during placement, do not include the RAM size dimension in the score. - ScoreIgnoresRAM bool `yaml:"score_ignores_ram"` + ScoreIgnoresRAM bool `yaml:"score_ignores_ram" json:"score_ignores_ram"` } // BinpackHypervisor models an entire Nova hypervisor for the purposes of the @@ -145,7 +145,7 @@ func PrepareHypervisorForBinpacking(h MatchingHypervisor) (BinpackHypervisor, er } // RenderDebugView prints an overview of the placements in this hypervisor on several logg.Debug lines. -func (h BinpackHypervisor) RenderDebugView(az limes.AvailabilityZone) { +func (h BinpackHypervisor) RenderDebugView(az liquid.AvailabilityZone) { shortID := h.Match.Hypervisor.Service.Host logg.Debug("[%s][%s] %s", az, shortID, h.Match.Hypervisor.Description()) for idx, n := range h.Nodes { @@ -159,9 +159,9 @@ func (h BinpackHypervisor) RenderDebugView(az limes.AvailabilityZone) { } // PlaceSeveralInstances calls PlaceOneInstance multiple times. -func (hh BinpackHypervisors) PlaceSeveralInstances(f flavors.Flavor, reason string, coresOvercommitFactor liquid.OvercommitFactor, blockedCapacity BinpackVector[uint64], bb BinpackBehavior, count uint64) (ok bool) { +func (hh BinpackHypervisors) PlaceSeveralInstances(f flavors.Flavor, reason string, coresOvercommitFactor liquid.OvercommitFactor, blockedCapacity BinpackVector[uint64], bb BinpackBehavior, skipTraitMatch bool, count uint64) (ok bool) { for range count { - ok = hh.PlaceOneInstance(f, reason, coresOvercommitFactor, blockedCapacity, bb) + ok = hh.PlaceOneInstance(f, reason, coresOvercommitFactor, blockedCapacity, bb, skipTraitMatch) if !ok { // if we don't have space for this instance, we won't have space for any following ones return false @@ -172,7 +172,7 @@ func (hh BinpackHypervisors) PlaceSeveralInstances(f flavors.Flavor, reason stri // PlaceOneInstance places a single instance of the given flavor using the vector-dot binpacking algorithm. // If the instance cannot be placed, false is returned. -func (hh BinpackHypervisors) PlaceOneInstance(flavor flavors.Flavor, reason string, coresOvercommitFactor liquid.OvercommitFactor, blockedCapacity BinpackVector[uint64], bb BinpackBehavior) (ok bool) { +func (hh BinpackHypervisors) PlaceOneInstance(flavor flavors.Flavor, reason string, coresOvercommitFactor liquid.OvercommitFactor, blockedCapacity BinpackVector[uint64], bb BinpackBehavior, skipTraitMatch bool) (ok bool) { // This function implements the vector dot binpacking method described in [Mayank] (section III, // subsection D, including the correction presented in the last paragraph of that subsection). 
// @@ -212,7 +212,7 @@ func (hh BinpackHypervisors) PlaceOneInstance(flavor flavors.Flavor, reason stri ) for _, hypervisor := range hh { // skip hypervisors that the flavor does not accept - if !FlavorMatchesHypervisor(flavor, hypervisor.Match) { + if !skipTraitMatch && !FlavorMatchesHypervisor(flavor, hypervisor.Match) { continue } diff --git a/internal/liquids/nova/capacity.go b/internal/liquids/nova/capacity.go index 89dc4698..f704af64 100644 --- a/internal/liquids/nova/capacity.go +++ b/internal/liquids/nova/capacity.go @@ -20,9 +20,20 @@ package nova import ( + "context" + "errors" "fmt" + "slices" + "strings" + + "github.com/gophercloud/gophercloud/v2/openstack/compute/v2/flavors" + "github.com/gophercloud/gophercloud/v2/openstack/compute/v2/servers" + "github.com/sapcc/go-api-declarations/liquid" + "github.com/sapcc/go-bits/liquidapi" + "github.com/sapcc/go-bits/logg" "github.com/sapcc/limes/internal/core" + "github.com/sapcc/limes/internal/liquids" ) // PartialCapacity describes compute capacity at a level below the entire @@ -66,7 +77,8 @@ func (c PartialCapacity) CappedToUsage() PartialCapacity { } } -func (c PartialCapacity) IntoCapacityData(resourceName string, maxRootDiskSize float64, subcapacities []any) core.CapacityData { +// TODO: Remove when switching to liquid-nova +func (c PartialCapacity) DeprecatedIntoCapacityData(resourceName string, maxRootDiskSize float64, subcapacities []any) core.CapacityData { switch resourceName { case "cores": return core.CapacityData{ @@ -98,6 +110,37 @@ func (c PartialCapacity) IntoCapacityData(resourceName string, maxRootDiskSize f } } +// TODO: Remove nolint:dupl when switching to liquid-nova +func (c PartialCapacity) IntoCapacityData(resourceName string, maxRootDiskSize float64, subcapacities []liquid.Subcapacity) liquid.AZResourceCapacityReport { + switch resourceName { + case "cores": + return liquid.AZResourceCapacityReport{ + Capacity: c.VCPUs.Capacity, + Usage: &c.VCPUs.Usage, + Subcapacities: subcapacities, + } + case "ram": + return liquid.AZResourceCapacityReport{ + Capacity: c.MemoryMB.Capacity, + Usage: &c.MemoryMB.Usage, + Subcapacities: subcapacities, + } + case "instances": + amount := 10000 * uint64(len(c.MatchingAggregates)) + if maxRootDiskSize != 0 { + maxAmount := uint64(float64(c.LocalGB.Capacity) / maxRootDiskSize) + amount = min(amount, maxAmount) + } + return liquid.AZResourceCapacityReport{ + Capacity: amount, + Usage: &c.RunningVMs, + Subcapacities: subcapacities, + } + default: + panic(fmt.Sprintf("called with unknown resourceName %q", resourceName)) + } +} + // PartialCapacityMetric appears in type PartialCapacity. type PartialCapacityMetric struct { Capacity uint64 @@ -110,3 +153,382 @@ func (m PartialCapacityMetric) CappedToUsage() PartialCapacityMetric { Usage: m.Usage, } } + +// ScanCapacity implements the liquidapi.Logic interface. 
+func (l *Logic) ScanCapacity(ctx context.Context, req liquid.ServiceCapacityRequest, serviceInfo liquid.ServiceInfo) (liquid.ServiceCapacityReport, error) { + // enumerate matching flavors, divide into split and pooled flavors; + // ("split flavors" are those with separate instance quota, as opposed to + // "pooled flavors" that share a common pool of CPU/instances/RAM capacity) + // + // also, for the pooled instances capacity, we need to know the max root disk size on public pooled flavors + var ( + splitFlavors []flavors.Flavor + maxRootDiskSize = uint64(0) + ) + pooledExtraSpecs := make(map[string]string) + err := l.FlavorSelection.ForeachFlavor(ctx, l.NovaV2, func(f flavors.Flavor) error { + switch { + case IsIronicFlavor(f): + // ignore Ironic flavors + case IsSplitFlavor(f): + splitFlavors = append(splitFlavors, f) + case f.IsPublic: + // require that all pooled flavors agree on the same trait-match extra specs + for spec, val := range f.ExtraSpecs { + trait, matches := strings.CutPrefix(spec, "trait:") + if !matches || slices.Contains(l.IgnoredTraits, trait) { + continue + } + if pooledVal, exists := pooledExtraSpecs[spec]; !exists { + pooledExtraSpecs[spec] = val + } else if val != pooledVal { + return fmt.Errorf("conflict: pooled flavors both require extra spec %s values %s and %s", spec, val, pooledVal) + } + } + // only public flavor contribute to the `maxRootDiskSize` calculation (in + // the wild, we have seen non-public legacy flavors with wildly large + // disk sizes that would throw off all estimates derived from this number) + maxRootDiskSize = max(maxRootDiskSize, liquidapi.AtLeastZero(f.Disk)) + } + return nil + }) + if err != nil { + return liquid.ServiceCapacityReport{}, err + } + if maxRootDiskSize == 0 { + return liquid.ServiceCapacityReport{}, errors.New("pooled capacity requested, but there are no matching flavors") + } + logg.Debug("max root disk size = %d GiB", maxRootDiskSize) + + // collect all relevant resource demands + coresDemand := req.DemandByResource["cores"] + instancesDemand := req.DemandByResource["instances"] + ramDemand := req.DemandByResource["ram"] + + logg.Debug("pooled cores demand: %#v (overcommit factor = %g)", coresDemand.PerAZ, coresDemand.OvercommitFactor) + logg.Debug("pooled instances demand: %#v", instancesDemand.PerAZ) + logg.Debug("pooled RAM demand: %#v", ramDemand.PerAZ) + + demandByFlavorName := make(map[string]liquid.ResourceDemand) + for _, f := range splitFlavors { + resourceName := ResourceNameForFlavor(f.Name) + demand := req.DemandByResource[resourceName] + if demand.OvercommitFactor != 1 && demand.OvercommitFactor != 0 { + return liquid.ServiceCapacityReport{}, fmt.Errorf("overcommit on compute/%s is not supported", resourceName) + } + demandByFlavorName[f.Name] = demand + } + logg.Debug("binpackable flavors: %#v", splitFlavors) + logg.Debug("demand for binpackable flavors: %#v", demandByFlavorName) + + // enumerate matching hypervisors, prepare data structures for binpacking + hypervisorsByAZ := make(map[liquid.AvailabilityZone]BinpackHypervisors) + shadowedHypervisorsByAZ := make(map[liquid.AvailabilityZone][]MatchingHypervisor) + isShadowedHVHostname := make(map[string]bool) + err = l.HypervisorSelection.ForeachHypervisor(ctx, l.NovaV2, l.PlacementV1, func(h MatchingHypervisor) error { + // ignore HVs that are not associated with an aggregate and AZ + if !h.CheckTopology() { + return nil + } + + if h.ShadowedByTrait == "" { + bh, err := PrepareHypervisorForBinpacking(h) + if err != nil { + return err + } + 
hypervisorsByAZ[h.AvailabilityZone] = append(hypervisorsByAZ[h.AvailabilityZone], bh) + + hc := h.PartialCapacity() + logg.Debug("%s in %s reports %s capacity, %s used, %d nodes, %s max unit", h.Hypervisor.Description(), h.AvailabilityZone, + BinpackVector[uint64]{VCPUs: hc.VCPUs.Capacity, MemoryMB: hc.MemoryMB.Capacity, LocalGB: hc.LocalGB.Capacity}, + BinpackVector[uint64]{VCPUs: hc.VCPUs.Usage, MemoryMB: hc.MemoryMB.Usage, LocalGB: hc.LocalGB.Usage}, + len(bh.Nodes), bh.Nodes[0].Capacity, + ) + } else { + shadowedHypervisorsByAZ[h.AvailabilityZone] = append(shadowedHypervisorsByAZ[h.AvailabilityZone], h) + isShadowedHVHostname[h.Hypervisor.HypervisorHostname] = true + logg.Debug("%s in %s is shadowed by trait %s", h.Hypervisor.Description(), h.AvailabilityZone, h.ShadowedByTrait) + } + + return nil + }) + if err != nil { + return liquid.ServiceCapacityReport{}, err + } + + // during binpacking, place instances of large flavors first to achieve optimal results + slices.SortFunc(splitFlavors, func(lhs, rhs flavors.Flavor) int { + //NOTE: this returns `rhs-lhs` instead of `lhs-rhs` to achieve descending order + if lhs.VCPUs != rhs.VCPUs { + return rhs.VCPUs - lhs.VCPUs + } + if lhs.RAM != rhs.RAM { + return rhs.RAM - lhs.RAM + } + return rhs.Disk - lhs.Disk + }) + + // if Nova can tell us where existing instances are running, we prefer this + // information since it will make our simulation more accurate + instancesPlacedOnShadowedHypervisors := make(map[string]map[liquid.AvailabilityZone]uint64) // first key is flavor name + bb := l.BinpackBehavior + for _, flavor := range splitFlavors { + shadowedForThisFlavor := make(map[liquid.AvailabilityZone]uint64) + + // list all servers for this flavor, parsing only placement information from the result + listOpts := servers.ListOpts{ + Flavor: flavor.ID, + AllTenants: true, + } + allPages, err := servers.List(l.NovaV2, listOpts).AllPages(ctx) + if err != nil { + return liquid.ServiceCapacityReport{}, fmt.Errorf("while listing active instances for flavor %s: %w", flavor.Name, err) + } + var instances []struct { + ID string `json:"id"` + AZ string `json:"OS-EXT-AZ:availability_zone"` + HypervisorHostname string `json:"OS-EXT-SRV-ATTR:hypervisor_hostname"` + } + err = servers.ExtractServersInto(allPages, &instances) + if err != nil { + return liquid.ServiceCapacityReport{}, fmt.Errorf("while listing active instances for flavor %s: %w", flavor.Name, err) + } + + for _, instance := range instances { + az := liquid.NormalizeAZ(instance.AZ, req.AllAZs) + + // If we are absolutely sure that this instance is placed on a shadowed hypervisor, + // we remember this and have the final capacity take those into account without + // including them in the binpacking simulation. + if isShadowedHVHostname[instance.HypervisorHostname] { + shadowedForThisFlavor[az]++ + continue + } + + // If the instance is placed on a known hypervisor, place it right now. + // The number of instances thus placed will be skipped below to avoid double counting. 
+ for _, hv := range hypervisorsByAZ[az] { + if hv.Match.Hypervisor.HypervisorHostname == instance.HypervisorHostname { + var zero BinpackVector[uint64] + placed := BinpackHypervisors{hv}.PlaceOneInstance(flavor, "USED", coresDemand.OvercommitFactor, zero, bb, true) + if !placed { + logg.Debug("could not simulate placement of known instance %s on %s", instance.ID, hv.Match.Hypervisor.Description()) + } + break + } + } + } + + if len(shadowedForThisFlavor) > 0 { + instancesPlacedOnShadowedHypervisors[flavor.Name] = shadowedForThisFlavor + } + } + logg.Debug("instances for split flavors placed on shadowed hypervisors: %v", instancesPlacedOnShadowedHypervisors) + + // foreach AZ, place demanded split instances in order of priority, unless + // blocked by pooled instances of equal or higher priority + for az, hypervisors := range hypervisorsByAZ { + canPlaceFlavor := make(map[string]bool) + for _, flavor := range splitFlavors { + canPlaceFlavor[flavor.Name] = true + } + + // phase 1: block existing usage + blockedCapacity := BinpackVector[uint64]{ + VCPUs: coresDemand.OvercommitFactor.ApplyInReverseTo(coresDemand.PerAZ[az].Usage), + MemoryMB: ramDemand.PerAZ[az].Usage, + LocalGB: instancesDemand.PerAZ[az].Usage * maxRootDiskSize, + } + logg.Debug("[%s] blockedCapacity in phase 1: %s", az, blockedCapacity.String()) + for _, flavor := range splitFlavors { + // do not place instances that have already been placed in the simulation, + // as well as instances that run on hypervisors that do not participate in the binpacking simulation + placedUsage := hypervisors.PlacementCountForFlavor(flavor.Name) + shadowedUsage := instancesPlacedOnShadowedHypervisors[flavor.Name][az] + unplacedUsage := liquidapi.SaturatingSub(demandByFlavorName[flavor.Name].PerAZ[az].Usage, placedUsage+shadowedUsage) + if !hypervisors.PlaceSeveralInstances(flavor, "used", coresDemand.OvercommitFactor, blockedCapacity, bb, false, unplacedUsage) { + canPlaceFlavor[flavor.Name] = false + } + } + + // phase 2: block confirmed, but unused commitments + blockedCapacity.VCPUs += coresDemand.OvercommitFactor.ApplyInReverseTo(coresDemand.PerAZ[az].UnusedCommitments) + blockedCapacity.MemoryMB += ramDemand.PerAZ[az].UnusedCommitments + blockedCapacity.LocalGB += instancesDemand.PerAZ[az].UnusedCommitments * maxRootDiskSize + logg.Debug("[%s] blockedCapacity in phase 2: %s", az, blockedCapacity.String()) + for _, flavor := range splitFlavors { + if !hypervisors.PlaceSeveralInstances(flavor, "committed", coresDemand.OvercommitFactor, blockedCapacity, bb, false, demandByFlavorName[flavor.Name].PerAZ[az].UnusedCommitments) { + canPlaceFlavor[flavor.Name] = false + } + } + + // phase 3: block pending commitments + blockedCapacity.VCPUs += coresDemand.OvercommitFactor.ApplyInReverseTo(coresDemand.PerAZ[az].PendingCommitments) + blockedCapacity.MemoryMB += ramDemand.PerAZ[az].PendingCommitments + blockedCapacity.LocalGB += instancesDemand.PerAZ[az].PendingCommitments * maxRootDiskSize + logg.Debug("[%s] blockedCapacity in phase 3: %s", az, blockedCapacity.String()) + for _, flavor := range splitFlavors { + if !hypervisors.PlaceSeveralInstances(flavor, "pending", coresDemand.OvercommitFactor, blockedCapacity, bb, false, demandByFlavorName[flavor.Name].PerAZ[az].PendingCommitments) { + canPlaceFlavor[flavor.Name] = false + } + } + + // check how many instances we could place until now + initiallyPlacedInstances := make(map[string]float64) + sumInitiallyPlacedInstances := uint64(0) + totalPlacedInstances := make(map[string]float64) // these two 
will diverge in the final round of placements + var splitFlavorsUsage BinpackVector[uint64] + for _, flavor := range splitFlavors { + count := hypervisors.PlacementCountForFlavor(flavor.Name) + initiallyPlacedInstances[flavor.Name] = max(float64(count), 0.1) + sumInitiallyPlacedInstances += count + totalPlacedInstances[flavor.Name] = float64(count) + // The max(..., 0.1) is explained below. + + splitFlavorsUsage.VCPUs += coresDemand.OvercommitFactor.ApplyInReverseTo(count * liquidapi.AtLeastZero(flavor.VCPUs)) + splitFlavorsUsage.MemoryMB += count * liquidapi.AtLeastZero(flavor.RAM) + splitFlavorsUsage.LocalGB += count * liquidapi.AtLeastZero(flavor.Disk) + } + + // for the upcoming final fill, we want to block capacity in such a way that + // the reported capacity is fairly divided between pooled and split flavors, + // in a way that matches the existing usage distribution, that is: + // + // capacity blocked for pooled flavors = capacity * (pooled usage / total usage) + // ------------ + // ^ this is in blockedCapacity + // + totalUsageUntilNow := blockedCapacity.Add(splitFlavorsUsage) + if !totalUsageUntilNow.IsAnyZero() { + // we can only do this if .Div() does not cause a divide-by-zero, otherwise we continue with blockedCapacity = 0 + blockedCapacity = hypervisors.TotalCapacity().AsFloat().Mul(blockedCapacity.Div(totalUsageUntilNow)).AsUint() + } + logg.Debug("[%s] usage by split flavors after phase 3: %s", az, splitFlavorsUsage.String()) + logg.Debug("[%s] blockedCapacity in final fill: %s (totalCapacity = %s)", az, blockedCapacity.String(), hypervisors.TotalCapacity().String()) + + // Fill up with padding in a fair way as long as there is space left. + // This uses the Sainte-Laguë method designed for allocation of parliament + // seats. In this case, the parties are the flavors, the votes are what we + // allocated based on demand (`initiallyPlacedInstances`), and the seats are + // the placements (`totalPlacedInstances`). + for { + var ( + bestFlavor *flavors.Flavor + bestScore = -1.0 + ) + for _, flavor := range splitFlavors { + if !canPlaceFlavor[flavor.Name] { + continue + } + score := (initiallyPlacedInstances[flavor.Name]) / (2*totalPlacedInstances[flavor.Name] + 1) + // ^ This is why we adjusted all initiallyPlacedInstances[flavor.Name] = 0 to 0.1 + // above. If the nominator of this fraction is 0 for multiple flavors, the first + // (biggest) flavor always wins unfairly. By adjusting to slightly away from zero, + // the scoring is more fair and stable. 
+ if score > bestScore {
+ bestScore = score
+ flavor := flavor
+ bestFlavor = &flavor
+ }
+ }
+ if bestFlavor == nil {
+ // no flavor left that can be placed -> stop
+ break
+ } else {
+ if hypervisors.PlaceOneInstance(*bestFlavor, "padding", coresDemand.OvercommitFactor, blockedCapacity, bb, false) {
+ totalPlacedInstances[bestFlavor.Name]++
+ } else {
+ canPlaceFlavor[bestFlavor.Name] = false
+ }
+ }
+ }
+ } ////////// end of placement
+
+ // debug visualization of the binpack placement result
+ if logg.ShowDebug {
+ for az, hypervisors := range hypervisorsByAZ {
+ for _, hypervisor := range hypervisors {
+ hypervisor.RenderDebugView(az)
+ }
+ }
+ }
+
+ // compile result for pooled resources
+ capacities := make(map[liquid.ResourceName]*liquid.ResourceCapacityReport, len(splitFlavors)+3)
+ capacities["cores"] = &liquid.ResourceCapacityReport{
+ PerAZ: make(map[liquid.AvailabilityZone]*liquid.AZResourceCapacityReport),
+ }
+ capacities["instances"] = &liquid.ResourceCapacityReport{
+ PerAZ: make(map[liquid.AvailabilityZone]*liquid.AZResourceCapacityReport),
+ }
+ capacities["ram"] = &liquid.ResourceCapacityReport{
+ PerAZ: make(map[liquid.AvailabilityZone]*liquid.AZResourceCapacityReport),
+ }
+
+ for az, hypervisors := range hypervisorsByAZ {
+ var (
+ azCapacity PartialCapacity
+ builder PooledSubcapacityBuilder
+ )
+ for _, h := range hypervisors {
+ azCapacity.Add(h.Match.PartialCapacity())
+ if l.WithSubcapacities {
+ err = builder.AddHypervisor(h.Match, float64(maxRootDiskSize))
+ if err != nil {
+ return liquid.ServiceCapacityReport{}, fmt.Errorf("could not add hypervisor as subcapacity: %w", err)
+ }
+ }
+ }
+ for _, h := range shadowedHypervisorsByAZ[az] {
+ azCapacity.Add(h.PartialCapacity().CappedToUsage())
+ if l.WithSubcapacities {
+ err = builder.AddHypervisor(h, float64(maxRootDiskSize))
+ if err != nil {
+ return liquid.ServiceCapacityReport{}, fmt.Errorf("could not add hypervisor as subcapacity: %w", err)
+ }
+ }
+ }
+
+ capacities["cores"].PerAZ[az] = liquids.PointerTo(azCapacity.IntoCapacityData("cores", float64(maxRootDiskSize), builder.CoresSubcapacities))
+ capacities["instances"].PerAZ[az] = liquids.PointerTo(azCapacity.IntoCapacityData("instances", float64(maxRootDiskSize), builder.InstancesSubcapacities))
+ capacities["ram"].PerAZ[az] = liquids.PointerTo(azCapacity.IntoCapacityData("ram", float64(maxRootDiskSize), builder.RAMSubcapacities))
+ for _, flavor := range splitFlavors {
+ count := hypervisors.PlacementCountForFlavor(flavor.Name)
+ capacities["cores"].PerAZ[az].Capacity -= coresDemand.OvercommitFactor.ApplyInReverseTo(count * liquidapi.AtLeastZero(flavor.VCPUs))
+ capacities["instances"].PerAZ[az].Capacity -= count //TODO: not accurate when uint64(flavor.Disk) != maxRootDiskSize
+ capacities["ram"].PerAZ[az].Capacity -= count * liquidapi.AtLeastZero(flavor.RAM)
+ }
+ }
+
+ // compile result for split flavors
+ for _, flavor := range splitFlavors {
+ resourceName := ResourceNameForFlavor(flavor.Name)
+ capacities[resourceName] = &liquid.ResourceCapacityReport{
+ PerAZ: make(map[liquid.AvailabilityZone]*liquid.AZResourceCapacityReport),
+ }
+
+ for az, hypervisors := range hypervisorsByAZ {
+ capacities[resourceName].PerAZ[az] = &liquid.AZResourceCapacityReport{
+ Capacity: hypervisors.PlacementCountForFlavor(flavor.Name),
+ }
+ }
+
+ // if shadowed hypervisors are still carrying instances of this flavor,
+ // increase the capacity accordingly to more accurately represent the
+ // free capacity on the unshadowed hypervisors
+ for az, shadowedCount := range 
instancesPlacedOnShadowedHypervisors[flavor.Name] { + if capacities[resourceName].PerAZ[az] == nil { + capacities[resourceName].PerAZ[az] = &liquid.AZResourceCapacityReport{ + Capacity: shadowedCount, + } + } else { + capacities[resourceName].PerAZ[az].Capacity += shadowedCount + } + } + } + + return liquid.ServiceCapacityReport{ + InfoVersion: serviceInfo.Version, + Resources: capacities, + }, nil +} diff --git a/internal/liquids/nova/flavor_selection.go b/internal/liquids/nova/flavor_selection.go index 1b90f3a4..e917cde9 100644 --- a/internal/liquids/nova/flavor_selection.go +++ b/internal/liquids/nova/flavor_selection.go @@ -30,6 +30,7 @@ import ( "github.com/sapcc/go-api-declarations/liquid" ) +// TODO: Remove yaml tags when switching to liquid-nova // FlavorSelection describes a set of public flavors. // // This is used for matching flavors that we enumerate via the flavor API @@ -37,9 +38,9 @@ import ( // name, type FlavorNameSelection is used. type FlavorSelection struct { // Only match flavors that have all of these extra specs. - RequiredExtraSpecs map[string]string `yaml:"required_extra_specs"` + RequiredExtraSpecs map[string]string `yaml:"required_extra_specs" json:"required_extra_specs"` // Exclude flavors that have any of these extra specs. - ExcludedExtraSpecs map[string]string `yaml:"excluded_extra_specs"` + ExcludedExtraSpecs map[string]string `yaml:"excluded_extra_specs" json:"excluded_extra_specs"` } func (s FlavorSelection) matchesExtraSpecs(specs map[string]string) bool { diff --git a/internal/liquids/nova/hypervisor_selection.go b/internal/liquids/nova/hypervisor_selection.go index 981829d7..389ba377 100644 --- a/internal/liquids/nova/hypervisor_selection.go +++ b/internal/liquids/nova/hypervisor_selection.go @@ -35,19 +35,20 @@ import ( "github.com/sapcc/go-bits/regexpext" ) +// TODO: Remove yaml tags when switching to liquid-nova // HypervisorSelection describes a set of hypervisors. type HypervisorSelection struct { // Only match hypervisors with a hypervisor_type attribute matching this pattern. - HypervisorTypeRx regexpext.PlainRegexp `yaml:"hypervisor_type_pattern"` + HypervisorTypeRx regexpext.PlainRegexp `yaml:"hypervisor_type_pattern" json:"hypervisor_type_pattern"` // Only match hypervisors that have any of these traits. // Trait names can include a `!` prefix to invert the match. - RequiredTraits []string `yaml:"required_traits"` + RequiredTraits []string `yaml:"required_traits" json:"required_traits"` // Set the MatchingHypervisor.ShadowedByTrait field on hypervisors that have any of these traits. // Trait names can include a `!` prefix to invert the match. - ShadowingTraits []string `yaml:"shadowing_traits"` + ShadowingTraits []string `yaml:"shadowing_traits" json:"shadowing_traits"` // Only match hypervisors that reside in an aggregate matching this pattern. // If a hypervisor resides in multiple matching aggregates, an error is raised. 
- AggregateNameRx regexpext.PlainRegexp `yaml:"aggregate_name_pattern"` + AggregateNameRx regexpext.PlainRegexp `yaml:"aggregate_name_pattern" json:"aggregate_name_pattern"` } // ForeachHypervisor lists all Nova hypervisors matching this diff --git a/internal/liquids/nova/hypervisor_subcapacity.go b/internal/liquids/nova/hypervisor_subcapacity.go index 26fb3251..5c7b3cf7 100644 --- a/internal/liquids/nova/hypervisor_subcapacity.go +++ b/internal/liquids/nova/hypervisor_subcapacity.go @@ -20,15 +20,19 @@ package nova import ( + "encoding/json" + "fmt" + "github.com/sapcc/go-api-declarations/limes" + "github.com/sapcc/go-api-declarations/liquid" ) -// Subcapacity is the structure for subcapacities reported by the "nova" capacity plugin. +// DeprecatedSubcapacity is the structure for subcapacities reported by the "nova" capacity plugin. // Each subcapacity refers to a single Nova hypervisor. // // This structure can appear on both pooled resources (using the Capacity and Usage fields to report only one dimension at a time), // or on split flavors (using the CapacityVector and UsageVector fields to report all dimensions at once). -type Subcapacity struct { +type DeprecatedSubcapacity struct { ServiceHost string `json:"service_host"` AvailabilityZone limes.AvailabilityZone `json:"az"` AggregateName string `json:"aggregate"` @@ -39,8 +43,9 @@ type Subcapacity struct { Traits []string `json:"traits"` } +// TODO: Remove when switching to liquid-nova // PooledSubcapacityBuilder is used to build subcapacity lists for pooled resources. -type PooledSubcapacityBuilder struct { +type DeprecatedPooledSubcapacityBuilder struct { // These are actually []Subcapacity, but we store them as []any because // that's what goes into type core.CapacityData in the end. CoresSubcapacities []any @@ -48,11 +53,24 @@ type PooledSubcapacityBuilder struct { RAMSubcapacities []any } -func (b *PooledSubcapacityBuilder) AddHypervisor(h MatchingHypervisor, maxRootDiskSize float64) { +// PooledSubcapacityBuilder is used to build subcapacity lists for pooled resources. 
+type PooledSubcapacityBuilder struct { + CoresSubcapacities []liquid.Subcapacity + InstancesSubcapacities []liquid.Subcapacity + RAMSubcapacities []liquid.Subcapacity +} + +type SubcapacityAttributes struct { + AggregateName string `json:"aggregate_name"` + Traits []string `json:"traits"` +} + +// TODO: Remove when switching to liquid-nova +func (b *DeprecatedPooledSubcapacityBuilder) AddHypervisor(h MatchingHypervisor, maxRootDiskSize float64) { pc := h.PartialCapacity() hvCoresCapa := pc.IntoCapacityData("cores", maxRootDiskSize, nil) - b.CoresSubcapacities = append(b.CoresSubcapacities, Subcapacity{ + b.CoresSubcapacities = append(b.CoresSubcapacities, DeprecatedSubcapacity{ ServiceHost: h.Hypervisor.Service.Host, AggregateName: h.AggregateName, AvailabilityZone: h.AvailabilityZone, @@ -61,7 +79,7 @@ func (b *PooledSubcapacityBuilder) AddHypervisor(h MatchingHypervisor, maxRootDi Traits: h.Traits, }) hvInstancesCapa := pc.IntoCapacityData("instances", maxRootDiskSize, nil) - b.InstancesSubcapacities = append(b.InstancesSubcapacities, Subcapacity{ + b.InstancesSubcapacities = append(b.InstancesSubcapacities, DeprecatedSubcapacity{ ServiceHost: h.Hypervisor.Service.Host, AggregateName: h.AggregateName, AvailabilityZone: h.AvailabilityZone, @@ -70,7 +88,7 @@ func (b *PooledSubcapacityBuilder) AddHypervisor(h MatchingHypervisor, maxRootDi Traits: h.Traits, }) hvRAMCapa := pc.IntoCapacityData("ram", maxRootDiskSize, nil) - b.RAMSubcapacities = append(b.RAMSubcapacities, Subcapacity{ + b.RAMSubcapacities = append(b.RAMSubcapacities, DeprecatedSubcapacity{ ServiceHost: h.Hypervisor.Service.Host, AggregateName: h.AggregateName, AvailabilityZone: h.AvailabilityZone, @@ -80,15 +98,53 @@ func (b *PooledSubcapacityBuilder) AddHypervisor(h MatchingHypervisor, maxRootDi }) } +func (b *PooledSubcapacityBuilder) AddHypervisor(h MatchingHypervisor, maxRootDiskSize float64) error { + pc := h.PartialCapacity() + + attrs := SubcapacityAttributes{ + AggregateName: h.AggregateName, + Traits: h.Traits, + } + buf, err := json.Marshal(attrs) + if err != nil { + return fmt.Errorf("while serializing Subcapacity Attributes: %w", err) + } + + hvCoresCapa := pc.IntoCapacityData("cores", maxRootDiskSize, nil) + b.CoresSubcapacities = append(b.CoresSubcapacities, liquid.Subcapacity{ + Name: h.Hypervisor.Service.Host, + Capacity: hvCoresCapa.Capacity, + Usage: hvCoresCapa.Usage, + Attributes: json.RawMessage(buf), + }) + hvInstancesCapa := pc.IntoCapacityData("instances", maxRootDiskSize, nil) + b.InstancesSubcapacities = append(b.InstancesSubcapacities, liquid.Subcapacity{ + Name: h.Hypervisor.Service.Host, + Capacity: hvInstancesCapa.Capacity, + Usage: hvInstancesCapa.Usage, + Attributes: json.RawMessage(buf), + }) + hvRAMCapa := pc.IntoCapacityData("ram", maxRootDiskSize, nil) + b.RAMSubcapacities = append(b.RAMSubcapacities, liquid.Subcapacity{ + Name: h.Hypervisor.Service.Host, + Capacity: hvRAMCapa.Capacity, + Usage: hvRAMCapa.Usage, + Attributes: json.RawMessage(buf), + }) + + return nil +} + +// TODO: Remove when switching to liquid-nova // PooledSubcapacityBuilder is used to build subcapacity lists for split flavors. // These subcapacities are reported on the first flavor in alphabetic order. 
-type SplitFlavorSubcapacityBuilder struct { +type DeprecatedSplitFlavorSubcapacityBuilder struct { Subcapacities []any } -func (b *SplitFlavorSubcapacityBuilder) AddHypervisor(h MatchingHypervisor) { +func (b *DeprecatedSplitFlavorSubcapacityBuilder) AddHypervisor(h MatchingHypervisor) { pc := h.PartialCapacity() - b.Subcapacities = append(b.Subcapacities, Subcapacity{ + b.Subcapacities = append(b.Subcapacities, DeprecatedSubcapacity{ ServiceHost: h.Hypervisor.Service.Host, AggregateName: h.AggregateName, AvailabilityZone: h.AvailabilityZone, diff --git a/internal/liquids/nova/liquid.go b/internal/liquids/nova/liquid.go index 531e5626..d9ebe9c7 100644 --- a/internal/liquids/nova/liquid.go +++ b/internal/liquids/nova/liquid.go @@ -21,7 +21,6 @@ package nova import ( "context" - "errors" "fmt" "regexp" "slices" @@ -38,9 +37,15 @@ import ( type Logic struct { // configuration - WithSubresources bool `json:"with_subresources"` + HypervisorSelection HypervisorSelection `json:"hypervisor_selection"` + FlavorSelection FlavorSelection `json:"flavor_selection"` + WithSubresources bool `json:"with_subresources"` + WithSubcapacities bool `json:"with_subcapacities"` + BinpackBehavior BinpackBehavior `json:"binpack_behavior"` + IgnoredTraits []string `json:"ignored_traits"` // connections NovaV2 *gophercloud.ServiceClient `json:"-"` + PlacementV1 *gophercloud.ServiceClient `json:"-"` OSTypeProber *OSTypeProber `json:"-"` ServerGroupProber *ServerGroupProber `json:"-"` // computed state @@ -56,10 +61,18 @@ func (l *Logic) Init(ctx context.Context, provider *gophercloud.ProviderClient, return err } l.NovaV2.Microversion = "2.61" // to include extra specs in flavors.ListDetail() + + l.PlacementV1, err = openstack.NewPlacementV1(provider, eo) + if err != nil { + return err + } + l.PlacementV1.Microversion = "1.6" // for traits endpoint + cinderV3, err := openstack.NewBlockStorageV3(provider, eo) if err != nil { return err } + glanceV2, err := openstack.NewImageV2(provider, eo) if err != nil { return err @@ -129,19 +142,22 @@ func (l *Logic) BuildServiceInfo(ctx context.Context) (liquid.ServiceInfo, error resources := map[liquid.ResourceName]liquid.ResourceInfo{ "cores": { - Unit: liquid.UnitNone, - HasCapacity: true, - HasQuota: true, + Unit: liquid.UnitNone, + HasCapacity: true, + HasQuota: true, + NeedsResourceDemand: true, }, "instances": { - Unit: liquid.UnitNone, - HasCapacity: true, - HasQuota: true, + Unit: liquid.UnitNone, + HasCapacity: true, + HasQuota: true, + NeedsResourceDemand: true, }, "ram": { - Unit: liquid.UnitMebibytes, - HasCapacity: true, - HasQuota: true, + Unit: liquid.UnitMebibytes, + HasCapacity: true, + HasQuota: true, + NeedsResourceDemand: true, }, "server_groups": { Unit: liquid.UnitNone, @@ -159,9 +175,10 @@ func (l *Logic) BuildServiceInfo(ctx context.Context) (liquid.ServiceInfo, error } if IsSplitFlavor(f) { resources[ResourceNameForFlavor(f.Name)] = liquid.ResourceInfo{ - Unit: liquid.UnitNone, - HasCapacity: true, - HasQuota: true, + Unit: liquid.UnitNone, + HasCapacity: true, + HasQuota: true, + NeedsResourceDemand: true, } } return nil @@ -191,11 +208,6 @@ func (l *Logic) BuildServiceInfo(ctx context.Context) (liquid.ServiceInfo, error }, nil } -// ScanCapacity implements the liquidapi.Logic interface. 
-func (l *Logic) ScanCapacity(ctx context.Context, req liquid.ServiceCapacityRequest, serviceInfo liquid.ServiceInfo) (liquid.ServiceCapacityReport, error) { - return liquid.ServiceCapacityReport{}, errors.New("TODO") -} - // SetQuota implements the liquidapi.Logic interface. func (l *Logic) SetQuota(ctx context.Context, projectUUID string, req liquid.ServiceQuotaRequest, serviceInfo liquid.ServiceInfo) error { opts := make(novaQuotaUpdateOpts, len(serviceInfo.Resources)) diff --git a/internal/liquids/nova/subresources.go b/internal/liquids/nova/subresources.go index 6bad1be9..6b3b720b 100644 --- a/internal/liquids/nova/subresources.go +++ b/internal/liquids/nova/subresources.go @@ -33,9 +33,9 @@ import ( type FlavorAttributes struct { Name string `json:"name"` VCPUs uint64 `json:"vcpu"` - MemoryMiB uint64 `json:"ram"` - DiskGiB uint64 `json:"disk"` - VideoMemoryMiB *uint64 `json:"video_ram,omitempty"` + MemoryMiB uint64 `json:"ram_mib"` + DiskGiB uint64 `json:"disk_gib"` + VideoMemoryMiB *uint64 `json:"video_ram_mib,omitempty"` HWVersion string `json:"-"` // this is only used for sorting the subresource into the right resource } diff --git a/internal/plugins/capacity_nova.go b/internal/plugins/capacity_nova.go index 393046ad..8754ef90 100644 --- a/internal/plugins/capacity_nova.go +++ b/internal/plugins/capacity_nova.go @@ -288,7 +288,7 @@ func (p *capacityNovaPlugin) Scrape(ctx context.Context, backchannel core.Capaci for _, hv := range hypervisorsByAZ[az] { if hv.Match.Hypervisor.HypervisorHostname == instance.HypervisorHostname { var zero nova.BinpackVector[uint64] - placed := nova.BinpackHypervisors{hv}.PlaceOneInstance(flavor, "USED", coresDemand.OvercommitFactor, zero, bb) + placed := nova.BinpackHypervisors{hv}.PlaceOneInstance(flavor, "USED", coresDemand.OvercommitFactor, zero, bb, true) if !placed { logg.Debug("could not simulate placement of known instance %s on %s", instance.ID, hv.Match.Hypervisor.Description()) } @@ -323,7 +323,7 @@ func (p *capacityNovaPlugin) Scrape(ctx context.Context, backchannel core.Capaci placedUsage := hypervisors.PlacementCountForFlavor(flavor.Name) shadowedUsage := instancesPlacedOnShadowedHypervisors[flavor.Name][az] unplacedUsage := saturatingSub(demandByFlavorName[flavor.Name].PerAZ[az].Usage, placedUsage+shadowedUsage) - if !hypervisors.PlaceSeveralInstances(flavor, "used", coresDemand.OvercommitFactor, blockedCapacity, bb, unplacedUsage) { + if !hypervisors.PlaceSeveralInstances(flavor, "used", coresDemand.OvercommitFactor, blockedCapacity, bb, false, unplacedUsage) { canPlaceFlavor[flavor.Name] = false } } @@ -334,7 +334,7 @@ func (p *capacityNovaPlugin) Scrape(ctx context.Context, backchannel core.Capaci blockedCapacity.LocalGB += instancesDemand.PerAZ[az].UnusedCommitments * maxRootDiskSize logg.Debug("[%s] blockedCapacity in phase 2: %s", az, blockedCapacity.String()) for _, flavor := range splitFlavors { - if !hypervisors.PlaceSeveralInstances(flavor, "committed", coresDemand.OvercommitFactor, blockedCapacity, bb, demandByFlavorName[flavor.Name].PerAZ[az].UnusedCommitments) { + if !hypervisors.PlaceSeveralInstances(flavor, "committed", coresDemand.OvercommitFactor, blockedCapacity, bb, false, demandByFlavorName[flavor.Name].PerAZ[az].UnusedCommitments) { canPlaceFlavor[flavor.Name] = false } } @@ -345,7 +345,7 @@ func (p *capacityNovaPlugin) Scrape(ctx context.Context, backchannel core.Capaci blockedCapacity.LocalGB += instancesDemand.PerAZ[az].PendingCommitments * maxRootDiskSize logg.Debug("[%s] blockedCapacity in phase 3: %s", az, 
blockedCapacity.String()) for _, flavor := range splitFlavors { - if !hypervisors.PlaceSeveralInstances(flavor, "pending", coresDemand.OvercommitFactor, blockedCapacity, bb, demandByFlavorName[flavor.Name].PerAZ[az].PendingCommitments) { + if !hypervisors.PlaceSeveralInstances(flavor, "pending", coresDemand.OvercommitFactor, blockedCapacity, bb, false, demandByFlavorName[flavor.Name].PerAZ[az].PendingCommitments) { canPlaceFlavor[flavor.Name] = false } } @@ -415,7 +415,7 @@ func (p *capacityNovaPlugin) Scrape(ctx context.Context, backchannel core.Capaci // no flavor left that can be placed -> stop break } else { - if hypervisors.PlaceOneInstance(*bestFlavor, "padding", coresDemand.OvercommitFactor, blockedCapacity, bb) { + if hypervisors.PlaceOneInstance(*bestFlavor, "padding", coresDemand.OvercommitFactor, blockedCapacity, bb, false) { totalPlacedInstances[bestFlavor.Name]++ } else { canPlaceFlavor[bestFlavor.Name] = false @@ -443,7 +443,7 @@ func (p *capacityNovaPlugin) Scrape(ctx context.Context, backchannel core.Capaci for az, hypervisors := range hypervisorsByAZ { var ( azCapacity nova.PartialCapacity - builder nova.PooledSubcapacityBuilder + builder nova.DeprecatedPooledSubcapacityBuilder ) for _, h := range hypervisors { azCapacity.Add(h.Match.PartialCapacity()) @@ -458,9 +458,9 @@ func (p *capacityNovaPlugin) Scrape(ctx context.Context, backchannel core.Capaci } } - capacities[p.PooledCoresResourceName][az] = pointerTo(azCapacity.IntoCapacityData("cores", float64(maxRootDiskSize), builder.CoresSubcapacities)) - capacities[p.PooledInstancesResourceName][az] = pointerTo(azCapacity.IntoCapacityData("instances", float64(maxRootDiskSize), builder.InstancesSubcapacities)) - capacities[p.PooledRAMResourceName][az] = pointerTo(azCapacity.IntoCapacityData("ram", float64(maxRootDiskSize), builder.RAMSubcapacities)) + capacities[p.PooledCoresResourceName][az] = pointerTo(azCapacity.DeprecatedIntoCapacityData("cores", float64(maxRootDiskSize), builder.CoresSubcapacities)) + capacities[p.PooledInstancesResourceName][az] = pointerTo(azCapacity.DeprecatedIntoCapacityData("instances", float64(maxRootDiskSize), builder.InstancesSubcapacities)) + capacities[p.PooledRAMResourceName][az] = pointerTo(azCapacity.DeprecatedIntoCapacityData("ram", float64(maxRootDiskSize), builder.RAMSubcapacities)) for _, flavor := range splitFlavors { count := hypervisors.PlacementCountForFlavor(flavor.Name) capacities[p.PooledCoresResourceName][az].Capacity -= coresDemand.OvercommitFactor.ApplyInReverseTo(count * liquidapi.AtLeastZero(flavor.VCPUs)) @@ -481,7 +481,7 @@ func (p *capacityNovaPlugin) Scrape(ctx context.Context, backchannel core.Capaci for az, hypervisors := range hypervisorsByAZ { // if we could not report subcapacities on pooled resources, report them on // the first flavor in alphabetic order (this is why we just sorted them) - var builder nova.SplitFlavorSubcapacityBuilder + var builder nova.DeprecatedSplitFlavorSubcapacityBuilder if p.WithSubcapacities && p.PooledCoresResourceName == "" && idx == 0 { for _, h := range hypervisors { builder.AddHypervisor(h.Match)