Bring down networking-related costs #55

Open
clezag opened this issue Jan 23, 2024 · 1 comment

clezag commented Jan 23, 2024

Browsing the AWS cost center, I've noticed that we've spent an unreasonable chunk of change on "EC2-Other" charges, which on closer inspection turned out to be almost exclusively networking-related:

[screenshot: AWS cost breakdown showing the EC2-Other charges]

Note that this is almost half our current AWS bill and we aren't really doing anything heavy yet.

Judging from this graph, it looks like it was most likely a temporary increase in usage:
[screenshot: network usage graph showing a temporary spike]

@christian-roggia @Luscha do you have an idea how to mitigate this?
Could this be caused by repeated image pulling during error states (imagePullPolicy: Always)?
Maybe the Filebeat/Elasticsearch log shipping during bursts of error logging?
Or do we have to rethink our networking setup?

@clezag clezag added this to the 1st Service on Production milestone Jan 23, 2024
christian-roggia (Contributor) commented

Almost always when we are looking at spikes in network-related costs one of the following is the root cause:

  • monitoring
  • logging
  • error retries with no exponential back-off (see the sketch after this list)
  • infinite loops
  • heavy network operations
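
On the retry point, here is a minimal sketch of what "with back-off" means in practice. The endpoint, timings, and attempt count are hypothetical, not taken from our services:

```python
import random
import time
import urllib.error
import urllib.request

# Hypothetical health-check endpoint, used only to illustrate the pattern.
URL = "https://api.example.com/health"

def fetch_with_backoff(url: str, max_attempts: int = 5) -> bytes:
    """Retry a flaky request with exponential back-off and jitter.

    A tight retry loop with no back-off can move a surprising amount of
    billable traffic while a dependency stays down.
    """
    for attempt in range(max_attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return resp.read()
        except urllib.error.URLError:
            if attempt == max_attempts - 1:
                raise
            # Wait 1s, 2s, 4s, 8s ... plus jitter so replicas don't retry in lockstep.
            time.sleep(2 ** attempt + random.uniform(0, 1))
    raise RuntimeError("unreachable")
```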

Image pulling should be fine, especially considering that Kubernetes internally applies an exponential back-off in case of failure and will reuse locally downloaded images unless Always is used as the image pull policy instead of IfNotPresent.
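
If you want to rule this out quickly, something like the following audit could help. This is a sketch using the official Kubernetes Python client and assumes local kubeconfig access to the cluster:

```python
# Requires: pip install kubernetes
from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# List every container that re-pulls its image on each pod start; this only
# matters cost-wise if those pods restart often (e.g. crash loops).
for pod in v1.list_pod_for_all_namespaces().items:
    for c in pod.spec.containers:
        if c.image_pull_policy == "Always":
            print(f"{pod.metadata.namespace}/{pod.metadata.name}: "
                  f"container {c.name} pulls {c.image} on every start")
```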

I would, however, verify that the current setup keeps communication within the VPC wherever possible, rather than going out to the public internet just to come back into the AWS network immediately after.

While investigating please take into account the following:

  • ingress data flow is usually free
  • egress is usually expensive if the data goes out of the AWS network
  • data transfer within the same region is usually quite cheap

https://aws.amazon.com/ec2/pricing/on-demand/
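
As a purely illustrative back-of-envelope (the per-GB rates below are assumptions, not quoted prices; the real rates are on the pricing page above), the gap between internet egress and in-region traffic adds up quickly:

```python
# Illustrative only: assumed unit prices, not an AWS quote.
gb_per_day = 500                   # hypothetical daily traffic volume
internet_egress_usd_per_gb = 0.09  # assumed rate for traffic leaving AWS
intra_region_usd_per_gb = 0.01     # assumed rate for cross-AZ traffic in-region

print(f"out to the internet:    ~${gb_per_day * internet_egress_usd_per_gb:.2f}/day")
print(f"kept within the region: ~${gb_per_day * intra_region_usd_per_gb:.2f}/day")
```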

This incident tells me it's probably a good time to set up proper monitoring and alerts. Root cause analysis is significantly easier when you have proper observability of the system and metrics you can work with.
