Browsing the AWS cost center, I noticed that we spent an unreasonable chunk of change on "EC2-other" charges, which on closer inspection turned out to be almost exclusively networking-related:
Note that this is almost half our current AWS bill and we aren't really doing anything heavy yet.
Judging by this image, it was likely a temporary increase in usage:
@christian-roggia @Luscha do you have an idea how to mitigate this?
Could this be caused by repeated image pulling during error states (image pull policy set to Always)?
Maybe the filebeat/elasticsearch logs during busy error logging?
Or do we have to rethink our networking setup?
Almost always when we are looking at spikes in network-related costs one of the following is the root cause:
- monitoring
- logging
- error retries with no exponential back-off
- infinite loops
- heavy network operations
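For the "retries with no exponential back-off" case, here is a minimal sketch of what a bounded retry with back-off and jitter could look like. The function name `retry_with_backoff` and all of its parameters are hypothetical and not taken from our codebase; the defaults are placeholders, not recommendations.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5,
                       max_delay=30.0, sleep=time.sleep):
    """Call operation(); on failure, wait exponentially longer before retrying.

    All names and defaults here are illustrative assumptions.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # The upper bound doubles each attempt, capped at max_delay.
            cap = min(max_delay, base_delay * (2 ** attempt))
            # "Full jitter": sleep a random amount in [0, cap] so that many
            # clients failing at once do not retry in lockstep.
            sleep(random.uniform(0, cap))
```

The point is that a failing dependency gets at most `max_attempts` requests per call, spaced further and further apart, instead of a tight loop that generates traffic for as long as the error persists.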
Image pulling should be fine: Kubernetes applies an exponential back-off internally when a pull fails, and it reuses locally downloaded images when the image pull policy is IfNotPresent rather than Always.
I would however verify that the current setup always tries to keep communication within the VPC rather than going out to the public internet just to go back into the AWS network immediately after.
While investigating please take into account the following:
- ingress data flow is usually free
- egress is usually expensive if the data goes out of the AWS network
- data transfer within the same region is usually quite cheap
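To check whether traffic stays inside the VPC, one option is to bucket bytes by destination address. The sketch below assumes you already have (destination IP, byte count) pairs from somewhere, for example exported VPC Flow Logs; the CIDR `10.0.0.0/16` and the function names are hypothetical placeholders, not our actual setup.

```python
import ipaddress
from collections import Counter

# Hypothetical VPC CIDR; replace with the real one.
VPC_CIDR = ipaddress.ip_network("10.0.0.0/16")


def classify_destination(ip: str) -> str:
    """Bucket a destination IP as in-VPC, other private range, or public."""
    addr = ipaddress.ip_address(ip)
    if addr in VPC_CIDR:
        return "in-vpc"
    if addr.is_private:
        return "private-other"
    return "public"


def bytes_by_destination(flows):
    """Sum byte counts per bucket; flows is an iterable of (dst_ip, nbytes)."""
    totals = Counter()
    for dst, nbytes in flows:
        totals[classify_destination(dst)] += nbytes
    return dict(totals)
```

A large "public" total does not by itself prove expensive egress, since a public address can still belong to an AWS service, but it shows which flows are worth a closer look.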
This incident tells me it's probably a good time to set up proper monitoring and alerts. Root cause analysis is significantly easier when you have proper observability of the system and metrics you can work with.