Merge branch 'master' into keda-prometheus

Phantom-Intruder authored Nov 8, 2024
2 parents 8ec9e6d + 14f8ea1 commit 0af89da
Showing 10 changed files with 227 additions and 30 deletions.
3 changes: 2 additions & 1 deletion Autoscaler101/autoscaler-lab.md
@@ -120,6 +120,7 @@ Watch the pods, and you will see that the resource limits are reached, after whi

Now that we have taken a complete look at the vertical pod autoscaler, let's take a look at the HPA. Create a file called nginx-hpa.yml and paste the contents below into it.

```
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
@@ -177,4 +178,4 @@ You should be able to see the memory limit getting reached, after which the numb

## Conclusion

That sums up the lab on autoscalers. Here, we discussed the two most commonly used in-built autoscalers, HPA and VPA, and took a hands-on look at how they work. This is just the tip of the iceberg when it comes to scaling, however, and the subject of custom scalers that can scale based on metrics other than memory and CPU is vast. If you are interested in more complicated scaling techniques, take a look at the [KEDA section](../Keda101/what-is-keda.md) to get an idea of the KEDA autoscaler.
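For reference, since the nginx-hpa.yml manifest is truncated above, a complete HorizontalPodAutoscaler of that shape might look like the following sketch (the target deployment name, replica bounds, and threshold are illustrative, not necessarily the lab's exact values):

```
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx          # illustrative target deployment
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 50   # scale out past 50% average memory utilization
```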
16 changes: 14 additions & 2 deletions EKS101/what-is-eks.md
@@ -56,12 +56,18 @@ and if you can see the 2 nodes, then you are all set.

Now that you have the entire cluster running on AWS, there are some things you may want to tweak to your liking. First is the security group. While eksctl creates a default security group that has all the permissions needed to run your EKS cluster, it's best if you go back in and take another look at it. Ensure that your inbound rules do not allow 0.0.0.0/0, which would allow all external IPs to connect to your EKS ports. Instead, allow only the IPs that you want to access your cluster from, by specifying the proper CIDR ranges and their associated ports. On the other hand, with outbound rules, allowing 0.0.0.0/0 is fine, since this allows your cluster to communicate with any resource outside your network.
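For instance, a rule allowing HTTPS access from a single trusted range could be added like this (the group ID and CIDR below are placeholders):

```
# Allow inbound TCP 443 only from a known CIDR range (placeholder values)
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 443 \
  --cidr 203.0.113.0/24
```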

The next thing you can look at is the nodegroups. Since you specified `t2.micro` in the above command, your nodegroups will be created with that machine type. You can use the AWS console to add nodegroups with specific tolerations so that only certain pods get scheduled on these nodes. You can read more about taints and tolerations in the [Scheduler101 section](../Scheduler101/Nodes_taints_and_tolerations.md). You can also check the Kubernetes version that is used in your cluster from here. If you followed the above tutorial, you will have a cluster with Kubernetes version 1.24. You can update this version from the console. However, note that a lot of things vary from version to version, and you might end up getting something in your existing application broken if you blindly update your Kubernetes version.
The next thing you can look at is the node groups. Since you specified `t2.micro` in the above command, your node groups will be created with that machine type. You can use the AWS console to add node groups with specific taints so that only pods that tolerate them get scheduled on these nodes. You can read more about taints and tolerations in the [Scheduler101 section](../Scheduler101/Nodes_taints_and_tolerations.md). You can also check the Kubernetes version that is used in your cluster from here. If you follow the above tutorial, you will have a cluster with Kubernetes version 1.24. You can update this version from the console. However, note that a lot of things vary from version to version, and you might end up breaking something in your existing application if you blindly update your Kubernetes version. That said, updating the Kubernetes version is certainly important, as AWS ends standard support for older Kubernetes versions (after a generous grace period). After this, the version enters extended support for another year, during which support is subject to additional fees.
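If you prefer the CLI to the console, the same upgrade can be driven through eksctl, one minor version at a time (the cluster name and target version are placeholders):

```
# Upgrade the control plane by one minor version; without --approve this is a dry run
eksctl upgrade cluster --name=my-cluster --version=1.25 --approve
```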

On the topic of updating, you will also notice an AMI version mentioned for each node group. Since you created this cluster recently, you will have the latest AMI version. However, AMIs get updated around twice each month, and while there won't be any major issues if you don't keep your AMIs updated, it is good to update as frequently as possible. Unlike updating the Kubernetes version, AMI updates are relatively safe since they only update the OS to have the latest packages specified by the AWS team. The update can be performed either as a rolling update or as a forced update. A rolling update will create a new node with the new AMI version and move all the pods from the old node to the new node before the old pods are drained and the old node is deleted. A forced update will immediately destroy the old node and start up a new node. The advantage of this method is that it is much faster and will always complete, whereas a rolling update takes much longer and may fail to finish if any pods fail to drain.
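For managed node groups, both variants can also be triggered from the CLI (cluster and node group names below are placeholders):

```
# Rolling update to the latest AMI release for the node group (placeholder names)
aws eks update-nodegroup-version --cluster-name my-cluster --nodegroup-name ng-1

# Forced update: replace nodes even if pods cannot be drained gracefully
aws eks update-nodegroup-version --cluster-name my-cluster --nodegroup-name ng-1 --force
```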

Another thing to consider is cost tagging. In a large organization, you would have multiple AWS resources that contribute to a large bill at the end of the month. Usually, teams involved in costing want to know exactly where the costs come from. If you were dealing with a resource such as an EC2 instance, you would not have to look deeply into this: you can just go into the cost explorer, filter by service, and ask for the cost of the EC2 instances, which gives you an exact figure for how much you spend on those resources. However, this becomes much more complicated with an EKS cluster. Not only do you have EC2 instances running in EKS clusters, but you are also paying for the control plane. Additionally, you pay for EC2 resources such as load balancers and data transfer, along with a host of other things. To fully capture the total cost of your EKS cluster, you must use [cost allocation tags](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/cost-alloc-tags.html).

First, go to your EKS cluster on the AWS console and add a tag with a value. Next, head over to each of your node groups and add the same key-value pair to them. You can also use the same tags on any EC2 instances that have been spun up by the node group, but if your cluster scales down and comes back up at a later point, this will create brand-new EC2 instances that won't have the tag on them. Therefore, it is better to head over to the Auto Scaling groups section in your AWS console, select the group that corresponds to your EKS cluster, and add the tags there. Also, make sure you select the option to have the tags automatically added to any new EC2 instances that get spun up by the ASG.
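The same thing can be scripted; a sketch with placeholder resource ID, key, and value:

```
# Tag the Auto Scaling group behind the node group; PropagateAtLaunch=true
# makes new instances launched by the ASG inherit the tag automatically
aws autoscaling create-or-update-tags --tags \
  "ResourceId=eks-ng-1-asg,ResourceType=auto-scaling-group,Key=cost-center,Value=platform,PropagateAtLaunch=true"
```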

Next, take a look at the IAM role used by the cluster in the overview section. eksctl will have already given you the ideal level of permissions in the IAM role, so there is not much you would want to remove here. However, if you want to allow your cluster to access any additional items, this is the point to add those permissions. The networking section shows you information about the network your cluster is in, including the IPv4 range, subnets, and security group. You can also manage access to the cluster endpoint from here.
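For example, granting the worker nodes read access to S3 might look like this (the role name is a placeholder; eksctl generates its own role names):

```
# Attach an extra managed policy to the node role (placeholder role name)
aws iam attach-role-policy \
  --role-name my-cluster-node-instance-role \
  --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess
```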

The add-ons section allows you to get add-ons for your EKS cluster from the AWS Marketplace, and the observability section is where you would enable CloudWatch Container Insights to get metrics and reports on your containers. Of course, if you want to go beyond what AWS provides, you could go for tools such as Prometheus, which give you finer-grained control as well as better cross-platform integration. With that, we have covered pretty much every additional thing you can do with your EKS cluster.

## Cleaning up

Now, remember that all of the above are AWS resources, and as such, you will be charged if you leave them running after you are done. This means you have a bunch of stuff (VPCs, the cluster, EC2 instances) to get rid of, which would have been a pain to do manually. However, since eksctl created all these resources for you, it can also get rid of them for you in the same manner, using a single command:
@@ -92,7 +98,13 @@ eksctl create cluster --fargate

One thing to note is that running your containers on Fargate means that you will not have any control over the infrastructure they run on, since all of that is managed by AWS. If you need the environment the container runs in to be specific, EC2 instances are still your best option, and you might want to start considering node groups.

Your Kubernetes cluster consists of nodes, and nodegroups, as the name implies, groups the nodes together. You can group several nodes into a single group in a way that makes logical sense, and have the nodegroup automatically manage itself. So you will still be using EC2 instances, but the Nodegroup will be creating, provisioning, and deleting the instances as needed. However, some features that Fargate offers such as scaling will no longer be available to you. So we can consider it a good middle group between manageability and flexibility.
## Node groups

Your Kubernetes cluster consists of nodes, and node groups, as the name implies, group those nodes together. You can group several nodes into a single group in a way that makes logical sense and have the node group manage itself automatically. So you will still be using EC2 instances, but the node group will be creating, provisioning, and deleting the instances as needed. In short, it handles scaling as required by the resources in your cluster. This is especially important if your cluster doesn't have a steady workload throughout the day. For instance, if the amount of resources used at the peak of the day is around 3 or 4 times the amount used during off-peak hours, you can create a node group with a minimum of 1 node and a maximum of 4 nodes, meaning that EKS will automatically scale between those bounds depending on load. This helps you save costs without sacrificing performance. However, you will notice that EKS already does all this. By default, you already have a node group up and running, so why would you want multiple groups?

This is where node taints and tolerations come in. You probably know what taints and tolerations are, and how nodes can be tainted so that only pods with matching tolerations get scheduled on them. The same concept applies here, except now you get to apply taints to entire node groups. Once a node group has a taint applied, any nodes that are created from this node group will carry that taint. This is a vital part of more complex autoscaling (for example, if you were using an autoscaler like [KEDA](../Keda101/what-is-keda.md)). If you are running multiple KEDA-scaled jobs, you would not want to schedule all of the applications on the same node group. This could lead to resource starvation for some nodes while others consume more than their share. To counter this, you could create a node group per application and use taints and tolerations to make sure that any jobs belonging to an application only get allocated to their designated node group, as shown in the sketch below.
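As a sketch of what this looks like in an eksctl config file (the names, sizes, and taint key/value are illustrative):

```
# cluster.yaml -- illustrative node group with a taint applied
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: my-cluster
  region: us-east-1
managedNodeGroups:
  - name: keda-jobs
    instanceType: t2.micro
    minSize: 1
    maxSize: 4
    taints:
      - key: dedicated
        value: keda-jobs
        effect: NoSchedule
```

Pods belonging to that application would then carry a matching toleration (and typically a node selector or affinity) so they land only on these nodes.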

However, some features that Fargate offers, such as scaling, will no longer be available to you, so we can consider node groups a good middle ground between manageability and flexibility.

As one last thing before we finish, I would like to point out that another possibility is to have both Fargate and EC2 instances working for the same cluster. That is, you can create EC2 instances for the nodes that you need fine-grained control over while allowing Fargate to handle any other infrastructure that just needs to run, no matter how or where.

49 changes: 29 additions & 20 deletions Gemfile.lock
@@ -8,34 +8,44 @@ PATH
GEM
remote: https://rubygems.org/
specs:
activesupport (7.0.7.2)
activesupport (7.1.3.4)
base64
bigdecimal
concurrent-ruby (~> 1.0, >= 1.0.2)
connection_pool (>= 2.2.5)
drb
i18n (>= 1.6, < 2)
minitest (>= 5.1)
mutex_m
tzinfo (~> 2.0)
addressable (2.8.4)
public_suffix (>= 2.0.2, < 6.0)
ast (2.4.2)
base64 (0.2.0)
bigdecimal (3.1.8)
coffee-script (2.4.1)
coffee-script-source
execjs
coffee-script-source (1.11.1)
colorator (1.1.0)
commonmarker (0.23.10)
concurrent-ruby (1.2.2)
dnsruby (1.70.0)
connection_pool (2.4.1)
dnsruby (1.72.2)
simpleidn (~> 0.2.1)
drb (2.2.1)
em-websocket (0.5.3)
eventmachine (>= 0.12.9)
http_parser.rb (~> 0)
ethon (0.16.0)
ffi (>= 1.15.0)
eventmachine (1.2.7)
execjs (2.8.1)
faraday (2.7.4)
faraday-net_http (>= 2.0, < 3.1)
ruby2_keywords (>= 0.0.4)
faraday-net_http (3.0.2)
execjs (2.9.1)
faraday (2.10.1)
faraday-net_http (>= 2.0, < 3.2)
logger
faraday-net_http (3.1.1)
net-http
ffi (1.15.5)
forwardable-extended (2.6.0)
gemoji (3.0.1)
@@ -218,15 +228,17 @@ GEM
listen (3.8.0)
rb-fsevent (~> 0.10, >= 0.10.3)
rb-inotify (~> 0.9, >= 0.9.10)
logger (1.6.0)
mercenary (0.3.6)
minima (2.5.1)
jekyll (>= 3.5, < 5.0)
jekyll-feed (~> 0.9)
jekyll-seo-tag (~> 2.1)
minitest (5.19.0)
nokogiri (1.14.3-arm64-darwin)
racc (~> 1.4)
nokogiri (1.14.3-x86_64-linux)
minitest (5.25.1)
mutex_m (0.2.0)
net-http (0.4.1)
uri
nokogiri (1.16.5-x86_64-linux)
racc (~> 1.4)
octokit (4.25.1)
faraday (>= 1, < 3)
@@ -237,13 +249,14 @@ GEM
pathutil (0.16.2)
forwardable-extended (~> 2.6)
public_suffix (4.0.7)
racc (1.6.2)
racc (1.7.3)
rainbow (3.1.1)
rb-fsevent (0.11.2)
rb-inotify (0.10.1)
ffi (~> 1.0)
regexp_parser (2.8.0)
rexml (3.2.5)
rexml (3.3.6)
strscan
rouge (3.26.0)
rubocop (0.93.1)
parallel (~> 1.10)
@@ -257,7 +270,6 @@ GEM
rubocop-ast (1.28.1)
parser (>= 3.2.1.0)
ruby-progressbar (1.13.0)
ruby2_keywords (0.0.5)
rubyzip (2.3.2)
safe_yaml (1.0.5)
sass (3.7.4)
@@ -268,18 +280,16 @@ GEM
sawyer (0.9.2)
addressable (>= 2.3.5)
faraday (>= 0.17.3, < 3)
simpleidn (0.2.1)
unf (~> 0.1.4)
simpleidn (0.2.3)
strscan (3.1.0)
terminal-table (1.8.0)
unicode-display_width (~> 1.1, >= 1.1.1)
typhoeus (1.4.0)
ethon (>= 0.9.0)
tzinfo (2.0.6)
concurrent-ruby (~> 1.0)
unf (0.1.4)
unf_ext
unf_ext (0.0.8.2)
unicode-display_width (1.8.0)
uri (0.13.0)
w3c_validators (1.3.7)
json (>= 1.8)
nokogiri (~> 1.6)
@@ -288,7 +298,6 @@ GEM
yell (2.2.2)

PLATFORMS
arm64-darwin-22
x86_64-linux

DEPENDENCIES
4 changes: 2 additions & 2 deletions Helm101/chart-hooks.md
@@ -18,9 +18,9 @@ If you want to run 2 (or more) hooks, then there should be a way for you to spec

A full list of possible chart hooks can be found in the Helm [official documentation](https://helm.sh/docs/topics/charts_hooks/#the-available-hooks). In this case, we can consider the process of what happens when a Helm chart is installed. Helm has two possible hooks for use in this case, ```pre-install``` and ```post-install```. Pre-install executes after any templates are rendered and before any resources are loaded into Kubernetes. Post-install runs after everything is in place.

If you were to use a pre-install hook, the normal install process would go on until the templates are rendered. At that point, Helm starts loading your hooks and waits until they are ready before loading resourecs into Kubernetes. Going back to our example where the hook has a pod declared in it, the pod will spin up at this point, run to completion, and then finish.
If you were to use a pre-install hook, the normal install process would go on until the templates are rendered. At that point, Helm starts loading your hooks and waits until they are ready before loading resources into Kubernetes. Going back to our example where the hook has a pod declared in it, the pod will spin up at this point, run to completion, and then finish.

If a post-install hook was also in place, this would come into effect after resoureces have finished loading. The post-install hooks would run and Helm would wait until these hooks are ready before continuing. Something you should note is that Helm expect the process declared in the hook to finish, and will halt everything until that point is reached. If there is an error here, the operation will fail.
If a post-install hook was also in place, this would come into effect after resources have finished loading. The post-install hooks would run, and Helm would wait until these hooks are ready before continuing. Something you should note is that Helm expects the process declared in the hook to finish, and will halt everything until that point is reached. If there is an error here, the operation will fail.
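As a minimal sketch of how a hook is declared, placed under the chart's templates/ directory (the pod name, image, and command are placeholders; the annotation keys are the ones Helm looks for):

```
apiVersion: v1
kind: Pod
metadata:
  name: pre-install-check
  annotations:
    "helm.sh/hook": pre-install
    "helm.sh/hook-delete-policy": hook-succeeded   # clean up the pod once it succeeds
spec:
  restartPolicy: Never
  containers:
    - name: check
      image: busybox
      command: ["sh", "-c", "echo running pre-install checks"]
```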

## Creating a hook

2 changes: 1 addition & 1 deletion README.md
@@ -382,7 +382,7 @@ Add the following entry for local access
url: http://127.0.0.1:4000
```

## Step 2. Run the container
## Step 3. Run the container


```
20 changes: 20 additions & 0 deletions ReplicationController101/ReplicationController.yaml
@@ -0,0 +1,20 @@
apiVersion: v1
kind: ReplicationController
metadata:
  name: nginxrc
  labels:
    app: nginx
spec:
  replicas: 2
  selector: #optional
    team: dev
  template: #pod template
    metadata:
      labels:
        team: dev
    spec:
      containers:
      - name: nginxcont
        image: nginx
        ports:
        - containerPort: 80
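To try the controller out, something like the following would do (assuming kubectl is already pointed at a cluster):

```
# Create the replication controller and confirm both replicas come up
kubectl apply -f ReplicationController101/ReplicationController.yaml
kubectl get rc nginxrc
kubectl get pods -l team=dev
```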