Pages reviewed and updated where appropriate.
Stephen James committed Oct 20, 2023
1 parent 99169bd commit 66a6248
Showing 22 changed files with 178 additions and 171 deletions.
@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: 000 - Record architecture decisions
last_reviewed_on: 2023-10-05
last_reviewed_on: 2023-10-20
review_in: 3 months
---

@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: 001 - Use BIND DNS for device name resolution
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -17,7 +17,7 @@ Staff devices e.g. laptops and desktops connected to our network will need [DNS]

There is a requirement that this service is able to automatically scale (both up and down) to cope with varying load levels during the course of the day.

There is a limitation around using the fully managed AWS Route53 DNS service as it does not support DNS forwarding.
There is a limitation around using the fully managed AWS Route53 DNS service as it does not support DNS forwarding.

**Dec 2021 Update** Route53 can now forward DNS requests e.g. [PDNS](https://www.ncsc.gov.uk/information/pdns)

@@ -1,14 +1,14 @@
---
owner_slack: "#nvvs-devops"
title: 002 - Use Cloud Platform to host DHCP and DNS
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

# 002 - Use Cloud Platform to host DHCP and DNS
Date: 2020-05-22

## Status
## Status
❌ Rejected

## Context
@@ -21,7 +21,7 @@ After [investigations](https://github.com/ministryofjustice/cloud-platform/issue

**Update 6th January 2021**

The Cloud Platform `live` cluster is now running on Kubernetes 1.20 which should allow TCP and UDP on the network load balancer
The Cloud Platform `live` cluster is now running on Kubernetes 1.20 which should allow TCP and UDP on the network load balancer

([see issue here](https://github.com/ministryofjustice/cloud-platform/issues/1897#issuecomment-1006539120))

@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: 003 - Use AWS Elastic Container Service for DHCP DNS
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -29,4 +29,4 @@ Less administrative overhead than running virtual machines e.g. EC2 and less com

### Disadvantages

Still need to provision the service, require CI/CD tooling, operational documentation and forever maintaining those things.
Still need to provision the service, require CI/CD tooling, operational documentation and forever maintaining those things.
@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: 004 - Use AWS CodePipelines for CI/CD
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: 005 - Use Log Aggregation Platform
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -74,9 +74,9 @@ The Operational Security Logging Platform is ready to accept these logs and the

### Advantages

- We don't need to stand up our own logging infrastructure
- We don't need to stand up our own logging infrastructure
- Availability of logs from different sources in one location.

### Disadvantages

- Reliant on another team which means we may need to wait sometime before we get an aggregated view of our logs.
- Reliant on another team which means we may need to wait sometime before we get an aggregated view of our logs.
@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: 006 - Use AWS Parameter Store for Secrets
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -17,14 +17,14 @@ There is a need to store infrastructure secrets securely in the [PTTP](https://m

## Decision

Use AWS SSM Parameter Store.
Use AWS SSM Parameter Store.
- Aligned with [MoJ Security Guidance](https://security-guidance.service.justice.gov.uk/secrets-management/#application--infrastructure-secrets)
- Compatible with AWS services e.g. [CodePipelines](https://docs.aws.amazon.com/codebuild/latest/userguide/build-spec-ref.html#build-spec-ref-example)
- The use of AWS Secrets Manager can easily be extended if required.

### Alternative Considerations:
### Alternative Considerations:
#### AWS Secrets Manager
AWS Secrets Manager has ability to automatically rotate secrets for AWS RDS access. AWS Secrets Manager has a higher cost than AWS SSM Parameter Store.

#### HashiCorp Vault
HashiCorp Vault is an open-source secret management solution. In order to use it we would have to host and manage an instance of the service ourselves. The cost of hosting, as well as the time to ensure data has appropriate backups, gives this service a high maintenance cost and overhead.
HashiCorp Vault is an open-source secret management solution. In order to use it we would have to host and manage an instance of the service ourselves. The cost of hosting, as well as the time to ensure data has appropriate backups, gives this service a high maintenance cost and overhead.
@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: 007 - Use Prometheus and Grafana for metrics and alerting
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -33,4 +33,4 @@ Use [Prometheus](https://prometheus.io/) for metrics and [Grafana](https://grafa
- Prometheus [Exporters](https://prometheus.io/docs/instrumenting/exporters/) allow collection of metrics from network devices using [SNMP](https://github.com/prometheus/snmp_exporter), as well as the many [native](https://prometheus.io/docs/instrumenting/exporters/#software-exposing-prometheus-metrics) applications
- Grafana to visualise a [wide variety](https://grafana.com/docs/grafana/latest/datasources/) of sources.
- Grafana can send notifications when a custom metric thresholds. Can be easily integrated into Slack (when availble ServiceNow)
- Can be deployed into our existing CI/CD pipelines used for DHCP/DNS.
- Can be deployed into our existing CI/CD pipelines used for DHCP/DNS.
@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: 008 - Use AWS Elastic Container Registry
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -22,4 +22,4 @@ Created issue [here](https://github.com/ministryofjustice/nvvs-devops/issues/96)
## Decision

We will use AWS Elastic Container Registry to store our images.
- It integrates with CodePipelines and existing workflows and will remove the limits we have been hitting..
- It integrates with CodePipelines and existing workflows and will remove the limits we have been hitting..
@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: 009 - Use AWS SSO for AWS Account Access
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -14,13 +14,13 @@ Date: 2021-05-01

## Context

We need to use Single Sign On to access all our AWS accounts.
We need to use Single Sign On to access all our AWS accounts.
We currently use AzureAD for securing access to many of our services.

## Decision

We will use the [Modernisation Platforms](https://github.com/ministryofjustice/modernisation-platform) implementation of [AWS Single Sign On](https://user-guide.modernisation-platform.service.justice.gov.uk/concepts/environments/single-sign-on.html#single-sign-on). It is being used by many teams already so means less development time forour growing team. It does require the use of a MoJ Org GitHub account, but that requirement only further facilitates using [infrastructure as code](https://en.wikipedia.org/wiki/Infrastructure_as_code) within our AWS accounts.

### Alternative Considerations:
### Alternative Considerations:
#### AzureAD
AzureAD is currently managed externally, this means that automating user and groups is not possible which limits its potential.
AzureAD is currently managed externally, this means that automating user and groups is not possible which limits its potential.
@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: 010 - Use AWS EKS for monitoring infrastructure
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -13,20 +13,20 @@ Date: 2021-03-22

## Context

The infrastructure monitoring and alerting platform consists of several services deployed as docker containers. So far these containers have been running on ECS via Fargate, chosen because of the relative ease with which it allows us to get instances provisioned.
The infrastructure monitoring and alerting platform consists of several services deployed as docker containers. So far these containers have been running on ECS via Fargate, chosen because of the relative ease with which it allows us to get instances provisioned.

As the solution has grown, and the interactions between new services have become more complex, we have found that we are running up against Fargate's limitations and require finer-grained control over our deployments.
As the solution has grown, and the interactions between new services have become more complex, we have found that we are running up against Fargate's limitations and require finer-grained control over our deployments.

Kubernetes is the industry standard platform for orchestrating and running container based workloads and provides considerably more flexibility in comparison to ECS and Fargate.


## Decision

Starting with Prometheus and Thanos, we are migrating our services over to AWS's managed Kubernetes offering - [Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/eks/).

## Consequences

While it has the potential to be more complicated due to its increased flexibility, we believe that in the long run, Kubernetes will simplify the operation, maintenance, and improvement of the IMA platform.
While it has the potential to be more complicated due to its increased flexibility, we believe that in the long run, Kubernetes will simplify the operation, maintenance, and improvement of the IMA platform.
It offers several advantages over Fargate:

- Better networking support out of the box enabling:
@@ -36,6 +36,6 @@ It offers several advantages over Fargate:
- faster development cycle
- Simpler and clearer configuration
- Less reliance on specific infrastructure (could conceivably run on any Kubernetes cluster, regardless of the provider)
- Reduced overall costs as the team can share the same development Kubernetes cluster
- Reduced overall costs as the team can share the same development Kubernetes cluster
- More aligned with common DevOps approaches in wider industry
- The infrastructure will be ready to migrate to another hosting platform like Cloud Platform in the future. ([see issue here](https://github.com/ministryofjustice/cloud-platform/issues/3454))
- The infrastructure will be ready to migrate to another hosting platform like Cloud Platform in the future. ([see issue here](https://github.com/ministryofjustice/cloud-platform/issues/3454))
@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: 011 - Use GitHub Actions for CI/CD
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: 012 - Use Tech-Docs for ADRs
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -12,12 +12,12 @@ Date: 2021-03-22
✅ Accepted

## Context
We want to make sure our architectural design records are reviewed reguarly.
We want to make sure our architectural design records are reviewed reguarly.
If we move our ADRs to tech-docs we can take advantage of [Tech Docs Monitor](https://github.com/ministryofjustice/tech-docs-monitor)

## Decision
It has been decided that we use [TechDocs](https://github.com/ministryofjustice/template-documentation-site#readme) for ADRs.

### Advantages

- We will use [Tech Docs Monitor](https://github.com/ministryofjustice/tech-docs-monitor) to remind team to review ADRs and promote knowledge transfer and discussion.
- We will use [Tech Docs Monitor](https://github.com/ministryofjustice/tech-docs-monitor) to remind team to review ADRs and promote knowledge transfer and discussion.
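
Reviewer note: most hunks in this commit only bump `last_reviewed_on` in the page front matter. As a rough illustration of how a review due date could be derived from the `last_reviewed_on` and `review_in` fields, here is a minimal Python sketch. It is not the Tech Docs Monitor implementation; the function names `add_months` and `review_due` are hypothetical, and it assumes `review_in` is always expressed in whole months, as it is in the files touched here.

```python
import calendar
import re
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month,
    so 30 November plus 3 months gives the end of February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def review_due(front_matter: str) -> date:
    """Compute the review due date from a front matter block.

    Assumption: only intervals such as "3 months" or "6 months"
    are handled; any other unit raises ValueError.
    """
    reviewed = re.search(
        r"^last_reviewed_on:\s*(\d{4}-\d{2}-\d{2})\s*$", front_matter, re.M
    )
    interval = re.search(
        r"^review_in:\s*(\d+)\s+months?\s*$", front_matter, re.M
    )
    if reviewed is None or interval is None:
        raise ValueError("front matter lacks last_reviewed_on or review_in")
    return add_months(
        date.fromisoformat(reviewed.group(1)), int(interval.group(1))
    )


if __name__ == "__main__":
    example = """---
owner_slack: "#nvvs-devops"
title: 012 - Use Tech-Docs for ADRs
last_reviewed_on: 2023-10-20
review_in: 6 months
---"""
    print(review_due(example))
```

Under these assumptions, a page reviewed on 2023-10-20 with `review_in: 6 months` would next fall due on 2024-04-20, and the pages set to `review_in: 3 months` would fall due on 2024-01-20.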
2 changes: 1 addition & 1 deletion source/documentation/adrs/adr-index.html.md.erb
@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: Architecture Decision Records index
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: Our Alliance
last_reviewed_on: 2023-06-21
last_reviewed_on: 2023-10-20
review_in: 3 months
---

11 changes: 5 additions & 6 deletions source/documentation/products/backups.html.md.erb
@@ -1,7 +1,7 @@
---
owner_slack: "#nvvs-devops"
title: Backups
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 6 months
---

@@ -15,16 +15,15 @@ Product | Type of backup | Retention
---|---|---|
DHCP/DNS Admin Portal | RDS (MySQL) | 30 Days
DHCP KEA | RDS (MySQL) | 30 Days |
IMA | RDS (Postgres) | 7 Days |
NACS Admin | RDS (MySQL) | 30 Days |
IMA | RDS (Postgres) | 7 Days |
NACS Admin | RDS (MySQL) | 30 Days |

## Backup configuration
## Backup configuration

The backup retention is defined as code in the following locations.

Product | variable | database setting
---|---|---|
DHCP/DNS Admin Portal | https://github.com/ministryofjustice/staff-device-dns-dhcp-infrastructure/blob/main/variables.tf#L52) | https://github.com/ministryofjustice/staff-device-dns-dhcp-infrastructure/blob/main/modules/admin/db.tf#L18 |
DHCP KEA | https://github.com/ministryofjustice/staff-device-dns-dhcp-infrastructure/blob/main/variables.tf | https://github.com/ministryofjustice/staff-device-dns-dhcp-infrastructure/blob/main/modules/dhcp/mysql.tf#L10 |
NACS Admin | https://github.com/ministryofjustice/network-access-control-infrastructure/blob/main/variables.tf#L48 | https://github.com/ministryofjustice/network-access-control-infrastructure/blob/main/modules/admin/db.tf#L15 |

NACS Admin | https://github.com/ministryofjustice/network-access-control-infrastructure/blob/main/variables.tf#L48 | https://github.com/ministryofjustice/network-access-control-infrastructure/blob/main/modules/admin/db.tf#L15 |
14 changes: 6 additions & 8 deletions source/documentation/products/dhcp.html.md.erb
@@ -1,6 +1,6 @@
---
title: DHCP Overview
last_reviewed_on: 2023-04-11
last_reviewed_on: 2023-10-20
review_in: 3 months
---

@@ -18,22 +18,22 @@ Allows Public internet connectivity for prison staff and enables modern devices.

Enable onsite support staff to manage local devices e.g. [DHCP reservation](https://kb.isc.org/docs/what-are-host-reservations-how-to-use-them) using GOV.UK Design System styles and patterns.

[Use cloud first](https://www.gov.uk/guidance/use-cloud-first) To meet point 5 of the [Technology Code of Practice](https://www.gov.uk/guidance/the-technology-code-of-practice) (TCoP) and the government’s cloud first policy.
[Use cloud first](https://www.gov.uk/guidance/use-cloud-first) To meet point 5 of the [Technology Code of Practice](https://www.gov.uk/guidance/the-technology-code-of-practice) (TCoP) and the government’s cloud first policy.

[Infrastructure as Code](https://en.wikipedia.org/wiki/Infrastructure_as_code) provides a complete audit of changes, versioning of cloud infrastructure and DNS server application, automated testing and redeployment of the service in the event of disaster.
[Infrastructure as Code](https://en.wikipedia.org/wiki/Infrastructure_as_code) provides a complete audit of changes, versioning of cloud infrastructure and DNS server application, automated testing and redeployment of the service in the event of disaster.

## Tools

The DHCP service uses [ISC KEA](https://www.isc.org/kea/) containers running on [AWS ECS Fargate](https://docs.aws.amazon.com/AmazonECS/latest/userguide/what-is-fargate.html).
We use [Terraform](https://www.terraform.io/intro) and [Infrastructure as Code](https://en.wikipedia.org/wiki/Infrastructure_as_code) to provide a complete audit of changes, versioning of components and the DNS server application, automated testing and redeployment of the service in the event of disaster.
We use [Terraform](https://www.terraform.io/intro) and [Infrastructure as Code](https://en.wikipedia.org/wiki/Infrastructure_as_code) to provide a complete audit of changes, versioning of components and the DNS server application, automated testing and redeployment of the service in the event of disaster.

## Diagram
![High level diagram](../../images/dhcp-hld-diagram.jpeg)
[diagram source](https://github.com/ministryofjustice/staff-device-dhcp-server#high-level-diagram)

## Repositories

| Repository | Description |
| Repository | Description |
| --- | --- |
| [DHCP admin portal](https://github.com/ministryofjustice/staff-device-dns-dhcp-admin#readme) | Admin Portal for managing staff device DNS forwarders and zone configuration. |
| [DHCP server](https://github.com/ministryofjustice/staff-device-dhcp-server#readme) | This repository contains the Dockerfile to create the ISC DHCP server Docker image. The configuration for this server is managed in the Admin Portal. |
@@ -43,10 +43,8 @@ We use [Terraform](https://www.terraform.io/intro) and [Infrastructure as Code](

## Useful links

| Link | Description |
| Link | Description |
| --- | --- |
| [DHCP admin portal](https://dhcp-dns-admin.staff.service.justice.gov.uk/dhcp) | Admin Portal for managing staff device DNS forwarders and zone configuration. *Please not you need to be a member of the AzureAD group `MoJO-EntApp-DNSDHCP_Viewer` to view and `MoJO-EntApp-DNSDHCP_Editor` to edit.* |
| [Monitoring and alerting guide](product-monitoring-alerting.html) | List Grafana dashboards for health of the products and slack channels in use for alerts. |
| [Transit gateway ](https://github.com/ministryofjustice/deployment-tgw) | Connects the service to wider MoJ networks as a virtual WAN |

