Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Increase API size #135

Merged
merged 2 commits into from
Oct 19, 2023
Merged

Increase API size #135

merged 2 commits into from
Oct 19, 2023

Conversation

slifty
Copy link
Contributor

@slifty slifty commented Oct 12, 2023

This PR increase the API machine to double the amount of processor capacity.

Resolves PermanentOrg/sftp-service#268

Copy link
Member

@cecilia-donnelly cecilia-donnelly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will you also do staging? Not a true blocker since I think we'll rebuild the images on main after this is merged anyway, but I don't want to forget.

@slifty
Copy link
Contributor Author

slifty commented Oct 12, 2023

I'm not entirely sure how to re-run terraform tasks from here, but one did fail for staging.

I think maybe that failed due to something else happening in Terraform at the same time; @cecilia-donnelly do you see anything about the change to staging that you would expect to cause an error?

@cecilia-donnelly
Copy link
Member

You know, @slifty, this is a real problem that only shows up on staging! @fenn-cs 's automated test script creates a subnet in Permanent's AWS that conflicts with the subnet search for staging, which is why this doesn't work. We ran into this a few months ago, too.

Last time I deleted the conflicting subnet since Fon wasn't running his tests then and had deleted the relevant instances. We could (1) do that again and bring it back up with the script afterward, (2) do that and also change the script so it uses a non-conflicting subnet, or (3) make the search in infrastructure a bit smarter so the subnets don't conflict anymore. Any of those is fine, though (3) requires a bit more careful testing.

@slifty
Copy link
Contributor Author

slifty commented Oct 13, 2023

@cecilia-donnelly I vote for 2 -- it sounds like a bug in the test scripts that should be resolved.

@fenn-cs is this something you are able to knock out so we can get this merged?

@nfebe
Copy link
Contributor

nfebe commented Oct 13, 2023

@slifty Yeah early next week, but if it's urgent I can check it out this weekend.

@nfebe
Copy link
Contributor

nfebe commented Oct 18, 2023

@slifty So I started by doing 1) above which is deleting the subnets, using terraform destroy from the same environment that created the subnet.

I have looked into how to make it "smart" and found the cidrsubnet which I have played with locally and realized the addressed calculated are still pretty predictable and is not all that "smart".

So,

  1. Either I have not figured out things correctly
  2. It's as it seems, which means I don't really need cidrsubnet and can just change the conflicting IP to another one different from the existing one.

I want us to test this, by you repeating the action, so I can see if the cidrsubnet works as it should when the first address is unavailable before concluding what goes into the PR for the rclone-iac

For now, there is not conflict and the deployment can continue.

Further context

Here is the problematic section:

https://github.com/OpenTechStrategies/permanent-rclone-iac/blob/057d71a8a515d0fb541d75155336a602725d1761/main.tf#L40-L47

resource "aws_vpc" "mainvpc" {
  cidr_block = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true
}

resource "aws_subnet" "public" {
  vpc_id = aws_vpc.mainvpc.id
  cidr_block = "10.0.0.0/24"
  availability_zone = "us-west-2a"
  tags = {
    Name = "Default subnet for us-west-2a"
  }
}

Here is what it should look like with cidrsubnet

resource "aws_vpc" "mainvpc" {
  cidr_block = "10.0.0.0/16"
  enable_dns_support   = true
  enable_dns_hostnames = true
}

resource "aws_subnet" "public" {
  vpc_id = aws_vpc.mainvpc.id
  cidr_block = cidrsubnet(aws_vpc.mainvpc.cidr_block, 8, 1) // The last paramameter as "0" leads to the exact same conflicting address
  availability_zone = "us-west-2a"
  tags = {
    Name = "Default subnet for us-west-2a"
  }
}

As part of our work exploring the impact of the sftp service, we
identified the need to increase the perfomance capability of the API
server in order to handle the volume of API calls made by SFTP users
[1].

Amazon provides a range of instance sizes.  This choice increases the
CPU capacity of the machine without expanding the memory capacity.  This
is because our initial benchmarks seemed to be pinning CPU usage but not
putting a particularly large dent in memory.

[1] PermanentOrg/sftp-service#268
@slifty slifty merged commit 7ce5670 into main Oct 19, 2023
3 checks passed
@slifty slifty deleted the noissue-api-size branch October 19, 2023 15:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Basline performance checks
3 participants