Split and update autostart/autostop doc #1762

Merged
merged 6 commits into from
Aug 14, 2024

2 changes: 1 addition & 1 deletion about/extensions.html.markerb
@@ -52,4 +52,4 @@ To fulfil our promise of a performant application in any Fly.io region, latency

Your Fly.io network is essentially a global, encrypted LAN, with DNS service discovery and load balancing built-in. This greatly simplifies configuration of services that cluster and gossip. Customers access your service via a single private IP address that automatically routes traffic to the nearest provider VM.

We also offer cost-cutting features such as [automatic VM stop/start based on incoming request volume](/docs/launch/autostart-stop/). Contact us at [[email protected]](mailto:[email protected]) to learn more - we're happy to help you get started!
We also offer cost-cutting features such as [automatic VM stop/start based on incoming request volume](/docs/launch/autostop-autostart/). Contact us at [[email protected]](mailto:[email protected]) to learn more - we're happy to help you get started!
2 changes: 1 addition & 1 deletion apps/app-availability.html.markerb
@@ -135,7 +135,7 @@ Default settings for new apps created using the `fly launch` command: automatica

Default settings for some existing apps (or any apps that don't have these settings in `fly.toml`): automatically start but don't automatically stop Fly Machines.

Get all the details about [automatically stopping and starting Machines](/docs/launch/autostart-stop/).
Learn more about [how Fly Proxy autostop/autostart works](/docs/reference/fly-proxy-autostop-autostart/) and [how to configure it](/docs/launch/autostop-autostart/).

## Health check-based routing

8 changes: 4 additions & 4 deletions apps/concurrency.html.markerb
@@ -5,7 +5,7 @@ nav: apps
redirect_from: /docs/reference/concurrency/
---

Concurrency settings are used by the Fly Proxy for important things like [load balancing](/docs/reference/load-balancing/#load-balancing-strategy) and [autostart/autostop](/docs/launch/autostart-stop/) for Machines.
Concurrency settings are used by Fly Proxy for important things like [load balancing](/docs/reference/load-balancing/#load-balancing-strategy) and [autostop/autostart](/docs/launch/autostop-autostart/) for Machines.

The following concurrency settings apply per Machine and per service in your app:

@@ -23,9 +23,9 @@ The default concurrency settings when not specified are a `soft_limit` of 20 and

`connections` is the default concurrency type.

## How the Fly Proxy handles concurrency
## How Fly Proxy handles concurrency

The Fly Proxy doesn't consider the exact number of concurrent connections or requests. From the proxy's point of view a Machine is handling between 0 and the soft limit, handling between the soft limit and the hard limit, or is at the hard limit. Along with region, this is how the proxy decides which Machine gets traffic.
Fly Proxy doesn't consider the exact number of concurrent connections or requests. From the proxy's point of view a Machine is handling between 0 and the soft limit, handling between the soft limit and the hard limit, or is at the hard limit. Along with region, this is how the proxy decides which Machine gets traffic.

The proxy's view of a Machine's load is affected when the gap between the soft limit and the hard limit is small and/or the app's concurrent connections or requests oscillate between them more frequently. In that case, the proxy routes according to the settings, but the thresholds change too quickly and requests or connections end up being re-routed.
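
The three-way view of load described above can be sketched as a small classifier. This is only an illustration, not Fly Proxy's actual implementation: the function name `load_bucket`, the bucket labels, and the exact boundary handling are all assumptions made for the example.

```python
from typing import Optional


def load_bucket(current: int, soft_limit: int, hard_limit: Optional[int] = None) -> str:
    """Classify a Machine's load into the coarse buckets described above.

    Hypothetical helper: the name, labels, and boundary handling are
    assumptions for illustration, not Fly Proxy's real logic.
    """
    if current < 0:
        raise ValueError("current must be non-negative")
    # With no hard limit set, a Machine can never be "at the hard limit".
    if hard_limit is not None and current >= hard_limit:
        return "at_hard_limit"
    if current >= soft_limit:
        return "between_soft_and_hard"
    return "below_soft"


if __name__ == "__main__":
    # A narrow gap between the limits means small changes in load flip buckets.
    for load in (19, 20, 24, 25):
        print(load, load_bucket(load, soft_limit=20, hard_limit=25))
```

With a `soft_limit` of 20 and a hypothetical `hard_limit` of 25, a load that moves between 19 and 25 crosses all three buckets, which is the rapid threshold change the paragraph above warns about.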

@@ -45,4 +45,4 @@ The decision to use `connections` or `requests` for concurrency depends on the t

## Concurrency limit tuning tips

When tuning concurrency, try setting a relatively high `hard_limit`, or leave it unset to have no hard limit. If you do want to set a `hard_limit` to have more control over load balancing, then you might have to do an initial benchmark to estimate the maximum number of concurrent connections or requests that your app can handle. Then tune the `soft_limit` and [create more Machines](/docs/blueprints/resilient-apps-multiple-machines/) to optimize autostart/autostop and load balancing. Once your app is getting real-world traffic, you can continue to monitor your app and adjust the `soft_limit` further to suit your workload.
When tuning concurrency, try setting a relatively high `hard_limit`, or leave it unset to have no hard limit. If you do want to set a `hard_limit` to have more control over load balancing, then you might have to do an initial benchmark to estimate the maximum number of concurrent connections or requests that your app can handle. Then tune the `soft_limit` and [create more Machines](/docs/blueprints/resilient-apps-multiple-machines/) to optimize autostop/autostart and load balancing. Once your app is getting real-world traffic, you can continue to monitor your app and adjust the `soft_limit` further to suit your workload.
8 changes: 4 additions & 4 deletions apps/fine-tune-apps.html.markerb
@@ -1,5 +1,5 @@
---
title: Tips to fine-tune and (not) benchmark your app on Fly.io
title: Tips to fine-tune your app on Fly.io
layout: docs
nav: apps
redirect_from: /docs/reference/fine-tune-apps/
@@ -11,7 +11,7 @@ That settled, you can still use the following guidelines to gather data to fine-

## Fly.io metrics

We have various ways to access metrics for the Fly Proxy, your app, and Fly Machines. Learn more about [Fly Metrics](/docs/reference/metrics/).
We have various ways to access metrics for Fly Proxy, your app, and Fly Machines. Learn more about [Fly Metrics](/docs/reference/metrics/).

- Edge HTTP response time is the total time it takes to respond with the first HTTP response headers.
- App HTTP response time is the same, but only considers your app’s response time, not the overhead of routing, load balancing, retries, and so on.
@@ -76,9 +76,9 @@ Refer to our guidelines for [concurrency settings](/docs/reference/concurrency/)

### Auto stop and start feature

[Auto stop and start](/docs/launch/autostart-stop/) can be turned off if you want to test the performance of your app without that variable.
[Auto stop and start](/docs/launch/autostop-autostart/) can be turned off if you want to test the performance of your app without that variable.

If you don’t turn off auto stop and start, then you’ll likely have some pretty high p99 values due to the initial cost of the Fly Proxy queueing connections while waiting for the Machine to start.
If you don’t turn off auto stop and start, then you’ll likely have some pretty high p99 values due to the initial cost of Fly Proxy queueing connections while waiting for the Machine to start.

## Distributed systems

2 changes: 1 addition & 1 deletion apps/going-to-production.html.markerb
@@ -46,7 +46,7 @@ This document lists important topics to consider when you set up a production en

- **Use multiple Machines for resiliency:** Make your app resilient to single-host failures with multiple Machines that stay stopped until you need them. See [Blueprint: Resilient apps use multiple Machines](/docs/blueprints/resilient-apps-multiple-machines/).

- **Set up autoscaling by load or metric:** Use Fly Proxy autostart/autostop or the metrics-based autoscaler app. See [Autoscaling](/docs/reference/autoscaling/).
- **Set up autoscaling by load or metric:** Use Fly Proxy autostop/autostart or the metrics-based autoscaler app. See [Autoscaling](/docs/reference/autoscaling/).

## CI/CD

8 changes: 4 additions & 4 deletions blueprints/autoscale-machines.html.md
@@ -12,7 +12,7 @@ Machines during the period of low traffic.

This blueprint will guide you through the process of configuring the
[`fly-autoscaler` app](/docs/launch/autoscale-by-metric/) in conjunction with
the [Fly Proxy autostart/autostop](/docs/launch/autostart-stop/) feature to
[Fly Proxy autostop/autostart](/docs/launch/autostop-autostart/) to
always keep a fixed number of stopped Machines ready to be quickly started
by Fly Proxy.

@@ -103,8 +103,8 @@ $ fly deploy --ha=false

## Read more

- [Autoscale based on metrics](https://fly.io/docs/launch/autoscale-by-metric/)
- [Autoscale based on metrics](/docs/launch/autoscale-by-metric/)

- [Automatically stop and start Machines](https://fly.io/docs/launch/autostart-stop/)
- [Autostop/autostart Machines](/docs/launch/autostop-autostart/)

- [Autostart and autostop private apps](https://fly.io/docs/blueprints/autostart-internal-apps/)
- [Autostop/autostart private apps](/docs/blueprints/autostart-internal-apps/)
4 changes: 2 additions & 2 deletions blueprints/autostart-internal-apps.html.md
@@ -6,7 +6,7 @@ nav: firecracker

You have a private, or internal, app that communicates only with other apps on your [private network](/docs/networking/private-networking/). This private app might be a database, authentication server, or any other "backend" app that you don't want exposed to the public Internet. You want the app's Machines to stop when they're not serving requests from your other apps, and start again automatically when needed.

To use the Fly Proxy autostart and autostop feature you need to configure services in `fly.toml`, like you would for a public app. But instead of using a public Anycast address, you assign a Flycast address to expose those services only on your private network.
To use Fly Proxy autostop/autostart you need to configure services in `fly.toml`, like you would for a public app. But instead of using a public Anycast address, you assign a Flycast address to expose those services only on your private network.

This blueprint focuses on using autostart and autostop to control Machines based on incoming requests. But when you use Flycast for private apps you also get other Fly Proxy features like geographically aware load balancing.

@@ -97,7 +97,7 @@ Here's an example `fly.toml` snippet:
**Important:** Set `force_https = false` since Flycast only works over HTTP. HTTPS isn't necessary because all your private network traffic goes through encrypted WireGuard tunnels.
</div>

Learn more about [Fly Launch configuration](/docs/reference/configuration/) and the [autostart and autostop](/docs/apps/autostart-stop/) feature.
Learn more about [Fly Launch configuration](/docs/reference/configuration/) and [Fly Proxy autostop/autostart](/docs/launch/autostop-autostart/).

### Make sure your app binds to `0.0.0.0:<port>`

4 changes: 2 additions & 2 deletions blueprints/multi-region-fly-replay.html.md
@@ -20,7 +20,7 @@ When your app receives write requests, you can use the `fly-replay` response hea

## How it works

The [`fly-replay` response header](/docs/networking/dynamic-request-routing/) instructs the Fly proxy to redeliver (replay) the original request to another region or Machine in your app, or even another app in your organization. In this case, you’ll be replaying write requests to the Machine in the primary region. Using `fly-replay` to replay write requests is a general pattern that can be applied in most languages and frameworks for databases with one primary and multiple read replicas.
The [`fly-replay` response header](/docs/networking/dynamic-request-routing/) instructs Fly proxy to redeliver (replay) the original request to another region or Machine in your app, or even another app in your organization. In this case, you’ll be replaying write requests to the Machine in the primary region. Using `fly-replay` to replay write requests is a general pattern that can be applied in most languages and frameworks for databases with one primary and multiple read replicas.

In the following diagram, the app is running one Machine in each of three regions. The primary region is Chicago, and this is where the read/write primary database resides. There are Machines in two other regions, Rio de Janeiro and Amsterdam, each of which has a read replica. This example uses three regions for simplicity, but you could deploy in more than three regions and have more than one Machine per region connecting to the same read replica.

@@ -51,7 +51,7 @@ Your app can check the `FLY_REGION` against the `PRIMARY_REGION`, and modify the

### Replay write requests to the primary region

Your app can detect write requests and send a response with the `fly-replay` header that tells the Fly Proxy to replay the whole request to the Fly Machine in the primary region.
Your app can detect write requests and send a response with the `fly-replay` header that tells Fly Proxy to replay the whole request to the Fly Machine in the primary region.
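
A minimal, framework-agnostic sketch of this replay decision might look like the following. It assumes that writes can be recognized by HTTP method alone and that the `region=<code>` form of the `fly-replay` value fits your setup. The helper name `replay_headers` and the `WRITE_METHODS` set are hypothetical, introduced only for this example.

```python
import os
from typing import Dict, Mapping, Optional

# Assumption: these HTTP methods are treated as writes. Your app may need a
# different rule (for example, inspecting the route or the SQL statement).
WRITE_METHODS = frozenset({"POST", "PUT", "PATCH", "DELETE"})


def replay_headers(method: str, env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the response headers needed to replay a write in the primary region.

    Hypothetical helper: returns an empty dict when the request can be
    handled locally (a read, or a write already in the primary region).
    """
    env = os.environ if env is None else env
    current = env.get("FLY_REGION")
    primary = env.get("PRIMARY_REGION")
    if not current or not primary:
        # Not enough information to decide (for example, local development).
        return {}
    if method.upper() in WRITE_METHODS and current != primary:
        return {"fly-replay": f"region={primary}"}
    return {}


if __name__ == "__main__":
    example_env = {"FLY_REGION": "ams", "PRIMARY_REGION": "ord"}
    print(replay_headers("POST", example_env))
    print(replay_headers("GET", example_env))
```

In a real app you would attach the returned header to an otherwise empty response from your web framework's middleware or request handler, so that the proxy replays the original request rather than returning your response to the client.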

#### Detect write requests

16 changes: 8 additions & 8 deletions blueprints/resilient-apps-multiple-machines.html.md
@@ -6,27 +6,27 @@ nav: firecracker

Fly Machines are fast-launching VMs; they're the compute of the Fly.io platform. Every Machine runs on a single physical host. If that host fails, the Machine becomes unavailable; it does not automatically get rescheduled on another host.

To make your app resilient to single-host failure, create at least two Machines per app or process. The Fly Proxy autostart/autostop feature and standby Machines are built-in platform features that you can use to start extra Machines only when needed.
To make your app resilient to single-host failure, create at least two Machines per app or process. Fly Proxy autostop/autostart and standby Machines are built-in platform features that you can use to start extra Machines only when needed.

## Multiple Machines for apps with services

You can add more Machines for Fly Proxy to start and stop as needed, which is great for apps that have built-in replication or that don't share data.

### You get two Machines on first deploy

When you deploy an app for the first time with the `fly launch` or `fly deploy` command, you automatically get two identical Machines for processes that have HTTP/TCP services configured in `fly.toml`. The Machines have autostart/autostop enabled so that Fly Proxy can start and stop them based on traffic to your app. You'll also get this default redundant Machine when you `fly deploy` after scaling to zero.
When you deploy an app for the first time with the `fly launch` or `fly deploy` command, you automatically get two identical Machines for processes that have HTTP/TCP services configured in `fly.toml`. The Machines have autostop/autostart enabled so that Fly Proxy can start and stop them based on traffic to your app. You'll also get this default redundant Machine when you `fly deploy` after scaling to zero.

<div class="important icon">
**Volumes:** You'll only get one Machine with `fly launch` for processes or apps with volumes mounted. Volumes don't automatically replicate your data for you, so you'll need to set that up before intentionally creating more Machines with volumes.
</div>

### Add more Machines yourself

If your app doesn't already have multiple Machines with autostart/autostop, then you can set it up yourself. You can create any number of Machines to both meet user demand and provide redundancy against host failures.
If your app doesn't already have multiple Machines with autostop/autostart configured, then you can set it up yourself. You can create any number of Machines to both meet user demand and provide redundancy against host failures.

#### 1. Set up autostart/autostop
#### 1. Set up autostop/autostart

Use autostart/autostop to tell the Fly Proxy to start and stop Machines based on traffic. Keep one or more Machines running in your primary region if you want to. Example from `fly.toml` config:
Use [Fly Proxy autostop/autostart](/docs/launch/autostop-autostart/#apps-that-shut-down-when-idle) to automatically stop and start Machines based on traffic. Keep one or more Machines running in your primary region if you want to. Example from `fly.toml` config:

```toml
[http_service]
@@ -40,9 +40,9 @@ Use autostart/autostop to tell the Fly Proxy to start and stop Machines based on
soft_limit = 200 # Used by Fly Proxy to determine Machine excess capacity
```

Fly Proxy uses the concurrency `soft_limit` to determine if Machines have capacity. Learn more about [autostart/autostop](/docs/launch/autostart-stop/).
Fly Proxy uses the concurrency `soft_limit` to determine if Machines have capacity. Learn more about [how Fly Proxy autostop/autostart works](/docs/reference/fly-proxy-autostop-autostart/).

**Using the Machines API:** To add or change the autostart/autostop settings with the Machines API, use the settings in the `services` object of the [Machine config](/docs/machines/api/machines-resource/#machine-config-object-properties) in your create or update calls.
**Using the Machines API:** To add or change the autostop/autostart settings with the Machines API, use the settings in the `services` object of the [Machine config](/docs/machines/api/machines-resource/#machine-config-object-properties) in your create or update calls.
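
As a rough sketch, the `services` portion of a Machine config could be assembled like this before sending it in a create or update call. The field names used here (`autostop`, `autostart`, `min_machines_running`, `concurrency`) and the example values are assumptions for illustration; verify them against the Machine config reference linked above. The helper `service_with_autostop` is hypothetical.

```python
import json
from typing import Any, Dict


def service_with_autostop(internal_port: int, soft_limit: int, min_running: int = 0) -> Dict[str, Any]:
    """Build one entry for the `services` list of a Machine config.

    Hypothetical helper: field names and values are assumptions; check them
    against the Machine config reference before using them in real calls.
    """
    return {
        "protocol": "tcp",
        "internal_port": internal_port,
        "autostop": True,    # let Fly Proxy stop the Machine when it has excess capacity
        "autostart": True,   # let Fly Proxy start the Machine when traffic needs it
        "min_machines_running": min_running,
        "concurrency": {"type": "requests", "soft_limit": soft_limit},
    }


# Port 8080 is an illustrative value; 200 mirrors the `soft_limit` in the
# `fly.toml` example above.
config_fragment = {"services": [service_with_autostop(8080, 200)]}

if __name__ == "__main__":
    print(json.dumps(config_fragment, indent=2))
```

The resulting dictionary is only the `services` part of the config; a real create or update request also needs the rest of the Machine config, such as the image.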

#### 2. Create more Machines

@@ -66,7 +66,7 @@ Learn more about [scaling the number of Machines](/docs/apps/scale-count/).

When apps or processes are running tools like cron that don't require local storage or accept external requests, it's common to run only one Machine. Since these Machines don't have services configured, they can't be automatically started and stopped by the Fly Proxy. To add redundancy against host failures for this kind of Machine, use a standby Machine; it stays stopped and ready to take over in case the original Machine becomes unavailable.

Unlike the autostart/autostop feature, which starts Machines based on app traffic, a standby Machine watches the Machine it's paired to and starts only if that Machine becomes unavailable. Learn more about [standby Machines](https://fly.io/docs/reference/app-availability/#standby-machines-for-process-groups-without-services).
Unlike Fly Proxy autostop/autostart, which starts Machines based on app traffic, a standby Machine watches the Machine it's paired to and starts only if that Machine becomes unavailable. Learn more about [standby Machines](https://fly.io/docs/reference/app-availability/#standby-machines-for-process-groups-without-services).

### You get a standby Machine on first deploy

2 changes: 1 addition & 1 deletion gpus/getting-started-gpus.html.md
@@ -86,7 +86,7 @@ Machine learning tends to involve large quantities of data. We're working with a
- The root file system of a Fly Machine is ephemeral -- it's reset from its Docker image on every restart. It's also limited to 50GB on GPU-enabled Machines.
- Fly Volumes are limited to 500GB, and are attached to a physical server. The Machine must run on the same hardware as the volume it mounts.

Unless you've got a constant workload, you'll likely want to shut down GPU Machines when they're not needed&mdash;you can do this manually with `fly machine stop`, [have the main process exit](/docs/launch/autostart-stop/#stop-a-machine-by-terminating-its-main-process) when idle, or use the Fly Proxy [autostop and autostart](/docs/launch/autostart-stop/#how-it-works) features&mdash;to save money. Saving data on a persistent Fly Volume means you don't have to download large amounts of data, or reconstitute a large Docker image into a rootfs, whenever the Machine restarts. You'll probably want to store models, at least, on a volume.
Unless you've got a constant workload, you'll likely want to shut down GPU Machines when they're not needed&mdash;you can do this manually with `fly machine stop`, [have the main process exit](/docs/launch/autostop-autostart/#apps-that-shut-down-when-idle) when idle, or use the Fly Proxy [autostop and autostart](/docs/launch/autostop-autostart/) features&mdash;to save money. Saving data on a persistent Fly Volume means you don't have to download large amounts of data, or reconstitute a large Docker image into a rootfs, whenever the Machine restarts. You'll probably want to store models, at least, on a volume.

## Using swap
