Commit dd4bd25

Sync open source content 🐝 (from 10787ea1995f682e6036a30eee53ceeeb0ce3217)
1 parent ae49a22 commit dd4bd25

14 files changed

+332
-302
lines changed

api-design/caching.mdx

Lines changed: 39 additions & 36 deletions
Original file line number | Diff line number | Diff line change
@@ -7,15 +7,15 @@ import { Callout } from "@/mdx/components";
77

88
# Caching API Responses
99

10-
API caching can save servers some serious work, cut down on costs, and even help reduce the carbon impact of an API. However, it is often considered an optimization rather than what it truly is: an integral part of API design.
10+
API caching can save servers some serious work, cut down on costs, and even help reduce the carbon impact of an API. However, it is often considered an optimization rather than what it truly is: an integral part of API design.
1111

1212
A fundamental part of REST is APIs declaring the "cacheability" of resources. When working with HTTP there are many caching options available through HTTP caching, a series of standards that power how the entire internet functions. These can be used to design APIs that are more useful, as well as faster, cheaper, and more sustainable.
1313

1414
## What is HTTP caching?
1515

1616
HTTP caching tells API clients (like browsers, mobile apps, or other backend systems) if they need to ask for the same data over and over again, or if they can use data they already have. This is done with HTTP headers on responses that tell the client how long they can "hold onto" that response, or how to check if it's still valid.
1717

18-
This works very differently from server-side caching tools like Redis or Memcached, which cache data on the server.
18+
This works very differently from server-side caching tools like Redis or Memcached, which cache data on the server.
1919

2020
HTTP caching happens on the client side or on intermediary proxies like Content Delivery Networks (CDNs), which sit between the client and the server and store responses for reuse whenever possible.
2121

@@ -88,32 +88,32 @@ All of this is done without the client needing to know anything about the data,
8888
Let's add these headers to a basic Express.js API to see how it might look on the server-side.
8989

9090
```js
91-
const express = require('express');
91+
const express = require("express");
9292
const app = express();
9393

94-
app.get('/api/resource', (req, res) => {
95-
const data = { message: "Hello, world!" }; // Simulated data
96-
const eTag = `"${Buffer.from(JSON.stringify(data)).toString('base64')}"`;
97-
98-
if (req.headers['if-none-match'] === eTag) {
99-
// Client has the latest version
100-
res.status(304).end();
101-
} else {
102-
// Serve the resource with cache headers
103-
res.set({
104-
'Cache-Control': 'max-age=3600', // Cache for 1 hour
105-
'ETag': eTag
106-
});
107-
res.json(data);
108-
}
94+
app.get("/api/resource", (req, res) => {
95+
const data = { message: "Hello, world!" }; // Simulated data
96+
const eTag = `"${Buffer.from(JSON.stringify(data)).toString("base64")}"`;
97+
98+
if (req.headers["if-none-match"] === eTag) {
99+
// Client has the latest version
100+
res.status(304).end();
101+
} else {
102+
// Serve the resource with cache headers
103+
res.set({
104+
"Cache-Control": "max-age=3600", // Cache for 1 hour
105+
ETag: eTag,
106+
});
107+
res.json(data);
108+
}
109109
});
110110

111-
app.listen(3000, () => console.log('API running on http://localhost:3000'));
111+
app.listen(3000, () => console.log("API running on http://localhost:3000"));
112112
```
113113

114114
The ETag is generated from the data (base64-encoded here as a simple stand-in for a hash), then the server checks if the client has the latest version. If it does, it sends a `304 Not Modified` response, otherwise it sends the data with the `ETag` and `Cache-Control` headers.
115115

116-
In a real codebase, the handler would be doing something like fetching from a datasource, or computing something that takes a while, so waiting for all of that to happen just to make an ETag is not ideal. Yes, it avoids turning that data into JSON and sending it over the wire, but if the API is going to discard it and send a `304 Not Modified` response with no body, the data was loaded and hashed for no reason.
116+
In a real codebase, the handler would be doing something like fetching from a datasource, or computing something that takes a while, so waiting for all of that to happen just to make an ETag is not ideal. Yes, it avoids turning that data into JSON and sending it over the wire, but if the API is going to discard it and send a `304 Not Modified` response with no body, the data was loaded and hashed for no reason.
117117

118118
Instead, an ETag can be made from metadata, like the last updated timestamp of a database record.
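
As a minimal sketch of that idea, the snippet below derives an ETag from a record's ID and last-updated timestamp instead of from the full payload. The record shape (`id`, `updatedAt`) and the helper names `etagFromRecord` and `conditionalStatus` are assumptions for illustration, not part of the original example, and the comparison ignores weak validators and comma-separated `If-None-Match` lists.

```js
// Hypothetical record shape: { id, updatedAt }, e.g. from a cheap metadata query
// that does not load or serialize the full resource.
function etagFromRecord(record) {
  const updatedAtMs = new Date(record.updatedAt).getTime();
  // ETag values are quoted strings.
  return `"${record.id}-${updatedAtMs}"`;
}

// Pick the status code by comparing the client's If-None-Match value
// with the ETag derived from metadata alone.
function conditionalStatus(ifNoneMatch, record) {
  return ifNoneMatch === etagFromRecord(record) ? 304 : 200;
}

const record = { id: "645E79D9E14", updatedAt: "2024-08-10T12:00:00Z" };
const currentEtag = etagFromRecord(record);

console.log(conditionalStatus(currentEtag, record)); // 304
console.log(conditionalStatus('"stale-value"', record)); // 200
```

In an Express handler, the `304` branch could return before the expensive load happens, which is the saving the paragraph above describes.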
119119

@@ -156,7 +156,10 @@ Using `Cache-Control` headers its possible to specify whether the response can b
156156
- `no-store` — The response can't be cached at all.
157157

158158
<Callout title="Note" type="info">
159-
When a request contains an `Authorization` header, shared caches treat the response as `private` unless it says otherwise, so sensitive data stays out of shared caches. This is another reason to use standard auth headers instead of custom headers like `X-API-Key`.
159+
When a request contains an `Authorization` header, shared caches treat the
160+
response as `private` unless it says otherwise, so sensitive data stays out of
161+
shared caches. This is another reason to use standard auth headers instead of
162+
custom headers like `X-API-Key`.
160163
</Callout>
161164

162165
## Which resources should be cached?
@@ -188,17 +191,17 @@ GET /invoices/645E79D9E14
188191
"id": "645E79D9E14",
189192
"invoiceNumber": "INV-2024-001",
190193
"customer": "Acme Corporation",
191-
"amountDue": 500.00,
192-
"amountPaid": 250.00,
194+
"amountDue": 500.0,
195+
"amountPaid": 250.0,
193196
"dateDue": "2024-08-15",
194197
"dateIssued": "2024-08-01",
195198
"datePaid": "2024-08-10",
196199
"items": [
197200
{
198201
"description": "Consulting Services",
199202
"quantity": 10,
200-
"unitPrice": 50.00,
201-
"total": 500.00
203+
"unitPrice": 50.0,
204+
"total": 500.0
202205
}
203206
],
204207
"customer": {
@@ -213,15 +216,15 @@ GET /invoices/645E79D9E14
213216
"payments": [
214217
{
215218
"date": "2024-08-10",
216-
"amount": 250.00,
219+
"amount": 250.0,
217220
"method": "Credit Card",
218221
"reference": "CC-1234"
219222
}
220223
]
221224
}
222225
```
223226

224-
This is a very common pattern, but it's not very cacheable. If the invoice is updated, the whole invoice is updated, and the whole invoice needs to be refreshed. If the customer is updated, the whole invoice is updated, and the whole invoice needs to be refreshed. If the payments are updated, the whole invoice is updated, and the whole invoice needs to be refreshed.
227+
This is a very common pattern, but it's not very cacheable. If the invoice is updated, the whole invoice is updated, and the whole invoice needs to be refreshed. If the customer is updated, the whole invoice is updated, and the whole invoice needs to be refreshed. If the payments are updated, the whole invoice is updated, and the whole invoice needs to be refreshed.
225228

226229
We can increase the cachability of most of this information by breaking it down into smaller resources:
227230

@@ -234,15 +237,15 @@ GET /invoices/645E79D9E14
234237
"id": "645E79D9E14",
235238
"invoiceNumber": "INV-2024-001",
236239
"customer": "Acme Corporation",
237-
"amountDue": 500.00,
240+
"amountDue": 500.0,
238241
"dateDue": "2024-08-15",
239242
"dateIssued": "2024-08-01",
240243
"items": [
241244
{
242245
"description": "Consulting Services",
243246
"quantity": 10,
244-
"unitPrice": 50.00,
245-
"total": 500.00
247+
"unitPrice": 50.0,
248+
"total": 500.0
246249
}
247250
],
248251
"links": {
@@ -257,15 +260,15 @@ Instead of mixing in payment information with the invoice, this example moves th
257260

258261
The customer data is also moved out of the invoice resource, because the `/customers/acme-corporation` resource already exists and reusing it avoids code duplication and maintenance burden. Considering the user flow of the application, the resource is likely already in the browser/client cache, which reduces load times for the invoice.
259262

260-
This API structure works regardless of what the data structure looks like. Perhaps all of the payment data is in an `invoices` SQL table, but the API can still have `/invoices` and `/invoices/{id}/payments` endpoints. Over time, as common extra functionality like partial payments is requested, these endpoints can remain the same, but the underlying database structure can be migrated to move payment-specific fields over to a `payments` database table.
263+
This API structure works regardless of what the data structure looks like. Perhaps all of the payment data is in an `invoices` SQL table, but the API can still have `/invoices` and `/invoices/{id}/payments` endpoints. Over time, as common extra functionality like partial payments is requested, these endpoints can remain the same, but the underlying database structure can be migrated to move payment-specific fields over to a `payments` database table.
261264

262-
Many would argue this is a better separation of concerns, it's easier to control permissions for who is allowed to see invoices and/or payments, and the API has drastically improved cachability by splitting out frequently changing information from rarely changing information.
265+
Many would argue this is a better separation of concerns, it's easier to control permissions for who is allowed to see invoices and/or payments, and the API has drastically improved cachability by splitting out frequently changing information from rarely changing information.
263266

264267
### Avoid mixing public and private data
265268

266-
Breaking things down into smaller, more manageable resources can separate frequently changing information from more stable data, but there are other design issues that can affect cachability: mixing public and private data.
269+
Breaking things down into smaller, more manageable resources can separate frequently changing information from more stable data, but there are other design issues that can affect cachability: mixing public and private data.
267270

268-
Take the example of a train travel booking API. There could be a Booking resource, specific to a single user with private data nobody else should see.
271+
Take the example of a train travel booking API. There could be a Booking resource, specific to a single user with private data nobody else should see.
269272

270273
```http
271274
GET /bookings/1234
@@ -310,15 +313,15 @@ There is no downside to caching this data, because it is the same for everyone.
310313

311314
## Content Delivery Networks (CDNs)
312315

313-
HTTP caching works well when clients use it, and many do automatically, like web browsers or systems with caching middleware. But it becomes even more powerful when combined with tools like [Fastly](https://www.fastly.com/) or [Varnish](https://www.varnish-software.com/products/varnish-cache/).
316+
HTTP caching works well when clients use it, and many do automatically, like web browsers or systems with caching middleware. But it becomes even more powerful when combined with tools like [Fastly](https://www.fastly.com/) or [Varnish](https://www.varnish-software.com/products/varnish-cache/).
314317

315318
These tools sit between the server and the client, acting like intelligent gatekeepers:
316319

317320
![A sequence diagram showing a Client, Cache Proxy, and Server. A web request travels from client to proxy, then is sent on to the server, showing a "cache miss". The response then travels back from the server to the cache proxy, and then is sent to the client](./assets/httpcachemiss.png)
318321

319322
![A sequence diagram showing a Client, Cache Proxy, and Server. A web request travels from client to proxy, but does not go to the server, showing a "cache hit". The response is served from the cache proxy to the client without involving the server](./assets/httpcachehit.png)
320323

321-
Client-caching like this is certainly useful, but the real power of caching comes when API web traffic is routed through a caching proxy. Using hosted solutions like Fastly or AWS CloudFront, this could be a case of changing DNS settings. For self-hosted options like Varnish, instead of pointing DNS settings to a hosted solution somebody will need to spin up a server to act as the cache proxy.
324+
Client-caching like this is certainly useful, but the real power of caching comes when API web traffic is routed through a caching proxy. Using hosted solutions like Fastly or AWS CloudFront, this could be a case of changing DNS settings. For self-hosted options like Varnish, instead of pointing DNS settings to a hosted solution somebody will need to spin up a server to act as the cache proxy.
322325

323326
Many API gateway tools like Tyk and Zuplo have caching built in, so this functionality may already be available in the ecosystem and just need enabling.
324327

api-design/pagination.mdx

Lines changed: 17 additions & 15 deletions
Original file line number | Diff line number | Diff line change
@@ -3,7 +3,7 @@ title: "Pagination Best Practices in REST API Design"
33
description: "Implement efficient pagination in your API to handle large datasets, improve performance, and provide a better experience for API consumers."
44
---
55

6-
import { Callout } from '@/mdx/components';
6+
import { Callout } from "@/mdx/components";
77

88
# Paginating API responses
99

@@ -17,7 +17,7 @@ stuck into doing things the right way early on.
1717
At first it's easy to imagine that collections only have a few hundred records.
1818
That might not be too taxing for the server to fetch from the database, turn into
1919
JSON, and send back to the client, but as soon as the collection is getting into
20-
thousands of records things start to fall apart in wild and unexpected ways.
20+
thousands of records things start to fall apart in wild and unexpected ways.
2121

2222
For example, a coworking company that expected to mostly host startups of 10-50
2323
people, but then Facebook and Amazon rock up with ~10,000 employees each, and
@@ -83,7 +83,7 @@ The best way to help the client is to give them links, which at first seems
8383
confusing but it's just
8484
[HATEOAS](https://apisyouwonthate.com/blog/rest-and-richardson-maturity-model/)
8585
(Hypermedia As The Engine Of Application State), also known as Hypermedia
86-
Controls.
86+
Controls.
8787

8888
It's a fancy way of saying "give them links for things they can do
8989
next" and in the context of pagination that means "give them links to the next
@@ -110,14 +110,14 @@ page, the previous page, the first page, and the last page."
110110
```
111111

112112
Whenever there is a `next` link, an API consumer can show a `next` button, or
113-
start loading the next page of data to allow for auto-scrolling.
113+
start loading the next page of data to allow for auto-scrolling.
114114

115115
If the `next` response returns data, it will give a 200 OK response and they can
116-
show the data.
116+
show the data.
117117

118118
If there is no data then it will still be a 200 OK but there will be an empty
119119
array, showing that everything was fine, but there is no data on that page right
120-
now.
120+
now.
121121

122122
**Ease of Use**
123123

@@ -134,10 +134,10 @@ now.
134134
**Consistency**
135135

136136
- Con: When a consumer loads the latest 10 records, then a new record is added
137-
to the database, then a user loads the second page, they'll see one of those
138-
records twice. This is because there is no such concept as a "page" in the
139-
database, just saying "grab me 10, now the next 10" does not differentiate which
140-
records they actually were.
137+
to the database, then a user loads the second page, they'll see one of those
138+
records twice. This is because there is no such concept as a "page" in the
139+
database, just saying "grab me 10, now the next 10" does not differentiate which
140+
records they actually were.
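
The duplicate-record problem described above can be reproduced with a small in-memory sketch. The `page` helper, the single-letter records, and the page size of 2 are all hypothetical, chosen only to keep the example short.

```js
// Hypothetical helper: return page `pageNumber` (1-based) of `size` records.
const page = (records, pageNumber, size) =>
  records.slice((pageNumber - 1) * size, pageNumber * size);

// Collection sorted newest first.
let records = ["e", "d", "c", "b", "a"];

const firstPage = page(records, 1, 2); // ["e", "d"]

// A new record is added between the two requests.
records = ["f", ...records];

const secondPage = page(records, 2, 2); // ["d", "c"], so "d" is seen twice

console.log(firstPage, secondPage);
```

Every record shifted down by one position when `"f"` arrived, so the boundary between page 1 and page 2 moved underneath the consumer.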
141141

142142
### Offset-Based Pagination
143143

@@ -208,7 +208,7 @@ Or with hypermedia controls in the JSON:
208208
**Consistency**
209209

210210
- Con: The same problems exist for offset pagination as page pagination, if
211-
more data has been added between the first request and second being made, the same record could show up in both pages.
211+
more data has been added between the first request and second being made, the same record could show up in both pages.
212212

213213
**See this in action**
214214

@@ -230,7 +230,7 @@ page, this could be a UUID, but it can be more dynamic than that.
230230

231231
APIs like Slack will base64 encode information with a field name and a value,
232232
even adding sorting logic, all wrapped up in an opaque string. For example,
233-
`dXNlcjpXMDdRQ1JQQTQ=` would represent `user:W07QCRPA4`.
233+
`dXNlcjpXMDdRQ1JQQTQ=` would represent `user:W07QCRPA4`.
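
A minimal server-side sketch of that encoding, assuming the cursor is just `field:value` run through base64 as in the example string. The helper names `encodeCursor` and `decodeCursor` are hypothetical, and a real API may pack more into the cursor (such as sort order) or sign it.

```js
// Hypothetical helpers: the API encodes and decodes the cursor,
// while API consumers treat it as an opaque string.
const encodeCursor = (field, value) =>
  Buffer.from(`${field}:${value}`, "utf8").toString("base64");

const decodeCursor = (cursor) => {
  const decoded = Buffer.from(cursor, "base64").toString("utf8");
  const separator = decoded.indexOf(":");
  return {
    field: decoded.slice(0, separator),
    value: decoded.slice(separator + 1),
  };
};

const cursor = encodeCursor("user", "W07QCRPA4");
console.log(cursor); // dXNlcjpXMDdRQ1JQQTQ=
console.log(decodeCursor(cursor)); // { field: 'user', value: 'W07QCRPA4' }
```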
234234

235235
Obfuscating the information like this aims to stop API consumers hard-coding
236236
values for the pagination, which allows for the API to change pagination logic
@@ -292,10 +292,10 @@ Choosing the right pagination strategy depends on the specific use case and
292292
dataset size.
293293

294294
Offset-based pagination is simple but may suffer from performance issues with
295-
large datasets.
295+
large datasets.
296296

297297
Cursor-based pagination offers better performance and consistency for large
298-
datasets but comes with added complexity.
298+
datasets but comes with added complexity.
299299

300300
Page-based pagination is user-friendly but shares similar performance concerns
301301
with offset-based pagination.
@@ -336,5 +336,7 @@ Adding or drastically changing pagination later could be a whole mess of
336336
backwards compatibility breaks.
337337

338338
<Callout title="Note" type="info">
339-
Pagination can be tricky to work with for API clients, but Speakeasy SDKs can help out. Learn about <a href="/docs/runtime/pagination">adding pagination</a> to your Speakeasy SDK.
339+
Pagination can be tricky to work with for API clients, but Speakeasy SDKs can
340+
help out. Learn about <a href="/docs/runtime/pagination">adding pagination</a>{" "}
341+
to your Speakeasy SDK.
340342
</Callout>
