api-design/caching.mdx
Lines changed: 39 additions & 36 deletions
@@ -7,15 +7,15 @@ import { Callout } from "@/mdx/components";

 # Caching API Responses

-API caching can save servers some serious work, cut down on costs, and even help reduce the carbon impact of an API. However, it is often considered an optimization rather than what it truly is: an integral part of API design.
+API caching can save servers some serious work, cut down on costs, and even help reduce the carbon impact of an API. However, it is often considered an optimization rather than what it truly is: an integral part of API design.

 A fundamental part of REST is APIs declaring the "cacheability" of resources. When working with HTTP there are many amazing caching options available through HTTP Caching; a series of standards that power how the entire internet functions. This can be used to design more useful APIs, as well as being faster, cheaper, and more sustainable.

 ## What is HTTP caching?

 HTTP caching tells API clients (like browsers, mobile apps, or other backend systems) if they need to ask for the same data over and over again, or if they can use data they already have. This is done with HTTP headers on responses that tell the client how long they can "hold onto" that response, or how to check if it's still valid.

-This works very differently from server-side caching tools like Redis or Memcached, which cache data on the server.
+This works very differently from server-side caching tools like Redis or Memcached, which cache data on the server.

 HTTP caching happens on client-side or on intermediary proxies like Content Delivery Networks (CDNs), acting as a proxy between the client and the server and storing responses for reuse whenever possible.
@@ -88,32 +88,32 @@ All of this is done without the client needing to know anything about the data,
 Let's add these headers to a basic Express.js API to see how it might look on the server-side.

 ```js
-const express = require('express');
+const express = require("express");
 const app = express();

-app.get('/api/resource', (req, res) => {
-  const data = { message: "Hello, world!" }; // Simulated data
"Cache-Control":"max-age=3600", // Cache for 1 hour
+      ETag: eTag,
+    });
+    res.json(data);
+  }
 });

-app.listen(3000, () => console.log('API running on http://localhost:3000'));
+app.listen(3000, () => console.log("API running on http://localhost:3000"));
 ```

 The ETag is generated by hashing the data, then the server checks if the client has the latest version. If it does, it sends a `304 Not Modified` response, otherwise it sends the data with the `ETag` and `Cache-Control` headers.

-In a real codebase, would be doing something like fetching from a datasource, or computing something that takes a while, so waiting for all of that to happen just to make an ETag is not ideal. Yes, it avoids turning that data in JSON and sending it over the wire, but if the API is going to ignore it and send an `304 Not Modified` header with no response, the data was loaded and hashed for no reason.
+In a real codebase, would be doing something like fetching from a datasource, or computing something that takes a while, so waiting for all of that to happen just to make an ETag is not ideal. Yes, it avoids turning that data in JSON and sending it over the wire, but if the API is going to ignore it and send an `304 Not Modified` header with no response, the data was loaded and hashed for no reason.

 Instead, an ETag can be made from metadata, like the last updated timestamp of a database record.
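
As a rough illustration of the metadata-based ETag mentioned in the hunk above, the sketch below hashes a record's ID and last-updated timestamp instead of the full response body, so a `304 Not Modified` can be decided before any expensive loading or serialization. The `etagFromMetadata` and `isNotModified` helpers and the `{ id, updatedAt }` record shape are assumptions made for this example; they are not part of the guide's Express code.

```javascript
const crypto = require("node:crypto");

// Build an ETag from cheap metadata rather than the full payload.
// Assumes a hypothetical record shape: { id, updatedAt }.
function etagFromMetadata(record) {
  const hash = crypto
    .createHash("sha256")
    .update(`${record.id}:${record.updatedAt}`)
    .digest("hex")
    .slice(0, 16);
  return `"${hash}"`; // ETag values are quoted strings
}

// Compare an If-None-Match request header against the current ETag.
// Uses weak comparison (a leading W/ is ignored), as HTTP specifies for If-None-Match.
// Simplification: assumes ETag values themselves contain no commas.
function isNotModified(ifNoneMatch, etag) {
  if (!ifNoneMatch) return false;
  if (ifNoneMatch.trim() === "*") return true;
  const strip = (tag) => tag.trim().replace(/^W\//, "");
  return ifNoneMatch.split(",").map(strip).includes(strip(etag));
}

const invoice = { id: "645E79D9E14", updatedAt: "2024-08-10T12:00:00Z" };
const etag = etagFromMetadata(invoice);
console.log(isNotModified(etag, etag)); // true
```

In a request handler, a check like this would typically run after a cheap metadata lookup and before the full record is fetched, so an unchanged resource costs only that small lookup.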
@@ -156,7 +156,10 @@ Using `Cache-Control` headers its possible to specify whether the response can b
 - `no-store` — The response can't be cached at all.

 <Callout title="Note" type="info">
-When a response contains an `Authorization` header, it's automatically marked as `private` to prevent sensitive data from being cached by shared caches. This is another reason to use standard auth headers instead of using custom headers like `X-API-Key`.
+When a response contains an `Authorization` header, it's automatically marked
+as `private` to prevent sensitive data from being cached by shared caches.
+This is another reason to use standard auth headers instead of using custom
+headers like `X-API-Key`.
 </Callout>

 ## Which resources should be cached?
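
To make the `Cache-Control` directives from the hunk above concrete, here is a small sketch of a helper that assembles a header value. The `cacheControl` function and its option names are hypothetical, chosen only for this example; the directive names (`public`, `private`, `max-age`, `no-store`) are the standard HTTP ones.

```javascript
// Assemble a Cache-Control header value from a few options.
// Hypothetical helper: the option names are assumptions for this sketch.
function cacheControl({ visibility, maxAge, noStore = false } = {}) {
  // no-store means the response must not be cached at all,
  // so any other directive would be redundant.
  if (noStore) return "no-store";

  const parts = [];
  if (visibility === "public" || visibility === "private") {
    parts.push(visibility);
  }
  if (Number.isInteger(maxAge) && maxAge >= 0) {
    parts.push(`max-age=${maxAge}`);
  }
  return parts.join(", ");
}

console.log(cacheControl({ visibility: "public", maxAge: 3600 })); // public, max-age=3600
console.log(cacheControl({ visibility: "private", maxAge: 60 })); // private, max-age=60
console.log(cacheControl({ noStore: true })); // no-store
```

Keeping this logic in one place is a design choice rather than a requirement: it makes it harder to accidentally mark user-specific responses as `public`.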
@@ -188,17 +191,17 @@ GET /invoices/645E79D9E14
   "id": "645E79D9E14",
   "invoiceNumber": "INV-2024-001",
   "customer": "Acme Corporation",
-  "amountDue": 500.00,
-  "amountPaid": 250.00,
+  "amountDue": 500.0,
+  "amountPaid": 250.0,
   "dateDue": "2024-08-15",
   "dateIssued": "2024-08-01",
   "datePaid": "2024-08-10",
   "items": [
     {
       "description": "Consulting Services",
       "quantity": 10,
-      "unitPrice": 50.00,
-      "total": 500.00
+      "unitPrice": 50.0,
+      "total": 500.0
     }
   ],
   "customer": {
@@ -213,15 +216,15 @@ GET /invoices/645E79D9E14
   "payments": [
     {
       "date": "2024-08-10",
-      "amount": 250.00,
+      "amount": 250.0,
       "method": "Credit Card",
       "reference": "CC-1234"
     }
   ]
 }
 ```

-This is a very common pattern, but it's not very cacheable. If the invoice is updated, the whole invoice is updated, and the whole invoice needs to be refreshed. If the customer is updated, the whole invoice is updated, and the whole invoice needs to be refreshed. If the payments are updated, the whole invoice is updated, and the whole invoice needs to be refreshed.
+This is a very common pattern, but it's not very cacheable. If the invoice is updated, the whole invoice is updated, and the whole invoice needs to be refreshed. If the customer is updated, the whole invoice is updated, and the whole invoice needs to be refreshed. If the payments are updated, the whole invoice is updated, and the whole invoice needs to be refreshed.

 We can increase the cachability of most of this information by breaking it down into smaller resources:
@@ -234,15 +237,15 @@ GET /invoices/645E79D9E14
   "id": "645E79D9E14",
   "invoiceNumber": "INV-2024-001",
   "customer": "Acme Corporation",
-  "amountDue": 500.00,
+  "amountDue": 500.0,
   "dateDue": "2024-08-15",
   "dateIssued": "2024-08-01",
   "items": [
     {
       "description": "Consulting Services",
       "quantity": 10,
-      "unitPrice": 50.00,
-      "total": 500.00
+      "unitPrice": 50.0,
+      "total": 500.0
     }
   ],
   "links": {
@@ -257,15 +260,15 @@ Instead of mixing in payment information with the invoice, this example moves th

 The customer data is also moved out of the invoice resource, because the `/customers/acme-corporation` resource already exists and reusing it avoids code duplication and maintenance burden. Considering the user flow of the application, the resource is likely already in the browser/client cache, which reduces load times for the invoice.

-This API structure works regardless of what the data structure looks like. Perhaps all of the payment data are in an `invoices` SQL table, but still have `/invoices` and `/invoices/{id}/payments` endpoints. Over time as common extra functionality like partial payments is requested, these endpoints can remain the same, but the underlying database structure can be migrated to move payment-specific fields over to a `payments` database table.
+This API structure works regardless of what the data structure looks like. Perhaps all of the payment data are in an `invoices` SQL table, but still have `/invoices` and `/invoices/{id}/payments` endpoints. Over time as common extra functionality like partial payments is requested, these endpoints can remain the same, but the underlying database structure can be migrated to move payment-specific fields over to a `payments` database table.

-Many would argue this is a better separation of concerns, it's easier to control permissions for who is allowed to see invoices and/or payments, and the API has drastically improved cachability by splitting out frequently changing information from rarely changing information.
+Many would argue this is a better separation of concerns, it's easier to control permissions for who is allowed to see invoices and/or payments, and the API has drastically improved cachability by splitting out frequently changing information from rarely changing information.

 ### Avoid mixing public and private data

-Breaking things down into smaller, more manageable resources can separate frequently changing information from more stable data, but there are other design issues that can effect cachability: mixing public and private data.
+Breaking things down into smaller, more manageable resources can separate frequently changing information from more stable data, but there are other design issues that can effect cachability: mixing public and private data.

-Take the example of a train travel booking API. There could be a Booking resource, specific to a single user with private data nobody else should see.
+Take the example of a train travel booking API. There could be a Booking resource, specific to a single user with private data nobody else should see.

 ```http
 GET /bookings/1234
@@ -310,15 +313,15 @@ There is no downside to caching this data, because it is the same for everyone.

 ## Content Delivery Networks (CDNs)

-HTTP caching works well when clients use it, and many do automatically, like web browsers or systems with caching middleware. But it becomes even more powerful when combined with tools like [Fastly](https://www.fastly.com/) or [Varnish](https://www.varnish-software.com/products/varnish-cache/).
+HTTP caching works well when clients use it, and many do automatically, like web browsers or systems with caching middleware. But it becomes even more powerful when combined with tools like [Fastly](https://www.fastly.com/) or [Varnish](https://www.varnish-software.com/products/varnish-cache/).

 These tools sit between the server and the client, acting like intelligent gatekeepers:





-Client-caching like this is certainly useful, but the real power of caching comes when API web traffic is routed through a caching proxy. Using hosted solutions like Fastly or AWS CloudFront, this could be a case of changing DNS settings. For self-hosted options like Varnish, instead of pointing DNS settings to a hosted solution somebody will need to spin up a server to act as the cache proxy.
+Client-caching like this is certainly useful, but the real power of caching comes when API web traffic is routed through a caching proxy. Using hosted solutions like Fastly or AWS CloudFront, this could be a case of changing DNS settings. For self-hosted options like Varnish, instead of pointing DNS settings to a hosted solution somebody will need to spin up a server to act as the cache proxy.

 Many API gateway tools like Tyk and Zuplo have caching built in, so this functionality may already be available in the ecosystem and just need enabling.
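
The decision a caching proxy makes for each request can be sketched roughly as two questions: may a shared cache store this response at all, and is a stored copy still fresh? The helper names and the `{ cacheControl, storedAtMs }` entry shape below are assumptions for illustration only; real products such as Fastly or Varnish apply many more rules (revalidation, `Vary`, heuristic freshness, and so on).

```javascript
// Parse a Cache-Control header into a plain object of directives.
// Simplified: does not handle quoted values.
function parseCacheControl(header = "") {
  const directives = {};
  for (const part of header.split(",")) {
    const [name, value] = part.trim().split("=");
    if (name) directives[name.toLowerCase()] = value === undefined ? true : value;
  }
  return directives;
}

// A shared cache (CDN, proxy) must not store private or no-store responses.
function isStorableBySharedCache(cacheControl) {
  const d = parseCacheControl(cacheControl);
  return !d["no-store"] && !d["private"];
}

// A stored entry is fresh while its age is below its freshness lifetime.
// Shared caches prefer s-maxage over max-age when both are present.
function isFresh(entry, nowMs) {
  const d = parseCacheControl(entry.cacheControl);
  const lifetime = Number.parseInt(d["s-maxage"] ?? d["max-age"] ?? "0", 10);
  const ageSeconds = (nowMs - entry.storedAtMs) / 1000;
  return ageSeconds < lifetime; // NaN lifetime compares false, i.e. stale
}

const entry = { cacheControl: "public, max-age=3600", storedAtMs: 0 };
console.log(isStorableBySharedCache(entry.cacheControl)); // true
console.log(isFresh(entry, 30 * 60 * 1000)); // true (30 minutes old)
console.log(isFresh(entry, 2 * 60 * 60 * 1000)); // false (2 hours old)
```

When an entry is stale, a real cache would normally revalidate it with the origin (for example with `If-None-Match`) rather than discard it outright.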
api-design/pagination.mdx
Lines changed: 17 additions & 15 deletions
@@ -3,7 +3,7 @@ title: "Pagination Best Practices in REST API Design"
 description: "Implement efficient pagination in your API to handle large datasets, improve performance, and provide a better experience for API consumers."
 ---

-import { Callout } from '@/mdx/components';
+import { Callout } from "@/mdx/components";

 # Paginating API responses
@@ -17,7 +17,7 @@ stuck into doing things the right way early on.
 At first it's easy to imagine that collections only have a few hundred records.
 That not be too taxing for the server to fetch from the database, turn into
 JSON, and send back to the client, but as soon as the collection is getting into
-thousands of records things start to fall apart in wild and unexpected ways.
+thousands of records things start to fall apart in wild and unexpected ways.

 For example, a coworking company that expected to mostly host startups of 10-50
 people, but then Facebook and Amazon rock up with ~10,000 employees each, and
@@ -83,7 +83,7 @@ The best way to help the client is to give them links, which at first seems
 (Hypermedia As The Engine Of Application State), also known as Hypermedia
-Controls.
+Controls.

 It's a fancy way of saying "give them links for things they can do
 next" and in the context of pagination that means "give them links to the next
@@ -110,14 +110,14 @@ page, the previous page, the first page, and the last page."
 ```

 Whenever there is a `next` link, an API consumer can show a `next` button, or
-start loading the next page of data to allow for auto-scrolling.
+start loading the next page of data to allow for auto-scrolling.

 If the `next` response returns data, it will give a 200 OK response and they can
-show the data.
+show the data.

 If there is no data then it will still be a 200 OK but there will be an empty
 array, showing that everything was fine, but there is no data on that page right
-now.
+now.

 **Ease of Use**
@@ -134,10 +134,10 @@ now.
 **Consistency**

 - Con: When a consumer loads the latest 10 records, then a new record is added
-to the database, then a user loads the second page, they'll see one of those
-records twice. This is because there is no such concept as a "page" in the
-database, just saying "grab me 10, now the next 10" does not differentiate which
-records they actually were.
+to the database, then a user loads the second page, they'll see one of those
+records twice. This is because there is no such concept as a "page" in the
+database, just saying "grab me 10, now the next 10" does not differentiate which
+records they actually were.

 ### Offset-Based Pagination
@@ -208,7 +208,7 @@ Or with hypermedia controls in the JSON:
 **Consistency**

 - Con: The same problems exist for offset pagination as page pagination, if
-more data has been added between the first request and second being made, the same record could show up in both pages.
+more data has been added between the first request and second being made, the same record could show up in both pages.

 **See this in action**
@@ -230,7 +230,7 @@ page, this could be a UUID, but it can be more dynamic than that.

 APIs like Slack will base64 encode information with a field name and a value,
 even adding sorting logic, all wrapped up in an opaque string. For example,
-`dXNlcjpXMDdRQ1JQQTQ=` would represent `user:W07QCRPA4`.
+`dXNlcjpXMDdRQ1JQQTQ=` would represent `user:W07QCRPA4`.

 Obfuscating the information like this aims to stop API consumers hard-coding
 values for the pagination, which allows for the API to change pagination logic
@@ -292,10 +292,10 @@ Choosing the right pagination strategy depends on the specific use case and
 dataset size.

 Offset-based pagination is simple but may suffer from performance issues with
-large datasets.
+large datasets.

 Cursor-based pagination offers better performance and consistency for large
-datasets but come with added complexity.
+datasets but come with added complexity.

 Page-based pagination is user-friendly but shares similar performance concerns
 with offset-based pagination.
@@ -336,5 +336,7 @@ Adding or drastically changing pagination later could be a whole mess of
 backwards compatibility breaks.

 <Callout title="Note" type="info">
-Pagination can be tricky to work with for API clients, but Speakeasy SDKs can help out. Learn about <a href="/docs/runtime/pagination">adding pagination</a> to your Speakeasy SDK.
+Pagination can be tricky to work with for API clients, but Speakeasy SDKs can
+help out. Learn about <a href="/docs/runtime/pagination">adding pagination</a>{" "}