Microsoft Azure REST API Guidelines

History

Date Notes
2022-Jul-15 Update guidance on long-running operations
2022-May-11 Drop guidance on version discovery
2022-Mar-29 Add guidelines about using durations
2022-Mar-25 Update guideline for date values in headers to follow RFC 7231
2022-Feb-01 Updated error guidance
2021-Sep-11 Add long-running operations guidance
2021-Aug-06 Updated Azure REST Guidelines per Azure API Stewardship Board.
2020-Jul-31 Added service advice for initial versions
2020-Mar-31 1st public release of the Azure REST API Guidelines

Introduction

These guidelines offer prescriptive guidance that Azure service teams MUST follow to ensure that customers have a great experience, by designing APIs that meet these goals:

  • Developer friendly via consistent patterns & web standards (HTTP, REST, JSON)
  • Efficient & cost-effective
  • Work well with SDKs in many programming languages
  • Customers can create fault-tolerant apps by supporting retries/idempotency/optimistic concurrency
  • Sustainable & versionable via clear API contracts with 2 requirements:
    1. Customer workloads must never break due to a service change
    2. Customers can adopt a version without requiring code changes

Technology and software are constantly changing and evolving, and as such, this is intended to be a living document. Open an issue to suggest a change or propose a new idea.

See the Considerations for Service Design for an introduction to the topic of API design for Azure services.

NOTE: For an existing GA'd service, don't change/break its existing API; instead, leverage these concepts for future APIs while prioritizing consistency within your existing service.

Prescriptive Guidance

This document offers prescriptive guidance labeled as follows:

DO adopt this pattern. If you feel you need an exception, contact the Azure HTTP/REST Stewardship Board prior to implementation.

☑️ YOU SHOULD adopt this pattern. If not following this advice, you MUST disclose your reason during the Azure HTTP/REST Stewardship Board review.

✔️ YOU MAY consider this pattern if appropriate to your situation. No notification to the Azure HTTP/REST Stewardship Board is required.

⚠️ YOU SHOULD NOT adopt this pattern. If not following this advice, you MUST disclose your reason during the Azure HTTP/REST Stewardship Board review.

DO NOT adopt this pattern. If you feel you need an exception, contact the Azure HTTP/REST Stewardship Board prior to implementation.

If you feel you need an exception, or need clarity based on your situation, please contact the Azure HTTP/REST Stewardship Board prior to release of your API.

Building Blocks: HTTP, REST, & JSON

The Microsoft Azure Cloud platform exposes its APIs through the core building blocks of the Internet; namely HTTP, REST, and JSON. This section provides you with a general understanding of how these technologies should be applied when creating your service.

HTTP

Azure services must adhere to the HTTP specification, RFC7231. This section further refines and constrains how service implementors should apply the constructs defined in the HTTP specification. It is therefore important that you have a firm understanding of the following concepts:

Uniform Resource Locators (URLs)

A Uniform Resource Locator (URL) is how developers access the resources of your service. Ultimately, URLs are how developers form a cognitive model of your service's resources.

DO use this URL pattern:

https://<service>.<cloud>/<tenant>/<service-root>/<resource-collection>/<resource-id>/

Where:

| Field | Description |
| --- | --- |
| service | Name of the service (ex: blobstore, servicebus, directory, or management) |
| cloud | Cloud domain name, e.g. azure.net (see Azure CLI's "az cloud list") |
| tenant | Globally-unique ID of container representing tenant isolation, billing, enforced quotas, lifetime of containers (ex: subscription UUID) |
| service-root | Service-specific path (ex: blobcontainer, myqueue) |
| resource-collection | Name of the collection, unabbreviated, pluralized |
| resource-id | Value of the unique id property. This MUST be the raw string/number/guid value with no quoting but properly escaped to fit in a URL segment. |
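
As a rough illustration of the pattern and field table above, the sketch below assembles a resource URL in Python. The function name `build_resource_url` and its parameter names are assumptions made for this example and do not come from the guidelines; the only behavior taken from the text is that the resource id is escaped so that it fits in a single URL segment.

```python
from urllib.parse import quote


def build_resource_url(service, cloud, tenant, service_root, collection, resource_id):
    """Assemble a resource URL in the segment order given by the pattern.

    All names here are hypothetical; only the segment order is from the guideline.
    """
    # safe="" percent-encodes every reserved character, including "/",
    # so the id always occupies exactly one path segment.
    escaped_id = quote(str(resource_id), safe="")
    return f"https://{service}.{cloud}/{tenant}/{service_root}/{collection}/{escaped_id}"


url = build_resource_url(
    "blobstore", "azure.net", "contoso.com", "account1", "containers", "Cádiz/1"
)
```

The sample id deliberately contains a non-ASCII character and a slash to show why escaping matters; readable ids that need no escaping remain preferable.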

DO use kebab-casing (preferred) or camel-casing for URL path segments. If the segment refers to a JSON field, use camel casing.

DO return 414-URI Too Long if a URL exceeds 2083 characters

DO treat service-defined URL path segments as case-sensitive. If the passed-in case doesn't match what the service expects, the request MUST fail with a 404-Not found HTTP return code.

Some customer-provided path segment values may be compared case-insensitively if the abstraction they represent is normally compared case-insensitively. For example, a UUID path segment of 'c55f6b35-05f6-42da-8321-2af5099bd2a2' should be treated as identical to 'C55F6B35-05F6-42DA-8321-2AF5099BD2A2'.
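
One possible way to compare UUID path segments, sketched in Python: parse both values and compare the parsed results instead of the raw strings. The helper name `same_uuid` is an assumption for this example.

```python
import uuid


def same_uuid(left: str, right: str) -> bool:
    """Compare two UUID path segments by value, not by spelling."""
    # uuid.UUID reads hexadecimal digits regardless of case, so both
    # spellings parse to the same value. Invalid input raises ValueError.
    return uuid.UUID(left) == uuid.UUID(right)
```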

DO ensure proper casing when returning a URL in an HTTP response header value or inside a JSON response body

DO restrict the characters in service-defined path segments to 0-9 A-Z a-z - . _ ~, with : allowed only as described below to designate an action operation.

☑️ YOU SHOULD restrict the characters allowed in user-specified path segments (i.e. path parameters values) to 0-9 A-Z a-z - . _ ~ (do not allow :).

☑️ YOU SHOULD keep URLs readable; if possible, avoid UUIDs & %-encoding (ex: Cádiz is %-encoded as C%C3%A1diz)

✔️ YOU MAY use these other characters in the URL path but they will likely require %-encoding [RFC 3986]: / ? # [ ] @ ! $ & ' ( ) * + , ; =
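
The character restriction for user-specified path segments can be checked with a single regular expression. The sketch below is one way to do it in Python; the function name is hypothetical, and a real service would combine this check with its own length limits.

```python
import re

# The unreserved characters listed in the guideline: 0-9 A-Z a-z - . _ ~
# (":" is deliberately excluded for user-specified segments).
_USER_SEGMENT = re.compile(r"[0-9A-Za-z._~-]+")


def is_valid_user_segment(segment: str) -> bool:
    """Return True when the segment uses only the recommended characters."""
    return _USER_SEGMENT.fullmatch(segment) is not None
```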

✔️ YOU MAY support a direct endpoint URL for performance/routing:

https://<tenant>-<service-root>.<service>.<cloud>/...

Examples:

  • Request URL: https://blobstore.azure.net/contoso.com/account1/container1/blob2
  • Response header (RFC2557): content-location : https://contoso-dot-com-account1.blobstore.azure.net/container1/blob2
  • GUID format: https://00000000-0000-0000-C000-000000000046-account1.blobstore.azure.net/container1/blob2

DO return URLs in response headers/bodies in a consistent form regardless of the URL used to reach the resource. Either always a UUID for <tenant> or always a single verified domain.

✔️ YOU MAY use URLs as values

https://api.contoso.com/items?url=https://resources.contoso.com/shoes/fancy

HTTP Request / Response Pattern

The HTTP Request / Response pattern dictates how your API behaves. For example: POST methods that create resources must be idempotent, GET method results may be cached, and the If-Match and ETag headers offer optimistic concurrency. The URL of a service, along with its request/response bodies, establishes the overall contract that developers have with your service. As a service provider, how you manage the overall request / response pattern should be one of the first implementation decisions you make.

Cloud applications embrace failure. Therefore, to enable customers to write fault-tolerant applications, all service operations (including POST) must be idempotent. Implementing services in an idempotent manner, with an "exactly once" semantic, enables developers to retry requests without the risk of unintended consequences.

Exactly Once Behavior = Client Retries & Service Idempotency

DO ensure that all HTTP methods are idempotent.

☑️ YOU SHOULD use PUT or PATCH to create a resource as these HTTP methods are easy to implement, allow the customer to name their own resource, and are idempotent.

✔️ YOU MAY use POST to create a resource but you must make it idempotent and, of course, the response MUST return the URL of the created resource with a 201-Created. One way to make POST idempotent is to use the Repeatability-Request-ID & Repeatability-First-Sent headers (See Repeatability of requests).
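
To make the idea of an idempotent POST concrete, here is a minimal in-memory sketch in Python. It assumes the handler receives headers as a dictionary with the exact key `Repeatability-Request-ID`, and it replays the first stored response when the same ID arrives again. The class name, the `/items/` path, and the use of a `Location` header to carry the created URL are assumptions for this example. Expiry of stored entries, which a real service would tie to Repeatability-First-Sent, is left out.

```python
import uuid


class CreateHandler:
    """Hypothetical handler that replays the first response for a repeated request ID."""

    def __init__(self):
        self._responses = {}  # Repeatability-Request-ID -> (status, headers, body)
        self._resources = {}  # resource id -> stored resource

    def post(self, headers: dict, body: dict):
        request_id = headers.get("Repeatability-Request-ID")
        if request_id is not None and request_id in self._responses:
            # A retry of a request that was already processed:
            # return the recorded outcome and create nothing new.
            return self._responses[request_id]

        resource_id = str(uuid.uuid4())
        self._resources[resource_id] = dict(body)
        response = (
            201,
            {"Location": f"/items/{resource_id}"},
            {"id": resource_id, **body},
        )
        if request_id is not None:
            self._responses[request_id] = response
        return response
```

With this shape, a client that times out and resends the same request with the same Repeatability-Request-ID gets the original 201 response back and exactly one resource exists.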

HTTP Return Codes

DO adhere to the return codes in the following table when the method completes synchronously and is successful:

| Method | Description | Response Status Code |
| --- | --- | --- |
| PATCH | Create/Modify the resource with JSON Merge Patch | 200-OK, 201-Created |
| PUT | Create/Replace the whole resource | 200-OK, 201-Created |
| POST | Create new resource (ID set by service) | 201-Created with URL of created resource |
| POST | Action | 200-OK, 204-No Content (only when nothing returned in response body) |
| GET | Read (i.e. list) a resource collection | 200-OK |
| GET | Read the resource | 200-OK |
| DELETE | Remove the resource | 204-No Content; avoid 404-Not Found |

DO return status code 202-Accepted and follow the guidance in Long-Running Operations & Jobs when a PUT, POST, or DELETE method completes asynchronously.

DO treat method names as case-sensitive; they should always be uppercase.

DO return the state of the resource after a PUT, PATCH, POST, or GET operation with a 200-OK or 201-Created.

DO return a 204-No Content without a resource/body for a DELETE operation (even if the URL identifies a resource that does not exist; do not return 404-Not Found)

DO return a 403-Forbidden when the user does not have access to the resource unless this would leak information about the existence of the resource that should not be revealed for security/privacy reasons, in which case the response should be 404-Not Found. [Rationale: a 403-Forbidden is easier to debug for customers, but should not be used if even admitting the existence of things could potentially leak customer secrets.]

DO support caching and optimistic concurrency by honoring the If-Match, If-None-Match, If-Modified-Since, and If-Unmodified-Since request headers and by returning the ETag and last-modified response headers
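
The following Python sketch shows one way a service might evaluate the If-Match and If-None-Match headers before running an operation. It is a simplification under stated assumptions: each header carries a single entity tag or `*`, tags are compared as exact strings, and the date-based headers are ignored. The function name and signature are hypothetical.

```python
def evaluate_preconditions(method, current_etag, if_match=None, if_none_match=None):
    """Return a status code to respond with immediately, or None to proceed.

    current_etag is None when the resource does not exist.
    """
    if if_match is not None:
        matched = current_etag is not None and if_match in ("*", current_etag)
        if not matched:
            return 412  # Precondition Failed: the resource changed or is missing

    if if_none_match is not None:
        matched = current_etag is not None and if_none_match in ("*", current_etag)
        if matched:
            # Reads report "not modified"; writes report a failed precondition.
            return 304 if method in ("GET", "HEAD") else 412

    return None
```

A client doing optimistic concurrency would send the ETag it last read in If-Match on its PUT or PATCH, and a client that wants create-only behavior would send `If-None-Match: *`.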

HTTP Query Parameters and Header Values

Because information in the service URL, as well as the request / response, are strings, there must be a predictable, well-defined scheme to convert strings to their corresponding values.

DO validate all query parameter and request header values and fail the operation with 400-Bad Request if any value fails validation. Return an error response as described in the Handling Errors section indicating what is wrong so customer can diagnose the issue and fix it themselves.

DO use the following table when translating strings:

| Data type | Document that string must be |
| --- | --- |
| Boolean | true / false (all lowercase) |
| Integer | -2^53+1 to +2^53-1 (for consistency with JSON limits on integers RFC8259) |
| Float | IEEE-754 binary64 |
| String | (Un)quoted?, max length, legal characters, case-sensitive, multiple delimiter |
| UUID | 123e4567-e89b-12d3-a456-426614174000 (no {}s, hyphens, case-insensitive) RFC4122 |
| Date/Time (Header) | Sun, 06 Nov 1994 08:49:37 GMT RFC7231, Section 7.1.1.1 |
| Date/Time (Query parameter) | YYYY-MM-DDTHH:mm:ss.sssZ (with at most 3 digits of fractional seconds) RFC3339 |
| Byte array | Base-64 encoded, max length |
| Array | One of a) a comma-separated list of values (preferred), or b) separate name=value parameter instances for each value of the array |
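
As an illustration of the table above, the Python sketch below converts three of the string forms (Boolean, Integer, and Date/Time query parameter) into typed values and raises `ValueError` on anything else. The function names are assumptions for this example; the caller is expected to turn a `ValueError` into a 400-Bad Request response.

```python
import re
from datetime import datetime, timezone

MAX_SAFE_INTEGER = 2**53 - 1  # largest magnitude allowed by the table

_INTEGER = re.compile(r"-?[0-9]+")
_DATE_TIME = re.compile(
    r"([0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2})(?:\.([0-9]{1,3}))?Z"
)


def parse_boolean(text: str) -> bool:
    # Only the all-lowercase spellings are accepted.
    if text == "true":
        return True
    if text == "false":
        return False
    raise ValueError(f"expected 'true' or 'false', got {text!r}")


def parse_integer(text: str) -> int:
    if not _INTEGER.fullmatch(text):
        raise ValueError(f"not an integer: {text!r}")
    value = int(text)
    if abs(value) > MAX_SAFE_INTEGER:
        raise ValueError(f"integer out of range: {text!r}")
    return value


def parse_date_time(text: str) -> datetime:
    match = _DATE_TIME.fullmatch(text)
    if not match:
        raise ValueError(f"not a UTC date/time with at most 3 fractional digits: {text!r}")
    # strptime also rejects impossible dates such as month 13.
    moment = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    millis = int((match.group(2) or "0").ljust(3, "0"))
    return moment.replace(microsecond=millis * 1000, tzinfo=timezone.utc)
```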

The table below lists the headers most used by Azure services:

| Header Key | Applies to | Example |
| --- | --- | --- |
| authorization | Request | Bearer eyJ0...Xd6j (Support Azure Active Directory) |
| x-ms-useragent | Request | (see Distributed Tracing & Telemetry) |
| traceparent | Request | (see Distributed Tracing & Telemetry) |
| tracecontext | Request | (see Distributed Tracing & Telemetry) |
| accept | Request | application/json |
| If-Match | Request | "67ab43" or * (no quotes) (see Conditional Requests) |
| If-None-Match | Request | "67ab43" or * (no quotes) (see Conditional Requests) |
| If-Modified-Since | Request | Sun, 06 Nov 1994 08:49:37 GMT (see Conditional Requests) |
| If-Unmodified-Since | Request | Sun, 06 Nov 1994 08:49:37 GMT (see Conditional Requests) |
| date | Both | Sun, 06 Nov 1994 08:49:37 GMT (see RFC7231, Section 7.1.1.2) |
| content-type | Both | application/merge-patch+json |
| content-length | Both | 1024 |
| x-ms-request-id | Response | 4227cdc5-9f48-4e84-921a-10967cb785a0 |
| ETag | Response | "67ab43" (see Conditional Requests) |
| last-modified | Response | Sun, 06 Nov 1994 08:49:37 GMT |
| x-ms-error-code | Response | (see Handling Errors) |
| retry-after | Response | 180 (see RFC 7231, Section 7.1.3) |

DO support all headers shown in italics

DO specify headers using kebab-casing

DO compare request header names using case-insensitivity

DO compare request header values using case-sensitivity if the header name requires it

DO accept date values in headers in HTTP-Date format and return date values in headers in the IMF-fixdate format as defined in RFC7231, Section 7.1.1.1, e.g. "Sun, 06 Nov 1994 08:49:37 GMT".

Note: The RFC 7231 IMF-fixdate format is a "fixed-length and single-zone subset" of the RFC 1123 / RFC 5322 format, which means: a) year must be four digits, b) the seconds component of time is required, and c) the timezone must be GMT.
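
In Python, the standard library's `email.utils` module can produce and read the IMF-fixdate form. The sketch below is one way to wrap it; the two function names are assumptions for this example, and only the IMF-fixdate input is exercised here, not the obsolete HTTP-Date variants.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def to_imf_fixdate(moment: datetime) -> str:
    """Format an aware datetime as an IMF-fixdate header value."""
    # usegmt=True emits the literal "GMT" zone; it requires a UTC datetime,
    # so convert first. A naive datetime would be read as local time.
    return format_datetime(moment.astimezone(timezone.utc), usegmt=True)


def from_imf_fixdate(text: str) -> datetime:
    """Parse an IMF-fixdate header value into an aware datetime."""
    return parsedate_to_datetime(text)
```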

DO create an opaque value that uniquely identifies the request and return this value in the x-ms-request-id response header.

Your service should include the x-ms-request-id value in error logs so that users can submit support requests for specific failures using this value.

DO NOT fail a request that contains an unrecognized header. Headers may be added by API gateways or middleware and this must be tolerated

DO NOT use "x-" prefix for custom headers, unless the header already exists in production [RFC 6648].

Additional References

REpresentational State Transfer (REST)

REST is an architectural style with broad reach that emphasizes scalability, generality, independent deployment, reduced latency via caching, and security. When applying REST to your API, you define your service’s resources as collections of items. These are typically the nouns you use in the vocabulary of your service. Your service's URLs determine the hierarchical path developers use to perform CRUD (create, read, update, and delete) operations on resources. Note that it's important to model resource state, not behavior. There are patterns, later in these guidelines, that describe how to invoke behavior on your service. See this article in the Azure Architecture Center for a more detailed discussion of REST API design patterns.

When designing your service, it is important to optimize for the developer using your API.

DO focus heavily on clear & consistent naming

DO ensure your resource paths make sense

DO simplify operations with few required query parameters & JSON fields

DO establish clear contracts for string values

DO use proper response codes/bodies so customer can diagnose their own problems and fix them without contacting Azure support or the service team

JSON Resource Schema & Field Mutability

DO use the same JSON schema for PUT request/response, PATCH response, GET response, and POST request/response on a given URL path. The PATCH request schema should contain all the same fields with no required fields. This allows one SDK type for input/output operations and enables the response to be passed back in a request.

DO think about your resource's fields and how they are used:

| Field Mutability | Service Request's behavior for this field |
| --- | --- |
| Create | Service honors field only when creating a resource. Minimize create-only fields so customers don't have to delete & re-create the resource. |
| Update | Service honors field when creating or updating a resource |
| Read | Service returns this field in a response. If the client passed a read-only field, the service MUST fail the request unless the passed-in value matches the resource's current value |

In addition to the above, a field may be "required" or "optional". A required field is guaranteed to always exist and will typically not become a nullable field in an SDK's data structure. This allows customers to write code without performing a null-check. Because of this, required fields can only be introduced in the 1st version of a service; it is a breaking change to introduce required fields in a later version. In addition, it is a breaking change to remove a required field or make an optional field required or vice versa.

DO make fields simple and maintain a shallow hierarchy.

DO use camel case for all JSON field names. Do not upper-case acronyms; use camel case.

DO treat JSON field names with case-sensitivity.

DO treat JSON field values with case-sensitivity. There may be some exceptions (e.g. GUIDs) but avoid if at all possible.

DO use GET for resource retrieval and return JSON in the response body

DO create and update resources using PATCH [RFC5789] with JSON Merge Patch (RFC7396) request body.
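
For readers unfamiliar with JSON Merge Patch, the Python sketch below follows the merge procedure described in RFC 7396: a `null` in the patch removes a field, a nested object is merged recursively, and any other value (including an array) replaces the target value wholesale. The function name `merge_patch` is an assumption for this example.

```python
def merge_patch(target, patch):
    """Apply a JSON Merge Patch (RFC 7396) and return the patched value."""
    if not isinstance(patch, dict):
        # A non-object patch replaces the target entirely.
        return patch
    if not isinstance(target, dict):
        target = {}
    result = dict(target)  # leave the caller's object untouched
    for name, value in patch.items():
        if value is None:
            result.pop(name, None)  # null means "remove this field"
        else:
            result[name] = merge_patch(result.get(name), value)
    return result
```

For instance, applying `{"a": "z", "c": {"f": None}}` to `{"a": "b", "c": {"d": "e", "f": "g"}}` changes `a`, removes `c.f`, and leaves `c.d` untouched.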

DO use PUT with JSON for wholesale create/update operations. NOTE: If a v1 client PUTs a resource, any fields introduced in V2+ should be reset to their default values (the equivalent of DELETE followed by PUT).

DO use DELETE to remove a resource.

DO fail an operation with 400-Bad Request if the request is improperly-formed or if any JSON field name or value is not fully understood by the specific version of the service. Return an error response as described in Handling errors indicating what is wrong so customer can diagnose the issue and fix it themselves.

✔️ YOU MAY return secret fields via POST if absolutely necessary.

DO NOT return secret fields via GET. For example, do not return administratorPassword in JSON.

DO NOT add fields to the JSON if the value is easily computable from other fields to avoid bloating the body.

Create / Update / Replace Processing Rules

DO follow the processing below to create/update/replace a resource:

| When using this method | if this condition happens | use this response code |
| --- | --- | --- |
| PATCH/PUT | Any JSON field name/value not known/valid | 400-Bad Request |
| PATCH/PUT | Any Read field passed (client can't set Read fields) | 400-Bad Request |
| | If the resource does not exist | |
| PATCH/PUT | Any mandatory Create/Update field missing | 400-Bad Request |
| PATCH/PUT | Create resource using Create/Update fields | 201-Created |
| | If the resource already exists | |
| PATCH | Any Create field doesn't match current value (allows retries) | 409-Conflict |
| PATCH | Update resource using Update fields | 200-OK |
| PUT | Any mandatory Create/Update field missing | 400-Bad Request |
| PUT | Overwrite resource entirely using Create/Update fields | 200-OK |
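
The processing rules above can be read as a decision procedure. The Python sketch below is one possible rendering that returns only the status code. The function name, the way field mutability is passed in as sets, and the sample field names in the usage note are all assumptions for this example. Following the field mutability description, a Read field is tolerated when its value equals the resource's current value.

```python
def decide_status(method, body, current, *, create_fields, update_fields,
                  read_fields, required_fields):
    """Return the status code for a PATCH or PUT. current is None if the resource is absent."""
    provided = set(body)

    # Unknown field names fail regardless of whether the resource exists.
    if provided - (create_fields | update_fields | read_fields):
        return 400
    # Read fields cannot be set; only an unchanged value is tolerated.
    for name in provided & read_fields:
        if current is None or body[name] != current.get(name):
            return 400

    if current is None:
        return 201 if required_fields <= provided else 400

    if method == "PATCH":
        # A differing create-only field signals a conflicting retry.
        if any(body[name] != current.get(name) for name in provided & create_fields):
            return 409
        return 200

    # PUT on an existing resource replaces it, so required fields must be present.
    return 200 if required_fields <= provided else 400
```

With `create_fields={"region"}`, `update_fields={"size"}`, `read_fields={"id"}`, and `required_fields={"region"}`, a PATCH of `{"region": "east"}` against a resource created in `west` yields 409, while a PATCH of `{"size": 2}` yields 200.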

Handling Errors

There are 2 kinds of errors:

  • An error where you expect customer code to gracefully recover at runtime
  • An error indicating a bug in customer code that is unlikely to be recoverable at runtime; the customer must just fix their code

DO return an x-ms-error-code response header with a string error code indicating what went wrong.

NOTE: x-ms-error-code values are part of your API contract (because customer code is likely to do comparisons against them) and cannot change in the future.

✔️ YOU MAY implement the x-ms-error-code values as an enum with "modelAsString": true because it's possible to add new values over time. In particular, it's only a breaking change if the same conditions result in a different top-level error code.

⚠️ YOU SHOULD NOT add new top-level error codes to an existing API without bumping the service version.

DO carefully craft unique x-ms-error-code string values for errors that are recoverable at runtime. Reuse common error codes for usage errors that are not recoverable.

✔️ YOU MAY group common customer code errors into a few x-ms-error-code string values.

DO ensure that the top-level error's code value is identical to the x-ms-error-code header's value.

DO provide a response body with the following structure:

ErrorResponse : Object

Property Type Required Description
error ErrorDetail The top-level error object whose code matches the x-ms-error-code response header

ErrorDetail : Object

Property Type Required Description
code String One of a server-defined set of error codes.
message String A human-readable representation of the error.
target String The target of the error.
details ErrorDetail[] An array of details about specific errors that led to this reported error.
innererror InnerError An object containing more specific information than the current object about the error.
additional properties Additional properties that can be useful when debugging.

InnerError : Object

Property Type Required Description
code String A more specific error code than was provided by the containing error.
innererror InnerError An object containing more specific information than the current object about the error.

Example:

{
  "error": {
    "code": "InvalidPasswordFormat",
    "message": "Human-readable description",
    "target": "target of error",
    "innererror": {
      "code": "PasswordTooShort",
      "minLength": 6
    }
  }
}
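
One way to keep the x-ms-error-code header and the body's top-level code identical is to build both from a single value. The Python sketch below does that; the function name, its parameters, and the tuple it returns are assumptions for this example and do not describe any particular web framework.

```python
import json


def make_error_response(status, code, message, target=None, innererror=None):
    """Build (status, headers, body) with the header and body code kept in sync."""
    error = {"code": code, "message": message}
    if target is not None:
        error["target"] = target
    if innererror is not None:
        error["innererror"] = innererror
    headers = {
        "x-ms-error-code": code,  # same value as error["code"] by construction
        "content-type": "application/json",
    }
    return status, headers, json.dumps({"error": error})
```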

DO document the service's top-level error code strings; they are part of the API contract.

✔️ YOU MAY treat the other fields as you wish as they are not considered part of your service's API contract and customers should not take a dependency on them or their value. They exist to help customers self-diagnose issues.

✔️ YOU MAY add additional properties for any data values in your error message so customers don't resort to parsing your error message. For example, an error with "message": "A maximum of 16 keys are allowed per account." might also add a "maximumKeys": 16 property. This is not part of your API contract and should only be used for diagnosing problems.

Note: Do not use this mechanism to provide information developers need to rely on in code (ex: the error message can give details about why you've been throttled, but the Retry-After should be what developers rely on to back off).

⚠️ YOU SHOULD NOT document specific error status codes in your OpenAPI/Swagger spec unless the "default" response cannot properly describe the specific error response (e.g. body schema is different).

JSON

Services, and the clients that access them, may be written in multiple languages. To ensure interoperability, JSON establishes the "lowest common denominator" type system, which is always sent over the wire as UTF-8 bytes. This system is very simple and consists of three types:

| Type | Description |
| --- | --- |
| Boolean | true/false (always lowercase) |
| Number | Signed floating point (IEEE-754 binary64; int range: -2^53+1 to +2^53-1) |
| String | Used for everything else |

DO use integers within the acceptable range of JSON number.

DO establish a well-defined contract for the format of strings. For example, determine maximum length, legal characters, case-(in)sensitive comparisons, etc. Where possible, use standard formats, e.g. RFC3339 for date/time.

DO use string formats that are well-known and easily parsable/formattable by many programming languages, e.g. RFC3339 for date/time.

DO ensure that information exchanged between your service and any client is "round-trippable" across multiple programming languages.

DO use RFC3339 for date/time.

DO use a fixed time interval to express durations e.g., milliseconds, seconds, minutes, days, etc., and include the time unit in the property name e.g., backupTimeInMinutes or ttlSeconds.

✔️ YOU MAY use RFC3339 time intervals only when users must be able to specify a time interval that may change from month to month or year to year e.g., "P3M" represents 3 months no matter how many days between the start and end dates, or "P1Y" represents 366 days on a leap year. The value must be round-trippable.

DO use RFC4122 for UUIDs.

✔️ YOU MAY use JSON objects to group sub-fields together.

✔️ YOU MAY use JSON arrays if maintaining an order of values is required. Avoid arrays in other situations since arrays can be difficult and inefficient to work with, especially with JSON Merge Patch where the entire array needs to be read prior to any operation being applied to it.

☑️ YOU SHOULD use JSON objects instead of arrays whenever possible.

Enums & SDKs (Client libraries)

It is common for strings to have an explicit set of values. These are often reflected in the OpenAPI definition as enumerations. These are extremely useful for developer tooling, e.g. code completion, and client library generation.

However, it is not uncommon for the set of values to grow over the life of a service. For this reason, Microsoft's tooling uses the concept of an "extensible enum," which indicates that the set of values should be treated as only a partial list. This indicates to client libraries and customers that values of the enumeration field should effectively be treated as strings and that undocumented values may be returned in the future. This enables the set of values to grow over time while ensuring stability in client libraries and customer code.

☑️ YOU SHOULD use extensible enumerations unless you are positive that the symbol set will NEVER change over time.

DO document to customers that new values may appear in the future so that customers write their code today expecting these new values tomorrow.

DO NOT remove values from your enumeration list as this breaks customer code.
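
From the customer's side, an extensible enum means code should keep working when it meets a value it has never seen. The Python sketch below shows the idea with a hypothetical `state` field whose currently documented values are assumed to be `running` and `stopped`.

```python
def describe_state(state: str) -> str:
    """Describe a resource state, tolerating values added by later service versions."""
    if state == "running":
        return "The resource is running."
    if state == "stopped":
        return "The resource is stopped."
    # Unknown values are treated as plain strings instead of raising.
    return f"The resource is in state '{state}'."
```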

Polymorphic types

⚠️ YOU SHOULD NOT use polymorphic JSON types because they greatly complicate the customer code due to runtime dynamic casts and the introduction of new types in the future.

If you can't avoid them, then follow the guideline below.

DO define a kind field indicating the kind of the resource and include any kind-specific fields in the body.

Below is an example of JSON for a Rectangle and Circle:

Rectangle

{
   "kind": "rectangle",
   "x": 100,
   "y": 50,
   "width": 10,
   "length": 24,
   "fillColor": "Red",
   "lineColor": "White",
   "subscription": {
      "kind": "free"
   }
}

Circle

{
  "kind": "circle",
  "x": 100,
  "y": 50,
  "radius": 10,
  "fillColor": "Green",
  "lineColor": "Black",
  "subscription": {
     "kind": "paid",
     "expiration": "2024",
     "invoice": "123456"
  }
}

Both Rectangle and Circle have common fields: kind, x, y, fillColor, lineColor, and subscription. A Rectangle also has width and length while a Circle has radius. The subscription is a nested polymorphic type. A free subscription has no additional fields and a paid subscription has expiration and invoice fields.
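
The runtime dispatch that polymorphic types force on customer code looks roughly like the Python sketch below, which reads the kind field and builds the matching type. The class and function names are assumptions for this example, and only a few fields are modeled to keep it short.

```python
from dataclasses import dataclass


@dataclass
class Rectangle:
    x: int
    y: int
    width: int
    length: int


@dataclass
class Circle:
    x: int
    y: int
    radius: int


def parse_shape(doc: dict):
    """Build the type selected by the "kind" discriminator."""
    kind = doc["kind"]
    if kind == "rectangle":
        return Rectangle(doc["x"], doc["y"], doc["width"], doc["length"])
    if kind == "circle":
        return Circle(doc["x"], doc["y"], doc["radius"])
    # A kind introduced by a later service version lands here.
    raise ValueError(f"unknown kind: {kind!r}")
```

The final branch is the maintenance cost the guideline warns about: every new kind requires a client update before it can be handled.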

Common API Patterns

Performing an Action

The REST specification is used to model the state of a resource, and is primarily intended to handle CRUD (Create, Read, Update, Delete) operations. However, many services require the ability to perform an action on a resource, e.g. getting the thumbnail of an image or rebooting a VM. It is also sometimes useful to perform an action on a collection.

☑️ YOU SHOULD pattern your URL like this to perform an action on a resource

URL Pattern

https://.../<resource-collection>/<resource-id>:<action>?<input parameters>

Example

https://.../users/Bob:grant?access=read

☑️ YOU SHOULD pattern your URL like this to perform an action on a collection

URL Pattern

https://.../<resource-collection>:<action>?<input parameters>

Example

https://.../users:grant?access=read

Note: To avoid potential collision of actions and resource ids, you should disallow the use of the ":" character in resource ids.
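
Because ":" never appears inside a resource id under this rule, the final path segment can be split on the first colon without ambiguity. The Python sketch below illustrates this; the function name is an assumption for this example.

```python
def split_action(segment: str):
    """Split a final path segment into (name, action); action is None when absent."""
    name, separator, action = segment.partition(":")
    return name, (action if separator else None)
```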

DO use a POST operation for any action on a resource or collection.

DO support the Repeatability-Request-ID & Repeatability-First-Sent request headers if the action needs to be idempotent if retries occur.

DO return a 200-OK when the action completes synchronously and successfully.

☑️ YOU SHOULD use a verb as the <action> component of the path.

DO NOT use an action operation when the operation behavior could reasonably be defined as one of the standard REST Create, Read, Update, Delete, or List operations.

Collections

DO structure the response to a list operation as an object with a top-level array field containing the set (or subset) of resources.

☑️ YOU SHOULD support paging today if there is ever a chance in the future that the number of items can grow to be very large.

NOTE: It is a breaking change to add paging in the future

✔️ YOU MAY expose an operation that lists your resources by supporting a GET method with a URL to a resource-collection (as opposed to a resource-id).

Example Response Body

{
    "value": [
       { "id": "Item 01", "etag": "0xabc", "price": 99.95, "sizes": null },
       { },
       { },
       { "id": "Item 99", "etag": "0xdef", "price": 59.99, "sizes": null }
    ],
    "nextLink": "{opaqueUrl}"
 }

DO include the id field and etag field (if supported) for each item as this allows the customer to modify the item in a future operation.

DO clearly document that resources may be skipped or duplicated across pages of a paginated collection unless the operation has made special provisions to prevent this (like taking a time-expiring snapshot of the collection).

DO return a nextLink field with an absolute URL that the client can GET in order to retrieve the next page of the collection.

DO include any query parameters required by the service in nextLink, including api-version.

☑️ YOU SHOULD use value as the name of the top-level array field unless a more appropriate name is available.

DO NOT return the nextLink field at all when returning the last page of the collection.

DO NOT ever return a nextLink field with a value of null.
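
On the client side, the nextLink rules lead to a simple loop: fetch a page, consume its items, and continue while a nextLink is present. The Python sketch below assumes the top-level array is named `value` and takes a hypothetical `get_json` callable that performs the GET and returns the parsed body, so no particular HTTP library is implied.

```python
def list_all(first_url, get_json):
    """Yield every item of a paged collection by following nextLink."""
    url = first_url
    while url is not None:
        page = get_json(url)
        yield from page["value"]
        # The last page omits nextLink, which ends the loop.
        url = page.get("nextLink")
```

The client uses nextLink exactly as returned and never edits it, which is why the service must include every required query parameter in it.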

Query options

✔️ YOU MAY support the following query parameters allowing customers to control the list operation:

| Parameter name | Type | Description |
| --- | --- | --- |
| filter | string | an expression on the resource type that selects the resources to be returned |
| orderby | string array | a list of expressions that specify the order of the returned resources |
| skip | integer | an offset into the collection of the first resource to be returned |
| top | integer | the maximum number of resources to return from the collection |
| maxpagesize | integer | the maximum number of resources to include in a single response |
| select | string array | a list of field names to be returned for each resource |
| expand | string array | a list of the related resources to be included in line with each resource |

DO return an error if the client specifies any parameter not supported by the service.

DO treat these query parameter names as case-sensitive.

DO apply select or expand options after applying all the query options in the table above.

DO apply the query options to the collection in the order shown in the table above.

DO NOT prefix any of these query parameter names with "$" (the convention in the OData standard).
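
The ordering rules can be pictured as a fixed pipeline: filter, then orderby, then skip, then top, with select applied last. The Python sketch below shows this over an in-memory list. It is only an illustration: the function name and its parameters are assumptions, filter and orderby are passed in as ready-made callables, and maxpagesize and expand are left out.

```python
def apply_options(items, *, predicate=None, sort_key=None, skip=0, top=None, select=None):
    """Apply list options in the documented order and return the resulting items."""
    result = list(items)
    if predicate is not None:        # filter
        result = [item for item in result if predicate(item)]
    if sort_key is not None:         # orderby
        result = sorted(result, key=sort_key)
    result = result[skip:]           # skip
    if top is not None:              # top
        result = result[:top]
    if select is not None:           # select, applied after everything else
        result = [{name: item[name] for name in select if name in item} for item in result]
    return result
```

Applying select last matters: a filter or sort on `price` still works even when the caller selected only `id`.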

filter

✔️ YOU MAY support filtering of the results of a list operation with the filter query parameter.

The value of the filter option is an expression involving the fields of the resource that produces a Boolean value. This expression is evaluated for each resource in the collection and only items where the expression evaluates to true are included in the response.

DO omit all resources from the collection for which the filter expression evaluates to false or to null, or references properties that are unavailable due to permissions.

Example: return all Products whose Price is less than $10.00

GET https://api.contoso.com/products?filter=price lt 10.00

filter operators

✔️ YOU MAY support the following operators in filter expressions:

| Operator | Description | Example |
| --- | --- | --- |
| Comparison Operators | | |
| eq | Equal | city eq 'Redmond' |
| ne | Not equal | city ne 'London' |
| gt | Greater than | price gt 20 |
| ge | Greater than or equal | price ge 10 |
| lt | Less than | price lt 20 |
| le | Less than or equal | price le 100 |
| Logical Operators | | |
| and | Logical and | price le 200 and price gt 3.5 |
| or | Logical or | price le 3.5 or price gt 200 |
| not | Logical negation | not price le 3.5 |
| Grouping Operators | | |
| ( ) | Precedence grouping | (priority eq 1 or city eq 'Redmond') and price gt 100 |

DO respond with an error message as defined in the Handling Errors section if a client includes an operator in a filter expression that is not supported by the operation.

DO use the following operator precedence for supported operators when evaluating filter expressions. Operators are listed by category in order of precedence from highest to lowest. Operators in the same category have equal precedence and should be evaluated left to right:

| Group | Operator | Description |
| --- | --- | --- |
| Grouping | ( ) | Precedence grouping |
| Unary | not | Logical Negation |
| Relational | gt | Greater Than |
| | ge | Greater than or Equal |
| | lt | Less Than |
| | le | Less than or Equal |
| Equality | eq | Equal |
| | ne | Not Equal |
| Conditional AND | and | Logical And |
| Conditional OR | or | Logical Or |

✔️ YOU MAY support orderby and filter functions such as concat and contains. For more information, see OData Canonical Functions.

Operator examples

The following examples illustrate the use and semantics of each of the logical operators.

Example: all products with a name equal to 'Milk'

GET https://api.contoso.com/products?filter=name eq 'Milk'

Example: all products with a name not equal to 'Milk'

GET https://api.contoso.com/products?filter=name ne 'Milk'

Example: all products with the name 'Milk' that also have a price less than 2.55:

GET https://api.contoso.com/products?filter=name eq 'Milk' and price lt 2.55

Example: all products that either have the name 'Milk' or have a price less than 2.55:

GET https://api.contoso.com/products?filter=name eq 'Milk' or price lt 2.55

Example: all products that have the name 'Milk' or 'Eggs' and have a price less than 2.55:

GET https://api.contoso.com/products?filter=(name eq 'Milk' or name eq 'Eggs') and price lt 2.55
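The example URLs are shown unencoded for readability; on the wire, spaces and quotes in the filter value need to be percent-encoded. A minimal client-side sketch, assuming a hypothetical helper named `list_url`:

```python
from urllib.parse import quote, urlencode


def list_url(base_url, **options):
    """Build a list-operation URL with percent-encoded query options (hypothetical helper)."""
    # quote_via=quote encodes spaces as %20 instead of "+".
    return f"{base_url}?{urlencode(options, quote_via=quote)}"


url = list_url("https://api.contoso.com/products",
               filter="name eq 'Milk' and price lt 2.55")
print(url)
# https://api.contoso.com/products?filter=name%20eq%20%27Milk%27%20and%20price%20lt%202.55
```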

orderby

✔️ YOU MAY support sorting of the results of a list operation with the orderby query parameter. NOTE: It is unusual for a service to support orderby because it is very expensive to implement: the service must sort the entire collection before it can return any results.

The value of the orderby parameter is a comma-separated list of expressions used to sort the items. A special case of such an expression is a property path terminating on a primitive property.

Each expression in the orderby parameter value may include the suffix "asc" for ascending or "desc" for descending, separated from the expression by one or more spaces.

DO sort the collection in ascending order on an expression if neither "asc" nor "desc" is specified.

DO sort NULL values as "less than" non-NULL values.

DO sort items by the result values of the first expression, and then sort items with the same value for the first expression by the result value of the second expression, and so on.

DO use the inherent sort order for the type of the field. For example, date-time values should be sorted chronologically and not alphabetically.

DO respond with an error message as defined in the Handling Errors section if the client requests sorting by a field that is not supported by the operation.

For example, to return all people sorted by name in ascending order:

GET https://api.contoso.com/people?orderby=name

For example, to return all people sorted by name in descending order, with a secondary sort on hireDate in ascending order:

GET https://api.contoso.com/people?orderby=name desc,hireDate

Sorting MUST compose with filtering such that:

GET https://api.contoso.com/people?filter=name eq 'david'&orderby=hireDate

will return all people whose name is 'david', sorted in ascending order by hireDate.
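The sorting rules above (ascending by default, NULL values first, later expressions breaking ties) can be sketched as follows. This is an illustrative in-memory version under the assumption that each expression is a plain field name; `parse_orderby`, `compare_values`, and `sort_items` are hypothetical names.

```python
from functools import cmp_to_key


def parse_orderby(value):
    """Split an orderby value into (field, descending) pairs."""
    clauses = []
    for part in value.split(","):
        pieces = part.split()
        if not pieces or len(pieces) > 2:
            raise ValueError(f"invalid orderby clause: {part!r}")
        direction = pieces[1] if len(pieces) == 2 else "asc"  # ascending by default
        if direction not in ("asc", "desc"):
            raise ValueError(f"invalid sort direction: {direction!r}")
        clauses.append((pieces[0], direction == "desc"))
    return clauses


def compare_values(left, right):
    # None (NULL) sorts as "less than" any non-NULL value.
    if left is None or right is None:
        return (left is not None) - (right is not None)
    return (left > right) - (left < right)


def sort_items(items, orderby):
    clauses = parse_orderby(orderby)

    def compare(a, b):
        # The first expression decides; later expressions only break ties.
        for field, descending in clauses:
            result = compare_values(a.get(field), b.get(field))
            if result:
                return -result if descending else result
        return 0

    return sorted(items, key=cmp_to_key(compare))
```

Storing hireDate as a date value rather than a string lets the comparison use the type's inherent (chronological) order.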

Considerations for sorting with pagination

DO use the same filtering options and sort order for all pages of a paginated list operation response.

skip

DO define the skip parameter as an integer with a default and minimum value of 0.

✔️ YOU MAY allow clients to pass the skip query parameter to specify an offset into the collection of the first resource to be returned.

top

✔️ YOU MAY allow clients to pass the top query parameter to specify the maximum number of resources to return from the collection.

If supporting top:

DO define the top parameter as an integer with a minimum value of 1. If not specified, top has a default value of infinity.

DO return the collection's top number of resources (if available), starting from skip.

maxpagesize

✔️ YOU MAY allow clients to pass the maxpagesize query parameter to specify the maximum number of resources to include in a single page response.

DO define the maxpagesize parameter as an optional integer with a default value appropriate for the collection.

DO make clear in documentation of the maxpagesize parameter that the operation may choose to return fewer resources than the value specified.
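How skip, top, and maxpagesize interact can be sketched with an in-memory collection. The function name `paginate` and the default page size of 100 are assumptions made for illustration; a real service would return each page together with a link to the next one rather than a generator.

```python
DEFAULT_PAGE_SIZE = 100  # assumed service default, not taken from the guidelines


def paginate(collection, skip=0, top=None, maxpagesize=DEFAULT_PAGE_SIZE):
    """Yield pages of at most maxpagesize items, covering at most top items after skip."""
    if skip < 0:
        raise ValueError("skip must be 0 or greater")
    if top is not None and top < 1:
        raise ValueError("top must be 1 or greater")
    if maxpagesize < 1:
        raise ValueError("maxpagesize must be 1 or greater")
    # top limits the total number of items; an unspecified top means "no limit".
    end = len(collection) if top is None else min(len(collection), skip + top)
    for start in range(skip, end, maxpagesize):
        yield collection[start:min(start + maxpagesize, end)]


pages = list(paginate(list(range(10)), skip=2, top=5, maxpagesize=2))
print(pages)  # [[2, 3], [4, 5], [6]]
```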

API Versioning

Azure services need to change over time. However, when changing a service, there are 2 requirements:

  1. Already-running customer workloads must not break due to a service change
  2. Customers can adopt a new service version without requiring any code changes (Of course, the customer must modify code to leverage any new service features.)

NOTE: the Azure Breaking Change Policy has tables (section 5) describing what kinds of changes are considered breaking. Breaking changes are allowable (due to security/compliance/etc.) if approved by the Azure Breaking Change Reviewers but only following ample communication to customers and a lengthy deprecation period.

DO review any API changes with the Azure API Stewardship Board.

Clients specify the version of the API to be used in every request to the service, even requests to an Operation-Location or nextLink URL returned by the service.

DO use a required query parameter named api-version on every operation for the client to specify the API version.

DO use YYYY-MM-DD date values, with a -preview suffix for preview versions, as the valid values for api-version.

PUT https://service.azure.com/users/Jeff?api-version=2021-06-04

DO use a later date for each new preview version.

When releasing a new preview, the service team may completely retire any previous preview versions after giving customers at least 90 days to upgrade their code.

DO use a later date for successive preview versions.

DO NOT introduce any breaking changes into the service.

DO NOT include a version number segment in any operation path.

DO NOT use the same date when transitioning from a preview API to a GA API. If the preview api-version is '2021-06-04-preview', the GA version of the API must use a date later than 2021-06-04.

DO NOT keep a preview feature in preview for more than 1 year; it must go GA (or be removed) within 1 year after introduction.
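The api-version format and the preview-to-GA date rule can be checked mechanically. The following is a minimal sketch; `parse_api_version` and `is_valid_ga_successor` are hypothetical helper names, not part of any Azure library.

```python
import re
from datetime import date

API_VERSION = re.compile(r"^(\d{4})-(\d{2})-(\d{2})(-preview)?$")


def parse_api_version(value):
    """Return (date, is_preview) for a YYYY-MM-DD[-preview] api-version value."""
    match = API_VERSION.match(value)
    if match is None:
        raise ValueError(f"invalid api-version: {value!r}")
    year, month, day, preview = match.groups()
    # date() raises ValueError for impossible dates such as 2021-02-30.
    return date(int(year), int(month), int(day)), preview is not None


def is_valid_ga_successor(preview_version, ga_version):
    """A GA version must not be a preview and must use a later date than the preview."""
    preview_date, is_preview = parse_api_version(preview_version)
    ga_date, ga_is_preview = parse_api_version(ga_version)
    return is_preview and not ga_is_preview and ga_date > preview_date
```

For example, `is_valid_ga_successor("2021-06-04-preview", "2021-06-04")` is false because the GA version reuses the preview date.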

Use Extensible Enums

While removing a value from an enum is a breaking change, adding a value to an enum can be handled with an extensible enum. An extensible enum is a string value that has been marked as extensible by setting modelAsString to true within an x-ms-enum block. For example:

"createdByType": {
   "type": "string",
   "description": "The type of identity that created the resource.",
   "enum": [
      "User",
      "Application",
      "ManagedIdentity",
      "Key"
   ],
   "x-ms-enum": {
      "name": "createdByType",
      "modelAsString": true
   }
}

☑️ You SHOULD use extensible enums unless you are positive that the symbol set will NEVER change over time.
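On the client side, an extensible enum means that code must tolerate values it does not know about. A minimal sketch of one way to do this, assuming a hypothetical `parse_created_by_type` helper that falls back to the raw string:

```python
from enum import Enum


class CreatedByType(str, Enum):
    """Values known to this client when it was written."""
    USER = "User"
    APPLICATION = "Application"
    MANAGED_IDENTITY = "ManagedIdentity"
    KEY = "Key"


def parse_created_by_type(value):
    """Return the known enum member, or the raw string for a value added by a later service version."""
    try:
        return CreatedByType(value)
    except ValueError:
        return value  # unknown values are preserved, not rejected


print(parse_created_by_type("User") is CreatedByType.USER)  # True
print(parse_created_by_type("Robot"))                       # Robot
```

Because the members also derive from `str`, known and unknown values can both be compared against plain strings.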

Repeatability of requests

The ability to retry failed requests for which a client never received a response greatly simplifies the ability to write resilient distributed applications. While HTTP designates some methods as safe and/or idempotent (and thus retryable), being able to retry other operations such as create-using-POST-to-collection is desirable.

☑️ YOU SHOULD support repeatable requests as defined in OASIS Repeatable Requests Version 1.0.

  • The tracked time window (difference between the Repeatability-First-Sent value and the current time) MUST be at least 5 minutes.
  • A service advertises support for repeatable requests by adding the Repeatability-First-Sent and Repeatability-Request-ID headers to the set of headers for a given operation.
  • When understood, all endpoints co-located behind a DNS name MUST understand the header. This means that a service MUST NOT ignore the presence of a header for any endpoints behind the DNS name, but rather fail the request containing a Repeatability-Request-ID header if that particular endpoint lacks support for repeatable requests. Such partial support SHOULD be avoided due to the confusion it causes for clients.
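A rough server-side sketch of the idea: remember the response for each Repeatability-Request-ID and replay it instead of executing the operation again. The class `RepeatableHandler`, the exception `OutsideTrackingWindow`, and the in-memory dictionary are assumptions for illustration; how a request outside the tracked window is reported to the client is governed by the OASIS specification and is not modeled here.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

TRACKING_WINDOW = timedelta(minutes=5)  # the minimum window required above


class OutsideTrackingWindow(Exception):
    """Hypothetical signal that the request is older than the tracked window."""


class RepeatableHandler:
    def __init__(self, operation):
        self.operation = operation
        self.responses = {}  # Repeatability-Request-ID -> stored response

    def handle(self, headers, body, now):
        request_id = headers.get("Repeatability-Request-ID")
        first_sent = headers.get("Repeatability-First-Sent")
        if request_id is None or first_sent is None:
            return self.operation(body)  # not a repeatable request
        if now - parsedate_to_datetime(first_sent) > TRACKING_WINDOW:
            raise OutsideTrackingWindow(request_id)
        if request_id not in self.responses:
            self.responses[request_id] = self.operation(body)  # executed once
        return self.responses[request_id]  # retries replay the stored response


created = []
handler = RepeatableHandler(lambda body: created.append(body) or {"status": 201})
now = datetime(2022, 7, 15, 12, 0, tzinfo=timezone.utc)
headers = {"Repeatability-Request-ID": "2a0c1e9c-0000-4000-8000-000000000001",
           "Repeatability-First-Sent": format_datetime(now, usegmt=True)}
handler.handle(headers, {"name": "Jeff"}, now)
handler.handle(headers, {"name": "Jeff"}, now + timedelta(seconds=30))  # retry
print(len(created))  # 1
```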

Long-Running Operations & Jobs

When the processing for an operation may take a significant amount of time to complete, it should be implemented as a long-running operation (LRO). This allows clients to continue running while the operation is being processed. The client obtains the outcome of the operation at some later time through another API call. See the Long Running Operations section in Considerations for Service Design for an introduction to the design of long-running operations.

DO implement an operation as an LRO if the 99th percentile response time is greater than 1s.

DO NOT implement PATCH as an LRO. If LRO update is required it must be implemented with POST.

In rare instances where an operation may take a very long time to complete, e.g. longer than 15 minutes, it may be better to expose this as a first class resource of the API rather than as an operation on another resource.

There are two basic patterns for long-running operations in Azure. The first pattern is used for POST and DELETE operations that initiate the LRO. These return a 202 Accepted response with a JSON status monitor in the response body. The second pattern applies only in the case of a PUT operation to create a resource that also involves additional long-running processing. These patterns are described in the following two sections. For guidance on when to use a specific pattern, please refer to Considerations for Service Design, Long Running Operations.

POST or DELETE LRO pattern

A POST or DELETE long-running operation accepts a request from the client to initiate the operation processing and returns a status monitor that reports the operation's progress.

DO NOT use a long-running POST to create a resource -- use PUT as described below.

DO allow the client to pass an Operation-Id header with an ID for the operation's status monitor.

DO generate an ID (typically a GUID) for the status monitor if the Operation-Id header was not passed by the client.

DO fail a request with a 400-BadRequest if the Operation-Id header matches an existing operation unless the request is identical to the prior request (a retry scenario).

DO perform as much validation as practical when initiating the operation to alert clients of errors early.

DO return a 202-Accepted status code from the request that initiates an LRO if the processing of the operation was successfully initiated (except for "PUT with additional processing" type LRO).

⚠️ YOU SHOULD NOT return any other 2xx status code from the initial request of an LRO -- return 202-Accepted and a status monitor even if processing was completed before the initiating request returns.

DO return a status monitor in the response body as described in Obtaining status and results of long-running operations.

☑️ YOU SHOULD include an Operation-Location header in the response with the absolute URL of the status monitor for the operation.

☑️ YOU SHOULD include the api-version query parameter in the Operation-Location header with the same version passed on the initial request if it is required by the get operation on the status monitor.

☑️ YOU SHOULD allow any valid value of the api-version query parameter to be used in the get operation on the status monitor.
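The initiation rules for a POST or DELETE LRO can be sketched as a small in-memory handler. Everything named here is an assumption for illustration: the class `OperationService`, the `/operations/{id}` path of the status monitor, the non-terminal status value "NotStarted", and the shape of the error body.

```python
import uuid


class OperationService:
    """In-memory sketch of the POST/DELETE LRO initiation rules (hypothetical class)."""

    def __init__(self, base_url):
        self.base_url = base_url
        self.operations = {}  # operation id -> {"request": ..., "monitor": ...}

    def start(self, headers, body, api_version):
        # Use the client's Operation-Id if given; otherwise generate a GUID.
        operation_id = headers.get("Operation-Id") or str(uuid.uuid4())
        existing = self.operations.get(operation_id)
        if existing is not None and existing["request"] != body:
            # Same ID but a different request: this is not a retry.
            error = {"error": {"code": "BadRequest",
                               "message": "Operation-Id is already in use."}}
            return 400, {}, error
        if existing is None:
            monitor = {"id": operation_id, "status": "NotStarted"}  # assumed status value
            existing = {"request": body, "monitor": monitor}
            self.operations[operation_id] = existing
        location = (f"{self.base_url}/operations/{operation_id}"
                    f"?api-version={api_version}")
        # Always 202 with a status monitor, even for an identical retry.
        return 202, {"Operation-Location": location}, existing["monitor"]
```

Returning the same status monitor for an identical retry lets a client safely repeat the initiating request when it never received the first response.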

PUT operation with additional long-running processing

For a PUT (create or replace) with additional long-running processing:

DO allow the client to pass an Operation-Id header with an ID for the status monitor for the operation.

DO generate an ID (typically a GUID) for the status monitor if the Operation-Id header was not passed by the client.

DO fail a request with a 400-BadRequest if the Operation-Id header matches an existing operation unless the request is identical to the prior request (a retry scenario).

DO perform as much validation as practical when initiating the operation to alert clients of errors early.

DO return a 201-Created status code for create or 200-OK for replace from the initial request with a representation of the resource if the resource was created successfully.

DO include an Operation-Id header in the response with the ID of the status monitor for the operation.

DO include response headers with any additional values needed for a GET request to the status monitor (e.g. location).

☑️ YOU SHOULD include an Operation-Location header in the response with the absolute URL of the status monitor for the operation.

☑️ YOU SHOULD include the api-version query parameter in the Operation-Location header with the same version passed on the initial request if it is required by the get operation on the status monitor.

☑️ YOU SHOULD allow any valid value of the api-version query parameter to be used in the get operation on the status monitor.

Obtaining status and results of long-running operations

For all long-running operations, the client will issue a GET on a status monitor resource to obtain the current status of the operation.

DO support the GET method on the status monitor endpoint that returns a 200-OK response with the current state of the status monitor.

DO return a status monitor in the response body that conforms with the following structure:

OperationStatus : Object

| Property | Type | Required | Description |
|---|---|---|---|
| id | string | true | The unique id of the operation |
| status | string | true | enum that includes terminal values "Succeeded", "Failed", "Canceled" |
| error | ErrorDetail | | Error object that describes the error when status is "Failed" |
| result | object | | Only for POST action-type LRO, the results of the operation when completed successfully |
| additional properties | | | Additional named or dynamic properties of the operation |

DO include the id of the operation and any other values needed for the client to form a GET request to the status monitor (e.g. a location path parameter).

DO include a Retry-After header in the response to GET requests to the status monitor if the operation is not complete. The value of this header should be an integer number of seconds to wait before making the next request to the status monitor.

DO include the result property (if any) in the status monitor for a POST action-type long-running operation when the operation completes successfully.

DO NOT include a result property in the status monitor for a long-running operation that is not a POST action-type long-running operation.

DO retain the status monitor resource for some publicly documented period of time (at least 24 hours) after the operation completes.
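From the client's perspective, these rules lead to a simple polling loop: GET the status monitor, stop on a terminal status, otherwise wait for the number of seconds given in Retry-After. A minimal sketch, where `get_status` is a hypothetical callable that performs the GET and returns the response headers and parsed body, and the one-second fallback delay is an assumption:

```python
import time

TERMINAL_STATES = {"Succeeded", "Failed", "Canceled"}


def poll_until_done(get_status, sleep=time.sleep, default_delay=1):
    """Poll a status monitor until it reaches a terminal state and return it."""
    while True:
        headers, monitor = get_status()
        if monitor["status"] in TERMINAL_STATES:
            return monitor
        # Retry-After carries an integer number of seconds to wait.
        sleep(int(headers.get("Retry-After", default_delay)))


# Usage with canned responses instead of real HTTP calls.
responses = iter([
    ({"Retry-After": "2"}, {"id": "op1", "status": "Running"}),
    ({}, {"id": "op1", "status": "Succeeded", "result": {"count": 3}}),
])
delays = []
final = poll_until_done(lambda: next(responses), sleep=delays.append)
print(final["status"], delays)  # Succeeded [2]
```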

Bring your own Storage

When implementing your service, it is very common to store and retrieve data and files. When you encounter this scenario, avoid implementing your own storage strategy and instead use Azure Bring Your Own Storage (BYOS). BYOS provides significant benefits to service implementors, e.g. security, an aggressively optimized frontend, uptime, etc. While Azure Managed Storage may be easier to get started with, as your service evolves and matures, BYOS will provide the most flexibility and implementation choices. Further, when designing your APIs, be cognizant of expressing storage concepts and how clients will access your data. For example, if you are working with blobs, then you should not expose the concept of folders, nor do they have extensions.

DO use Azure Bring Your Own Storage.

DO use a blob prefix.

DO NOT require a fresh container per operation.

Authentication

How you secure and protect the data and files that your service uses will not only affect how consumable your API is, but also how quickly you can evolve and adapt it. Implementing Role Based Access Control (RBAC) is the recommended approach. It is important to recognize that any roles defined in RBAC essentially become part of your API contract. For example, changing a role's permissions, e.g. restricting access, could effectively cause existing clients to break, as they may no longer have access to necessary resources.

DO add RBAC roles, scoped to the exact permissions needed, for every service operation that requires accessing Storage.

DO ensure that RBAC roles are backward compatible, and specifically, do not take away permissions from a role that would break the operation of the service. Any change of RBAC roles that results in a change of the service behavior is considered a breaking change.

Handling 'downstream' errors

It is not uncommon to rely on other services, e.g. storage, when implementing your service. Inevitably, the services you depend on will fail. In these situations, you can include the downstream error code and text in the inner-error of the response body. This provides a consistent pattern for handling errors in the services you depend upon.

DO include errors from downstream services as the 'inner-error' section of the response body.

Working with files

Generally speaking, there are two patterns that you will encounter when working with files: single file access and file collections.

Single file access

Designing an API for accessing a single file is, depending on your scenario, relatively straightforward.

✔️ YOU MAY use a Shared Access Signature (SAS) to provide access to a single file. SAS is considered the minimum security for files and can be used in lieu of, or in addition to, RBAC.

☑️ YOU SHOULD, if using HTTP (not HTTPS), document to users that all information is sent over the wire in clear text.

☑️ YOU SHOULD support managed identity using Azure Storage by default (if using Azure services).

File Versioning

Depending on your requirements, there are scenarios where users of your service will require a specific version of a file. For example, you may need to keep track of configuration changes over time to be able to roll back to a previous state. In these scenarios, you will need to provide a mechanism for accessing a specific version.

DO enable the customer to provide an ETag to identify a specific version of a file.

File Collections

When your users need to work with multiple files, for example a document translation service, it will be important to provide them access to the collection, and its contents, in a consistent manner. Because there is no industry standard for working with containers, these guidelines will recommend that you leverage Azure Storage. Following the guidelines above, you also want to ensure that you don't expose file system constructs, e.g. folders, and instead use storage constructs, e.g. blob prefixes.

DO ensure, when using a Shared Access Signature (SAS), that it is assigned to the container and that the permissions apply to the content as well.

DO ensure, when using managed identity, that the customer has given the service the proper permissions to access the file container.

A common pattern when working with multiple files is for your service to receive requests that contain the location(s) of files to process, e.g. "input", and the location(s) to place any files that result from processing, e.g. "output." (Note: the terms "input" and "output" are just examples and terms more relevant to the service domain are more appropriate.)

For example, a request payload may look similar to the following:

{
    "input": {
        "location": "https://mycompany.blob.core.windows.net/documents/english/?<sas token>",
        "delimiter": "/"
    },
    "output": {
        "location": "https://mycompany.blob.core.windows.net/documents/spanglish/?<sas token>",
        "delimiter": "/"
    }
}

Note: How the service gets the request body is outside the purview of these guidelines.

Depending on the requirements of the service, there can be any number of "input" and "output" sections, including none. However, for each of the "input" sections the following apply:

DO include a JSON object that has string values for "location" and "delimiter."

DO use a URL to a blob prefix with a container scoped SAS on the end with a minimum of listing and read permissions.

For each of the "output" sections the following apply:

DO use a URL to a blob prefix with a container scoped SAS on the end with a minimum of write permissions.

Conditional Requests

When designing an API, you will almost certainly have to manage how your resource is updated. For example, if your resource is a bank account, you will want to ensure that one transaction--say depositing money--does not overwrite a previous transaction. Similarly, it could be very expensive to send a resource to a client. This could be because of its size, network conditions, or a myriad of other reasons. To enable this level of control, services should leverage an ETag header, or "entity tag," which will identify the 'version' or 'instance' of the resource a particular client is working with. An ETag is always set by the service and will enable you to conditionally control how your service responds to requests, enabling you to provide predictable updates and more efficient access.

☑️ YOU SHOULD return an ETag with any operation returning the resource or part of a resource or any update of the resource (whether the resource is returned or not).

☑️ YOU SHOULD use ETags consistently across your API, i.e. if you use an ETag, accept it on all other operations.

You can learn more about conditional requests by reading RFC7232.

Cache Control

One of the more common uses for ETag headers is cache control, also referred to as a "conditional GET." This is especially useful when resources are large in size, expensive to compute/calculate, or hard to reach (significant network latency). That is, using the value of the ETag, the server can determine if the resource has changed. If there are no changes, then there is no need to return the resource, as the client already has the most recent version.

Implementing this strategy is relatively straightforward. First, when a client requests the resource, return an ETag header with a value that uniquely identifies that specific instance (or version) of the resource. The Computing ETags section provides guidance on how to properly calculate the value of your ETag. The client can then send the ETag value in subsequent requests as part of the If-None-Match header. This tells the service to compare the ETag that came in with the request against the latest value that it has calculated. If the two values are the same, then it is not necessary to return the resource to the client, because it already has it. If they are different, then the service will return the latest version of the resource, along with the updated ETag value in the header.

☑️ YOU SHOULD implement conditional read strategies

When supporting conditional read strategies:

DO adhere to the following table for guidance:

| GET Request | Return code | Response |
|---|---|---|
| ETag value = If-None-Match value | 304-Not Modified | no additional information |
| ETag value != If-None-Match value | 200-OK | Response body includes the serialized value of the resource (typically JSON) |

For more control over caching, please refer to the cache-control HTTP header.
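The conditional read table translates into very little code. The sketch below assumes a hypothetical handler `conditional_get` that already knows the resource's current ETag, and it compares a single ETag value only (a full implementation would also handle lists of ETags and the `*` wildcard in If-None-Match).

```python
def conditional_get(resource_body, current_etag, request_headers):
    """Return (status, headers, body) for a GET that may carry If-None-Match."""
    if request_headers.get("If-None-Match") == current_etag:
        # The client already has this version: no body is returned.
        return 304, {"ETag": current_etag}, None
    return 200, {"ETag": current_etag}, resource_body


product = {"name": "Milk", "price": 2.55}
print(conditional_get(product, '"v7"', {"If-None-Match": '"v7"'})[0])  # 304
print(conditional_get(product, '"v7"', {"If-None-Match": '"v6"'})[0])  # 200
```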

Optimistic Concurrency

An ETag should also be used to reflect the create, update, and delete policies of your service. Specifically, you should avoid a "pessimistic" strategy where the "last write always wins." These can be expensive to build and scale because avoiding the "lost update" problem often requires sophisticated concurrency controls. Instead, implement an "optimistic concurrency" strategy, where the incoming state of the resource is first compared against what currently resides in the service. Optimistic concurrency strategies are implemented through the combination of ETags and the HTTP Request / Response Pattern.

⚠️ YOU SHOULD NOT implement pessimistic update strategies, e.g. last writer wins.

When supporting optimistic concurrency:

DO adhere to the following table for guidance:

| Operation | Header | Value | ETag check | Return code | Response |
|---|---|---|---|---|---|
| PATCH / PUT | If-None-Match | * | check for any version of the resource ('*' is a wildcard used to match anything); if none are found, create the resource | 200-OK or 201-Created | Response header MUST include the new ETag value. Response body SHOULD include the serialized value of the resource (typically JSON). |
| PATCH / PUT | If-None-Match | * | check for any version of the resource; if one is found, fail the operation | 412-Precondition Failed | Response body SHOULD return the serialized value of the resource (typically JSON) that was passed along with the request. |
| PATCH / PUT | If-Match | value of ETag | value of If-Match equals the latest ETag value on the server, confirming that the version of the resource is the most current | 200-OK or 201-Created | Response header MUST include the new ETag value. Response body SHOULD include the serialized value of the resource (typically JSON). |
| PATCH / PUT | If-Match | value of ETag | value of If-Match header DOES NOT equal the latest ETag value on the server, indicating a change has occurred since the client fetched the resource | 412-Precondition Failed | Response body SHOULD return the serialized value of the resource (typically JSON) that was passed along with the request. |
| DELETE | If-Match | value of ETag | value matches the latest value on the server | 204-No Content | Response body SHOULD be empty. |
| DELETE | If-Match | value of ETag | value does NOT match the latest value on the server | 412-Precondition Failed | Response body SHOULD be empty. |
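The table can be sketched as an in-memory store that checks the precondition headers before each write. The class `ResourceStore` and its version-counter ETags are assumptions made for illustration; treating an If-Match against a resource that does not exist as a failed precondition is also a choice of this sketch.

```python
class ResourceStore:
    """In-memory sketch of ETag-based optimistic concurrency (hypothetical class)."""

    def __init__(self):
        self.items = {}   # key -> (etag, body)
        self.version = 0

    def _next_etag(self):
        self.version += 1
        return f'"{self.version}"'

    def put(self, key, body, headers):
        current = self.items.get(key)
        if headers.get("If-None-Match") == "*" and current is not None:
            return 412, {}, body  # create-only request, but the resource exists
        if_match = headers.get("If-Match")
        if if_match is not None and (current is None or current[0] != if_match):
            return 412, {}, body  # the client's copy is stale
        etag = self._next_etag()
        self.items[key] = (etag, body)
        status = 200 if current is not None else 201
        return status, {"ETag": etag}, body

    def delete(self, key, headers):
        current = self.items.get(key)
        if_match = headers.get("If-Match")
        if if_match is not None and (current is None or current[0] != if_match):
            return 412, {}, None
        self.items.pop(key, None)
        return 204, {}, None


store = ResourceStore()
status, headers, _ = store.put("jeff", {"role": "dev"}, {"If-None-Match": "*"})
print(status, headers["ETag"])  # 201 "1"
```

A client that receives 412 is expected to fetch the latest version, reapply its change, and retry with the new ETag.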

Computing ETags

The strategy that you use to compute the ETag depends on its semantic. For example, it is natural, for resources that are inherently versioned, to use the version as the value of the ETag. Another common strategy for determining the value of an ETag is to use a hash of the resource. If a resource is not versioned, and unless computing a hash is prohibitively expensive, this is the preferred mechanism.

☑️ YOU SHOULD, if using a hash strategy, hash the entire resource.

✔️ YOU MAY use, or include, a timestamp in your resource schema. If you do this, the timestamp shouldn't be returned with more than subsecond precision, and it SHOULD be consistent with the data and format returned, e.g. consistent on milliseconds.

✔️ YOU MAY consider Weak ETags if you have a valid scenario for distinguishing between meaningful and cosmetic changes or if it is too expensive to compute a hash.
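A hash-based ETag over the entire resource can be sketched as below. The choice of SHA-256 and of canonical JSON (sorted keys, compact separators) is an assumption of this example, not a requirement of the guidelines; `compute_etag` is a hypothetical helper name.

```python
import hashlib
import json


def compute_etag(resource):
    """Return a strong ETag derived from a hash of the whole resource."""
    # Canonical serialization so that key order does not change the hash.
    canonical = json.dumps(resource, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f'"{digest}"'  # ETag values are quoted strings


a = compute_etag({"name": "Milk", "price": 2.55})
b = compute_etag({"price": 2.55, "name": "Milk"})
c = compute_etag({"name": "Milk", "price": 2.60})
print(a == b, a == c)  # True False
```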

Distributed Tracing & Telemetry

Azure SDK client guidelines specify that client libraries must send telemetry data through the User-Agent header, X-MS-UserAgent header, and Open Telemetry. Client libraries are required to send telemetry and distributed tracing information on every request. Telemetry information is vital to the effective operation of your service and should be a consideration from the outset of design and implementation efforts.

DO follow the Azure SDK client guidelines for supporting telemetry headers and Open Telemetry.

DO NOT reject a call because it contains custom headers you don't understand, and specifically, distributed tracing headers.

Additional References

Final thoughts

These guidelines describe the upfront design considerations, technology building blocks, and common patterns that Azure teams encounter when building an API for their service. There is a great deal of information in them that can be difficult to follow. Fortunately, at Microsoft, there is a team committed to ensuring your success.

The Azure REST API Stewardship board is a collection of dedicated architects that are passionate about helping Azure service teams build interfaces that are intuitive, maintainable, consistent, and most importantly, delight our customers. Because APIs affect nearly all downstream decisions, you are encouraged to reach out to the Stewardship board early in the development process. These architects will work with you to apply these guidelines and identify any hidden pitfalls in your design. For more information on how to partner with the Stewardship board, please refer to Considerations for Service Design.