From 15b09239b2e1fa4f475cbe3b830fa6c8eb320e87 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?J=C3=BAlia=20Rabello?= <77292838+julia-rabello@users.noreply.github.com> Date: Fri, 27 Sep 2024 16:24:57 -0300 Subject: [PATCH 1/6] create extracting-data-from-master-data-with-search-and-scroll.md --- ...from-master-data-with-search-and-scroll.md | 56 +++++++++++++++++++ 1 file changed, 56 insertions(+) create mode 100644 docs/localization/extracting-data-from-master-data-with-search-and-scroll.md diff --git a/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md b/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md new file mode 100644 index 0000000000..e41cb9e563 --- /dev/null +++ b/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md @@ -0,0 +1,56 @@ +--- +title: "Extracting data from Master Data with search and scroll" +slug: "extracting-data-from-master-data-with-search-and-scroll" +hidden: false +createdAt: "2024-09-27T10:00:00.000Z" +updatedAt: "2024-09-27T10:00:00.000Z" +--- + +In this guide, you will learn how to extract data from Master Data using the search and scroll endpoints, including when to use each route, how to optimize queries, and best practices. + +>ℹ In Master Data v1, you can export data directly from the interface. See [Exporting data from Master Data v1](https://help.vtex.com/en/tutorial/exporting-data--tutorials_1125) for more information. + +## Search + +The search route is ideal when you need to find a specific set of documents within your store. It is particularly useful for paginated queries, where you want to retrieve up to 10000 documents in small chunks over multiple requests. Each page is limited to 100 documents. + +>ℹ When paginating, the `_sort` parameter is recommended. The API by itself does not guarantee order, so without a defined `_sort`, documents may return duplicate or not return at the expected page. 
+ +See the Search endpoint reference depending on the Master Data version you are using: + +* [Master Data API v1 - Search](https://developers.vtex.com/docs/api-reference/masterdata-api#get-/api/dataentities/-acronym-/search) +* [Master Data API v2 - Search](https://developers.vtex.com/docs/api-reference/master-data-api-v2#get-/api/dataentities/-dataEntityName-/search) + +### Best practices + +When using the search endpoint, these best practices will help enhance your data retrieval process: + +* **Apply filters to narrow your search**: Improve performance by reducing the number of documents returned. This speeds up the query and ensures that your requests are more efficient. + +* **Use exact values for queries instead of wildcards (`*`):** Heavy usage of wildcards may be subject to temporary blocks. + +* **Avoid large datasets:** If you are querying many documents, break your query into smaller intervals. + +## Scroll + +The scroll route is designed for extensive data retrieval, especially when integrating Master Data with external systems. It is the best choice if you need to query the entire database or when dealing with over 10000 documents. + +See the Scroll endpoint reference depending on the Master Data version you are using: + +* [Master Data API v1 - Scroll](https://developers.vtex.com/docs/api-reference/masterdata-api#get-/api/dataentities/-acronym-/scroll) +* [Master Data API v2 - Scroll](https://developers.vtex.com/docs/api-reference/master-data-api-v2#get-/api/dataentities/-dataEntityName-/scroll) + +Your first scroll request will return a token in the `X-VTEX-MD-TOKEN` response header. Inform this value in the `_token` query parameter for your next requests until you receive an empty list, indicating that all documents have been retrieved. 
+ +### Scroll best practices + +To ensure efficient and reliable data retrieval, follow these strategies when using the scroll endpoint: + +* **Implement filters to divide the request into smaller batches,** reducing the likelihood of timeouts. For example, you might filter by creation date and process data month by month. Smaller batches are also easier to reprocess if a timeout occurs, making your operation more resilient. +* **Run up to 10 scrolls simultaneously per account.** Limiting the number of parallel scrolls helps prevent errors and timeouts. By using filters to create smaller batches and parallelizing these batches in a controlled manner, you can speed up data retrieval while reducing the risk of overloading the account. + +>⚠️ **Scroll behavior and limitations** +> +> * **Each scroll operation allows only one query for the duration of the token.** This means that you cannot change a scroll’s query by changing parameters after the first request: you can navigate pages of the original first request until the token expires, or initiate other scrolls (up to 10 simultaneously). +> * If Master Data stops receiving requests with the scroll `X-VTEX-MD-TOKEN` token, it will expire in **20 minutes**. After that, you can make new scroll requests, limited to 10 simultaneous scrolls. +> * The maximum number of documents per scroll request is **1000**. 
From 841b8ef472e89563fe009a63c53ace6beebc06f6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?J=C3=BAlia=20Rabello?= <77292838+julia-rabello@users.noreply.github.com> Date: Fri, 27 Sep 2024 16:29:35 -0300 Subject: [PATCH 2/6] add callouts and see more --- ...ing-data-from-master-data-with-search-and-scroll.md | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md b/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md index e41cb9e563..8fdcc6b3e4 100644 --- a/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md +++ b/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md @@ -4,6 +4,8 @@ slug: "extracting-data-from-master-data-with-search-and-scroll" hidden: false createdAt: "2024-09-27T10:00:00.000Z" updatedAt: "2024-09-27T10:00:00.000Z" +seeAlso: + - "/docs/guides/pagination-in-the-master-data-api" --- In this guide, you will learn how to extract data from Master Data using the search and scroll endpoints, including when to use each route, how to optimize queries, and best practices. @@ -14,7 +16,7 @@ In this guide, you will learn how to extract data from Master Data using the sea The search route is ideal when you need to find a specific set of documents within your store. It is particularly useful for paginated queries, where you want to retrieve up to 10000 documents in small chunks over multiple requests. Each page is limited to 100 documents. ->ℹ When paginating, the `_sort` parameter is recommended. The API by itself does not guarantee order, so without a defined `_sort`, documents may return duplicate or not return at the expected page. +>ℹ When paginating, the `_sort` parameter is recommended. The API does not guarantee a specific order by default; therefore, omitting the `_sort` parameter may lead to duplicate documents or return unexpected pages. 
See the Search endpoint reference depending on the Master Data version you are using: @@ -26,11 +28,11 @@ See the Search endpoint reference depending on the Master Data version you are u When using the search endpoint, these best practices will help enhance your data retrieval process: * **Apply filters to narrow your search**: Improve performance by reducing the number of documents returned. This speeds up the query and ensures that your requests are more efficient. - * **Use exact values for queries instead of wildcards (`*`):** Heavy usage of wildcards may be subject to temporary blocks. - * **Avoid large datasets:** If you are querying many documents, break your query into smaller intervals. +>ℹ Learn more about [Search pagination in the Master Data API](https://developers.vtex.com/docs/guides/pagination-in-the-master-data-api#search-pagination). + ## Scroll The scroll route is designed for extensive data retrieval, especially when integrating Master Data with external systems. It is the best choice if you need to query the entire database or when dealing with over 10000 documents. @@ -42,6 +44,8 @@ See the Scroll endpoint reference depending on the Master Data version you are u Your first scroll request will return a token in the `X-VTEX-MD-TOKEN` response header. Inform this value in the `_token` query parameter for your next requests until you receive an empty list, indicating that all documents have been retrieved. +>ℹ Learn more about [Scroll pagination in the Master Data API](https://developers.vtex.com/docs/guides/pagination-in-the-master-data-api#scroll-pagination). 
+ ### Scroll best practices To ensure efficient and reliable data retrieval, follow these strategies when using the scroll endpoint: From ae36a6a44da5738d3052e8a8e733027e3dcca094 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?J=C3=BAlia=20Rabello?= <77292838+julia-rabello@users.noreply.github.com> Date: Mon, 30 Sep 2024 14:38:40 -0300 Subject: [PATCH 3/6] add excerpt --- .../extracting-data-from-master-data-with-search-and-scroll.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md b/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md index 8fdcc6b3e4..bf74b27591 100644 --- a/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md +++ b/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md @@ -4,6 +4,7 @@ slug: "extracting-data-from-master-data-with-search-and-scroll" hidden: false createdAt: "2024-09-27T10:00:00.000Z" updatedAt: "2024-09-27T10:00:00.000Z" +excerpt: "Learn how to extract data from Master Data using the search and scroll endpoints, with best practices for optimizing queries and handling large datasets." seeAlso: - "/docs/guides/pagination-in-the-master-data-api" --- From 960a921980b7a09e5e922856de64ae91e77163b3 Mon Sep 17 00:00:00 2001 From: "George B. 
de Lima" <106821144+GeorgeLimaDev@users.noreply.github.com> Date: Thu, 17 Oct 2024 03:49:02 -0900 Subject: [PATCH 4/6] New translations extracting-data-from-master-data-with-search-and-scroll.md (English, United States) --- ...from-master-data-with-search-and-scroll.md | 30 +++++++++---------- 1 file changed, 15 insertions(+), 15 deletions(-) diff --git a/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md b/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md index bf74b27591..b2ed879f18 100644 --- a/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md +++ b/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md @@ -4,9 +4,9 @@ slug: "extracting-data-from-master-data-with-search-and-scroll" hidden: false createdAt: "2024-09-27T10:00:00.000Z" updatedAt: "2024-09-27T10:00:00.000Z" -excerpt: "Learn how to extract data from Master Data using the search and scroll endpoints, with best practices for optimizing queries and handling large datasets." +excerpt: "Learn how to extract data from Master Data using the search and scroll endpoints and best practices for optimizing queries and handling large datasets." seeAlso: - - "/docs/guides/pagination-in-the-master-data-api" +- "/docs/guides/pagination-in-the-master-data-api" --- In this guide, you will learn how to extract data from Master Data using the search and scroll endpoints, including when to use each route, how to optimize queries, and best practices. @@ -15,18 +15,18 @@ In this guide, you will learn how to extract data from Master Data using the sea ## Search -The search route is ideal when you need to find a specific set of documents within your store. It is particularly useful for paginated queries, where you want to retrieve up to 10000 documents in small chunks over multiple requests. Each page is limited to 100 documents. +The search route is ideal when you need to find a specific set of documents within your store. 
It is particularly useful for paginated queries, where you want to retrieve up to 10,000 documents in small chunks over multiple requests and each page is limited to 100 documents. ->ℹ When paginating, the `_sort` parameter is recommended. The API does not guarantee a specific order by default; therefore, omitting the `_sort` parameter may lead to duplicate documents or return unexpected pages. +>ℹ When paginating, the `_sort` parameter is recommended. The API does not guarantee a specific order by default. Therefore, omitting the `_sort` parameter may lead to duplicate documents or return unexpected pages. -See the Search endpoint reference depending on the Master Data version you are using: +Check the search endpoint reference for your Master Data version: -* [Master Data API v1 - Search](https://developers.vtex.com/docs/api-reference/masterdata-api#get-/api/dataentities/-acronym-/search) +* [Master Data API v1 - Search](https://developers.vtex.com/docs/api-reference/masterdata-api#get-/api/dataentities/-acronym-/search) * [Master Data API v2 - Search](https://developers.vtex.com/docs/api-reference/master-data-api-v2#get-/api/dataentities/-dataEntityName-/search) ### Best practices -When using the search endpoint, these best practices will help enhance your data retrieval process: +When using the search endpoint, these best practices will help optimize your data retrieval process: * **Apply filters to narrow your search**: Improve performance by reducing the number of documents returned. This speeds up the query and ensures that your requests are more efficient. * **Use exact values for queries instead of wildcards (`*`):** Heavy usage of wildcards may be subject to temporary blocks. @@ -36,14 +36,14 @@ When using the search endpoint, these best practices will help enhance your data ## Scroll -The scroll route is designed for extensive data retrieval, especially when integrating Master Data with external systems. 
It is the best choice if you need to query the entire database or when dealing with over 10000 documents. +The scroll route is designed for extensive data retrieval, especially when integrating Master Data with external systems. It is the best choice if you need to query the entire database or deal with over 10,000 documents. -See the Scroll endpoint reference depending on the Master Data version you are using: +Check the scroll endpoint reference for your Master Data version: -* [Master Data API v1 - Scroll](https://developers.vtex.com/docs/api-reference/masterdata-api#get-/api/dataentities/-acronym-/scroll) +* [Master Data API v1 - Scroll](https://developers.vtex.com/docs/api-reference/masterdata-api#get-/api/dataentities/-acronym-/scroll) * [Master Data API v2 - Scroll](https://developers.vtex.com/docs/api-reference/master-data-api-v2#get-/api/dataentities/-dataEntityName-/scroll) -Your first scroll request will return a token in the `X-VTEX-MD-TOKEN` response header. Inform this value in the `_token` query parameter for your next requests until you receive an empty list, indicating that all documents have been retrieved. +Your first scroll request will return a token in the `X-VTEX-MD-TOKEN` response header. Submit this value in the `_token` query parameter for your next requests until you receive an empty list, which indicates that all documents have been retrieved. >ℹ Learn more about [Scroll pagination in the Master Data API](https://developers.vtex.com/docs/guides/pagination-in-the-master-data-api#scroll-pagination). @@ -51,11 +51,11 @@ Your first scroll request will return a token in the `X-VTEX-MD-TOKEN` response To ensure efficient and reliable data retrieval, follow these strategies when using the scroll endpoint: -* **Implement filters to divide the request into smaller batches,** reducing the likelihood of timeouts. For example, you might filter by creation date and process data month by month. 
Smaller batches are also easier to reprocess if a timeout occurs, making your operation more resilient. -* **Run up to 10 scrolls simultaneously per account.** Limiting the number of parallel scrolls helps prevent errors and timeouts. By using filters to create smaller batches and parallelizing these batches in a controlled manner, you can speed up data retrieval while reducing the risk of overloading the account. +* **Implement filters to divide the request into smaller batches:** This reduces the likelihood of timeouts. For example, you might filter by created date and process data month by month. Smaller batches are also easier to reprocess if a timeout occurs, making your operation more resilient. +* **Run up to 10 scrolls simultaneously per account:** Limiting the number of parallel scrolls helps prevent errors and timeouts. By using filters to create smaller batches and parallelizing these batches in a controlled manner, you can speed up data retrieval while reducing the risk of overloading the account. >⚠️ **Scroll behavior and limitations** > -> * **Each scroll operation allows only one query for the duration of the token.** This means that you cannot change a scroll’s query by changing parameters after the first request: you can navigate pages of the original first request until the token expires, or initiate other scrolls (up to 10 simultaneously). +> * **Each scroll operation allows only one query for the duration of the token.** This means that you cannot change a scroll’s query by changing parameters after the first request. You can navigate pages of the original first request until the token expires, or initiate other scrolls (up to 10 simultaneously). > * If Master Data stops receiving requests with the scroll `X-VTEX-MD-TOKEN` token, it will expire in **20 minutes**. After that, you can make new scroll requests, limited to 10 simultaneous scrolls. -> * The maximum number of documents per scroll request is **1000**. 
+> * The maximum number of documents per scroll request is **1,000**. From 80a72efddefe7ddd2ce27015f00528c17ac24d76 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?J=C3=BAlia=20Rabello?= <77292838+julia-rabello@users.noreply.github.com> Date: Thu, 17 Oct 2024 10:38:48 -0300 Subject: [PATCH 5/6] add removed space --- .../extracting-data-from-master-data-with-search-and-scroll.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md b/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md index b2ed879f18..8210c40f0c 100644 --- a/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md +++ b/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md @@ -6,7 +6,7 @@ createdAt: "2024-09-27T10:00:00.000Z" updatedAt: "2024-09-27T10:00:00.000Z" excerpt: "Learn how to extract data from Master Data using the search and scroll endpoints and best practices for optimizing queries and handling large datasets." seeAlso: -- "/docs/guides/pagination-in-the-master-data-api" + - "/docs/guides/pagination-in-the-master-data-api" --- In this guide, you will learn how to extract data from Master Data using the search and scroll endpoints, including when to use each route, how to optimize queries, and best practices. 
From fb80dfe464869ac1a75f356a9e71b492bd928b23 Mon Sep 17 00:00:00 2001 From: julia-rabello <77292838+julia-rabello@users.noreply.github.com> Date: Thu, 17 Oct 2024 10:44:24 -0300 Subject: [PATCH 6/6] move file --- .../extracting-data-from-master-data-with-search-and-scroll.md | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename docs/{localization => guides/Master-Data/guides}/extracting-data-from-master-data-with-search-and-scroll.md (100%) diff --git a/docs/localization/extracting-data-from-master-data-with-search-and-scroll.md b/docs/guides/Master-Data/guides/extracting-data-from-master-data-with-search-and-scroll.md similarity index 100% rename from docs/localization/extracting-data-from-master-data-with-search-and-scroll.md rename to docs/guides/Master-Data/guides/extracting-data-from-master-data-with-search-and-scroll.md
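
The scroll flow documented by this series (read the token from the `X-VTEX-MD-TOKEN` response header, send it back in the `_token` query parameter, stop on an empty list) can be sketched as below. This is a minimal sketch and not part of the patches. The names `scroll_all`, `FetchPage`, and `make_fake_fetch` are hypothetical, and the HTTP transport is deliberately left to the caller, since the host, authentication headers, and filter parameters depend on the account and Master Data version.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# Hypothetical callable type: performs ONE scroll request.
# Input: the token from the previous response (None on the first request).
# Output: (documents in this page, token read from the X-VTEX-MD-TOKEN header).
FetchPage = Callable[[Optional[str]], Tuple[List[dict], Optional[str]]]


def scroll_all(fetch_page: FetchPage) -> Iterator[dict]:
    """Yield every document of one scroll, page by page.

    The query itself (filters, fields, page size) must stay fixed inside
    fetch_page: a scroll allows only one query for the duration of its token.
    """
    token: Optional[str] = None
    while True:
        documents, new_token = fetch_page(token)
        if not documents:
            # An empty list indicates that all documents have been retrieved.
            return
        yield from documents
        # Keep the previous token if a response does not carry a new one
        # (an assumption made for this sketch, not documented behavior).
        token = new_token or token


def make_fake_fetch(pages: List[List[dict]]):
    """Build an in-memory stand-in for the scroll endpoint (illustration only)."""
    calls: List[Optional[str]] = []

    def fetch(token: Optional[str]) -> Tuple[List[dict], Optional[str]]:
        calls.append(token)
        index = len(calls) - 1
        documents = pages[index] if index < len(pages) else []
        return documents, "fake-token"

    return fetch, calls


# Usage with the fake endpoint: two pages, then an empty list ends the scroll.
fetch, calls = make_fake_fetch([[{"id": 1}, {"id": 2}], [{"id": 3}]])
for document in scroll_all(fetch):
    print(document["id"])
```

Passing the request function in as a parameter keeps the loop independent of any HTTP client, so the stop condition and token handling can be checked without network access. In a real integration, `fetch_page` would issue the GET request to the scroll route and return the parsed JSON list together with the header value.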