Commit

fix broken links
TC-MO committed Feb 22, 2024
1 parent 22a88b2 commit 4af0ea2
Showing 9 changed files with 14 additions and 14 deletions.
4 changes: 2 additions & 2 deletions .github/styles/Apify/Capitalization.yml
@@ -3,8 +3,8 @@ message: "The word '%s' should always be capitalized."
ignorecase: false
level: error
tokens:
- - '\bactor\b'
- - '\bactors\b'
+ - '(?<!\W)\bactor\b'
+ - '(?<!\W)\bactors\b'
- '(?<!@)\bapify\b(?!-\w+)'
- '(?<!\()\bhttps?://[^\s]*\bapify\b[^\s]*\b(?!\))|(?<!\[)\bhttps?://[^\s]*\bapify\b[^\s]*\b(?!\])'
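
A rough sketch of how the old and new `actor` tokens differ, using Python's `re` module as a stand-in. Vale compiles its tokens with its own regex engine and applies them per text scope, so the behaviour there may differ; the sample strings are hypothetical and not taken from the Apify docs.

```python
import re

# Patterns copied from the diff above: the old token and its replacement.
OLD_TOKEN = re.compile(r"\bactor\b")
NEW_TOKEN = re.compile(r"(?<!\W)\bactor\b")

# Hypothetical sample strings used only for illustration.
samples = [
    "located in the .actor folder",
    "run an actor on the platform",
    "actor runs can be scheduled",
]

for text in samples:
    old_hit = bool(OLD_TOKEN.search(text))
    new_hit = bool(NEW_TOKEN.search(text))
    print(f"{text!r}: old={old_hit}, new={new_hit}")
```

In this sketch the lookbehind `(?<!\W)` rejects any match preceded by a non-word character. That covers the `.` in `.actor`, but also a plain space, so under Python's semantics only the occurrence at the very start of the string still matches.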

2 changes: 1 addition & 1 deletion sources/academy/glossary/concepts/http_cookies.md
@@ -19,4 +19,4 @@ HTTP cookies are small pieces of data sent by the server to the user's web brows
2. To make the website show location-specific data (works for websites where you could set a zip code or country directly on the page, but unfortunately doesn't work for some location-based ads).
3. To make the website less suspicious of the crawler and let the crawler's traffic blend in with regular user traffic.

- For local testing, we recommend using the [**EditThisCookie**](https://chrome.google.com/webstore/detail/editthiscookie/fngmhnnpilhplaeedifhccceomclgfbg?hl=en) Chrome extension.
+ For local testing, we recommend using the [**EditThisCookie**](https://chromewebstore.google.com/detail/editthiscookie/fngmhnnpilhplaeedifhccceomclgfbg) Chrome extension.
2 changes: 1 addition & 1 deletion sources/academy/glossary/tools/modheader.md
@@ -13,7 +13,7 @@ slug: /tools/modheader

If you read about [Postman](./postman.md), you might remember that you can use it to modify request headers before sending a request. This is great, but the main problem is that Postman can only make static requests - meaning, it is unable to load JavaScript or any [dynamic content](../concepts/dynamic_pages.md).

- [ModHeader](https://chrome.google.com/webstore/detail/modheader/idgpnmonknjnojddfkpgkljpfnnfcklj?hl=en) is a Chrome extension which can be used to modify the HTTP headers of the requests you make with your browser. This means that, for example, if your scraper using a headless browser Puppeteer is being blocked due to an improper **User-Agent** header, you can use ModHeader to test the target website and quickly solve the issue.
+ [ModHeader](https://chromewebstore.google.com/detail/modheader-modify-http-hea/idgpnmonknjnojddfkpgkljpfnnfcklj) is a Chrome extension which can be used to modify the HTTP headers of the requests you make with your browser. This means that, for example, if your scraper using a headless browser Puppeteer is being blocked due to an improper **User-Agent** header, you can use ModHeader to test the target website and quickly solve the issue.

Check warnings on line 16 in sources/academy/glossary/tools/modheader.md (GitHub Actions / vale):
  • [write-good.Passive] 'be used' may be passive voice. Use active voice if you can. (column 146)
  • [write-good.TooWordy] 'modify' is too wordy. (column 157)
  • [write-good.Passive] 'being blocked' may be passive voice. Use active voice if you can. (column 309)
  • [Microsoft.Terms] Prefer 'personal digital assistant' over 'Agent'. (column 349)
  • [write-good.Weasel] 'quickly' is a weasel word! (column 418)
  • [Microsoft.Adverbs] Consider removing 'quickly'. (column 418)

## The ModHeader interface {#interface}

2 changes: 1 addition & 1 deletion sources/academy/webscraping/puppeteer_playwright/page/interacting_with_a_page.md
@@ -55,7 +55,7 @@ await page.click('button + button');

With `page.click()`, Puppeteer and Playwright actually drag the mouse and click, allowing the bot to act more human-like. This is different from programmatically clicking with `Element.click()` in vanilla client-side JavaScript.

- Notice that in the Playwright example, we are using a different selector than in the Puppeteer example. This is because Playwright supports [many custom CSS selectors](https://playwright.dev/docs/selectors#text-selector), such as the **has-text** pseudo class. As a rule of thumb, using text selectors is much more preferable to using regular selectors, as they are much less likely to break. If Google makes the sibling above the **I agree** button a `<div>` element instead of a `<button>` element, our `button + button` selector will break. However, the button will always have the text **I agree**; therefore, `button:has-text("I agree")` is more reliable.
+ Notice that in the Playwright example, we are using a different selector than in the Puppeteer example. This is because Playwright supports [many custom CSS selectors](https://playwright.dev/docs/other-locators#css-matching-by-text), such as the **has-text** pseudo class. As a rule of thumb, using text selectors is much more preferable to using regular selectors, as they are much less likely to break. If Google makes the sibling above the **I agree** button a `<div>` element instead of a `<button>` element, our `button + button` selector will break. However, the button will always have the text **I agree**; therefore, `button:has-text("I agree")` is more reliable.

Check warnings on line 58 in sources/academy/webscraping/puppeteer_playwright/page/interacting_with_a_page.md (GitHub Actions / vale):
  • [Microsoft.FirstPerson] Use first person (such as ' I ') sparingly. (column 1)
  • [write-good.Weasel] 'many' is a weasel word! (column 142)
  • [write-good.Weasel] 'likely' is a weasel word! (column 389)
  • [write-good.TooWordy] 'However' is too wordy. (column 557)
  • [write-good.TooWordy] 'therefore' is too wordy. (column 616)

> If you're not already familiar with CSS selectors and how to find them, we recommend referring to [this lesson](../../web_scraping_for_beginners/data_extraction/using_devtools.md) in the **Web scraping for beginners** course.
8 changes: 4 additions & 4 deletions sources/platform/actors/development/actor_definition/output_schema.md
@@ -17,7 +17,7 @@ In the future, output schema will also help with strict output data format valid

An Actor's output schema defines the structure and both API and visual representation of data produced by an Actor. Output configuration files have to be located in the `.actor` folder in the Actor's root directory.

- ## How to organize files in the .actor folder: two options
+ ## How to organize files in the .actor folder: Two options

**A)** all config options are being set in a **.actor/actor.json** file, e.g.:

@@ -175,7 +175,7 @@ Let's say we are going to use a single file to set up an Actor's output tab UI.
}
```

- The template above defines the configuration for the default dataset output view. Under the **views** property, there is one view with the title **Overview**. The view configuration consists of two basic steps: 1) set up how to fetch the data, aka **transformation,** and 2) set up how to **display** the data fetched in step 1). The default behaviour is that the Output tab UI table will display **all the fields** from `transformation.fields` **in that same order**. So, theoretically, there should be no need to set up `[**display.properties**](http://display.properties)` at all. However, it can be customized in case it is visually worth setting up some specific display format or column labels. The customization is carried out by using one of the `transformation.fields` names inside `display.properties` and overriding either the label or the format, as demonstrated in the basic template above.
+ The template above defines the configuration for the default dataset output view. Under the **views** property, there is one view with the title **Overview**. The view configuration consists of two basic steps: 1) set up how to fetch the data, aka **transformation,** and 2) set up how to **display** the data fetched in step 1). The default behaviour is that the Output tab UI table will display **all the fields** from `transformation.fields` **in that same order**. Theoretically, there should be no need to set up `display.properties` at all. However, it can be customized in case it is visually worth setting up some specific display format or column labels. The customization is carried out by using one of the `transformation.fields` names inside `display.properties` and overriding either the label or the format, as demonstrated in the basic template above.

Check warnings on line 178 in sources/platform/actors/development/actor_definition/output_schema.md (GitHub Actions / vale):
  • [write-good.TooWordy] 'However' is too wordy. (column 548)
  • [write-good.Passive] 'be customized' may be passive voice. Use active voice if you can. (column 564)
  • [write-good.TooWordy] 'it is' is too wordy. (column 586)
  • [write-good.Passive] 'is carried' may be passive voice. Use active voice if you can. (column 683)

A 2-step configuration (transform & display) was implemented to provide a way to fetch data in the format presented in both API and UI consistently. Consistency between API data and UI data is crucial for Actor end-users for them to experience the same results in both API and UI. Thus for the best end-user experience, we recommend overriding as few display properties as possible.

@@ -272,7 +272,7 @@ Example of an Actor output UI generated using basic template:

### Nested structures

- The most frequently used data formats present the data in a tabular format (Output tab table, Excel, CSV). In case an Actor produces nested JSON structures, there is a need to transform the nested data into a flat tabular format. There are three ways to flatten the data:
+ The most frequently used data formats present the data in a tabular format (Output tab table, Excel, CSV). In case an Actor produces nested JSON structures, there is a need to transform the nested data into a flat tabular format. You can flatten the data in following ways:

**1)** use `transformation.flatten` to flatten the nested structure of specified fields. Flatten transforms the nested object into a flat structure. e.g. with `flatten:["foo"]`, the object `{"foo": {"bar": "hello"}}` is turned into `{"foo.bar": "hello"}`. Once the structure is flattened, it is necessary to use the flattened property name in both `transformation.fields` and [`display.properties`](http://display.properties), otherwise, fields might not be fetched or configured properly in the UI visualization.
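
The flattening described in option 1 can be pictured with a small Python sketch. It is an illustration only: `flatten_fields` is a hypothetical helper, not part of any Apify SDK, and recursing through every nesting level is an assumption, since the text above only shows a single level.

```python
from typing import Any


def flatten_fields(item: dict[str, Any], fields: list[str]) -> dict[str, Any]:
    """Flatten nested dicts under the listed top-level keys into dotted keys."""
    flat: dict[str, Any] = {}

    def walk(prefix: str, value: Any) -> None:
        # Recurse into non-empty dicts; store everything else under the dotted key.
        if isinstance(value, dict) and value:
            for key, child in value.items():
                walk(f"{prefix}.{key}", child)
        else:
            flat[prefix] = value

    for key, value in item.items():
        if key in fields:
            walk(key, value)
        else:
            flat[key] = value  # fields not listed are copied unchanged

    return flat


# Mirrors the example in the text: flatten ["foo"] on {"foo": {"bar": "hello"}}.
print(flatten_fields({"foo": {"bar": "hello"}}, ["foo"]))
```

The resulting key, `foo.bar`, is the name that would then appear in both `transformation.fields` and `display.properties`.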

@@ -287,7 +287,7 @@ The most frequently used data formats present the data in a tabular format (Outp
| Property | Type | Required | Description |
| ------------------ | ---------------------------- | -------- | -------------------------------------------------------------------------------------------------- |
| actorSpecification | integer | true | Specifies the version of dataset schema <br/>structure document. <br/>Currently only version 1 is available. |
- | fields | JSONSchema compatible object | true | Schema of one dataset object. <br/>Use JsonSchema Draft 2020-12 or <br/>other compatible formats. |
+ | fields | JSONSchema compatible object | true | Schema of one dataset object. <br/>Use JsonSchema Draft 202012 or <br/>other compatible formats. |
| views | DatasetView object | true | An object with a description of an API <br/>and UI views. |

### DatasetView object definition
2 changes: 1 addition & 1 deletion sources/platform/integrations/webhooks/ad_hoc_webhooks.md
@@ -74,7 +74,7 @@ async def main():
</TabItem>
</Tabs>

- To learn more, see the [JavaScript SDK documentation](/sdk/js/api/apify/class/Actor#addWebhook) or the [Python SDK documentation](/sdk/python/api/apify/class/Actor#add_webhook).
+ To learn more, see the [JavaScript SDK documentation](/sdk/js/reference/class/Actor#addWebhook) or the [Python SDK documentation](/sdk/python/api/apify/class/Actor#add_webhook).

To ensure that duplicate ad-hoc webhooks won't get created in the case of Actor restart, you can use the idempotency key parameter. The idempotency key must be unique across all the webhooks of a user so that only one webhook gets created for a given value. You can use, for example, the Actor run ID as an idempotency key:

4 changes: 2 additions & 2 deletions sources/platform/storage/dataset.md
@@ -47,7 +47,7 @@ To view or download a dataset:
2. Select the format & configure other options if desired in **Export dataset** section.
3. Click **Download**.

- Utilize the **Actions** menu to modify the dataset's name, which also affects its [retention period](./index#data-retention-data-retention), and to adjust [access rights](../collaboration/index.md). The **API** button allows you to explore and test the dataset's [API endpoints](/api/v2#/reference/datasets).
+ Utilize the **Actions** menu to modify the dataset's name, which also affects its [retention period](./usage.md#data-retention-data-retention), and to adjust [access rights](../collaboration/index.md). The **API** button allows you to explore and test the dataset's [API endpoints](/api/v2#/reference/datasets).

Check warnings on line 50 in sources/platform/storage/dataset.md (GitHub Actions / vale):
  • [write-good.TooWordy] 'Utilize' is too wordy. (column 1)
  • [write-good.TooWordy] 'modify' is too wordy. (column 33)

![Datasets detail view](./images/datasets-detail.png)

@@ -178,7 +178,7 @@ async def main():
hotel_and_cafe_data = await dataset.get_data(fields=['hotel', 'cafe'])
```

- For more information, visit our [Python SDK documentation](/sdk/python/docs/guides/result-storage#dataset) and the `Dataset` class's [API reference](/sdk/python/reference/class/Dataset) for details on managing datasets with the Python SDK.
+ For more information, visit our [Python SDK documentation](/sdk/python/docs/concepts/storages#working-with-datasets) and the `Dataset` class's [API reference](/sdk/python/reference/class/Dataset) for details on managing datasets with the Python SDK.

### JavaScript API client {#javascript-api-client}

2 changes: 1 addition & 1 deletion sources/platform/storage/key_value_store.md
@@ -30,7 +30,7 @@ You can access key-value stores through several methods

* [Apify Console](https://console.apify.com/storage?tab=keyValueStores) - provides an easy-to-understand interface.
* [JavaScript SDK](/sdk/js/docs/guides/result-storage#key-value-store) - when building your own JavaScript Actor.
- * [Python SDK](sdk/python/docs/concepts/storages#working-with-key-value-stores) - when building your own Python Actor.
+ * [Python SDK](/sdk/python/docs/concepts/storages#working-with-key-value-stores) - when building your own Python Actor.
* [JavaScript API client](/api/client/js/reference/class/KeyValueStoreClient) - to access your key-value stores from any Node.js application.
* [Python API client](/api/client/python/reference/class/KeyValueStoreClient) - to access your key-value stores from any Python application.
* [Apify API](/api/v2#/reference/key-value-stores/get-items) - for accessing your key-value stores programmatically.
2 changes: 1 addition & 1 deletion sources/platform/storage/request_queue.md
@@ -179,7 +179,7 @@ async def main():
await queue.drop()
```

- Check out the [Python SDK documentation](/sdk/python/docs/guides/request-storage#request-queue) and the `RequestQueue` class's [API reference](/sdk/python/reference/class/RequestQueue) for details on managing your request queues with the Python SDK.
+ Check out the [Python SDK documentation](/sdk/python/docs/concepts/storages#working-with-request-queues) and the `RequestQueue` class's [API reference](/sdk/python/reference/class/RequestQueue) for details on managing your request queues with the Python SDK.

### JavaScript API client {#javascript-api-client}
