docs: fix all broken links and enforce we dont add new ones (#251)
B4nan authored Nov 7, 2023
1 parent 8ac8a2c commit fe2b517
Showing 40 changed files with 116 additions and 90 deletions.
36 changes: 31 additions & 5 deletions website/docusaurus.config.js
@@ -21,9 +21,9 @@ module.exports = {
projectName: 'apify-sdk-js',
favicon: 'img/favicon.svg',
onBrokenLinks:
/** @type {import('@docusaurus/types').ReportingSeverity} */ ('warn'),
/** @type {import('@docusaurus/types').ReportingSeverity} */ ('throw'),
onBrokenMarkdownLinks:
/** @type {import('@docusaurus/types').ReportingSeverity} */ ('warn'),
/** @type {import('@docusaurus/types').ReportingSeverity} */ ('throw'),
themes: [
[
'@apify/docs-theme',
@@ -68,11 +68,35 @@ module.exports = {
'className': 'navbar__item', // fixes margin around dropdown - hackish, should be fixed in theme
'data-api-links': JSON.stringify([
'reference/next',
-...versions.map((version, i) => (i === 0 ? 'reference' : `reference/${version}`)),
+...versions.map((version, i) => {
+    if (i === 0) {
+        return 'reference';
+    }
+
+    if (+version < 3) {
+        return `docs/${version}/api/apify`;
+    }
+
+    return `reference/${version}`;
+}),
]),
'dropdownItemsBefore': [],
'dropdownItemsAfter': [],
},
+// {
+//     type: 'docsVersionDropdown',
+//     position: 'left',
+//     dropdownItemsAfter: [
+//         {
+//             href: 'https://sdk.apify.com/docs/guides/getting-started',
+//             label: '2.2',
+//         },
+//         {
+//             href: 'https://sdk.apify.com/docs/1.3.1/guides/getting-started',
+//             label: '1.3',
+//         },
+//     ],
+// },
],
},
},
@@ -84,8 +108,10 @@ module.exports = {
/** @type {import('@docusaurus/preset-classic').Options} */
({
docs: {
-showLastUpdateAuthor: true,
-showLastUpdateTime: true,
+// Docusaurus shows the author and date of last commit to entire repo, which doesn't make sense,
+// so let's just disable showing author and last modification
+showLastUpdateAuthor: false,
+showLastUpdateTime: false,
path: '../docs',
sidebarPath: './sidebars.js',
rehypePlugins: [externalLinkProcessor],
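For illustration, this is roughly what the new mapping produces. The `versions` array below is hypothetical (the real values presumably come from the site's versions.json); only the shape of the output matters:

```javascript
// Hypothetical versions array; the real values presumably come from versions.json.
const versions = ['3.1', '3.0', '2.3', '1.3'];

const apiLinks = [
    'reference/next',
    ...versions.map((version, i) => {
        if (i === 0) {
            return 'reference'; // the latest version lives under the unversioned path
        }
        if (+version < 3) {
            return `docs/${version}/api/apify`; // pre-3.0 docs keep the old /docs/<version>/api/apify layout
        }
        return `reference/${version}`;
    }),
];

console.log(apiLinks);
// -> ['reference/next', 'reference', 'reference/3.0', 'docs/2.3/api/apify', 'docs/1.3/api/apify']
```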
8 changes: 6 additions & 2 deletions website/src/components/ApiLink.jsx
@@ -8,9 +8,13 @@ const ApiLink = ({ to, children }) => {
const { version, isLast } = useDocsVersion();
const { siteConfig } = useDocusaurusContext();

+if (to.toString().startsWith('apify/')) {
+    to = to.toString().substring('apify/'.length);
+}
+
if (siteConfig.presets[0][1].docs.disableVersioning) {
return (
-<Link to={`/api/${to}`}>{children}</Link>
+<Link to={`/reference/${to}`}>{children}</Link>
);
}

@@ -23,7 +27,7 @@ const ApiLink = ({ to, children }) => {
}

return (
-<Link to={`/api/${versionSlug}${to}`}>{children}</Link>
+<Link to={`/reference/${versionSlug}${to}`}>{children}</Link>
);
};

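As a rough sketch of how the component resolves links after this change, the logic can be read as a plain function (the prop values and version slugs below are made up for illustration):

```javascript
// Illustration only: versionSlug would normally be derived from useDocsVersion().
const resolveApiHref = (to, versionSlug) => {
    // Links written as `apify/foo` are normalized to `foo` before prefixing.
    if (to.toString().startsWith('apify/')) {
        to = to.toString().substring('apify/'.length);
    }
    // All API links now point at /reference/... instead of /api/...
    return `/reference/${versionSlug}${to}`;
};

console.log(resolveApiHref('apify/cheerio-crawler', '')); // -> /reference/cheerio-crawler
console.log(resolveApiHref('apify/cheerio-crawler', '2.3/')); // -> /reference/2.3/cheerio-crawler
```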
@@ -67,8 +67,7 @@ export default function DocsVersionDropdownNavbarItem({
? translate({
id: 'theme.navbar.mobileVersionsDropdown.label',
message: 'Versions',
-description:
-    'The label for the navbar versions dropdown on mobile view',
+description: 'The label for the navbar versions dropdown on mobile view',
})
: dropdownVersion.label;
let dropdownTo = mobile && items.length > 1
4 changes: 2 additions & 2 deletions website/versioned_docs/version-1.3/examples/basic_crawler.md
@@ -5,11 +5,11 @@ id: basic-crawler
---

This is the most bare-bones example of the Apify SDK, which demonstrates some of its building blocks such as the
-[`BasicCrawler`](/docs/api/basic-crawler). You probably don't need to go this deep though, and it would be better to start with one of the full
+[`BasicCrawler`](/docs/1.3/api/basic-crawler). You probably don't need to go this deep though, and it would be better to start with one of the full
featured crawlers like [`CheerioCrawler`](https://sdk.apify.com/docs/examples/cheerio-crawler) or
[`PlaywrightCrawler`](https://sdk.apify.com/docs/examples/playwright-crawler).

-The script simply downloads several web pages with plain HTTP requests using the [`Apify.utils.requestAsBrowser()`](/docs/api/utils#requestasbrowser)
+The script simply downloads several web pages with plain HTTP requests using the [`Apify.utils.requestAsBrowser()`](/docs/1.3/api/utils#requestasbrowser)
convenience function and stores their raw HTML and URL in the default dataset. In local configuration, the data will be stored as JSON files in
`./apify_storage/datasets/default`.
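The example listing itself is collapsed in this view; a minimal sketch of the flow described above, assuming SDK v1 and placeholder URLs, might look like this:

```javascript
const Apify = require('apify');

Apify.main(async () => {
    // Placeholder start URLs; the original example uses its own list.
    const requestList = await Apify.openRequestList('start-urls', [
        'https://www.apify.com',
        'https://www.example.com',
    ]);

    const crawler = new Apify.BasicCrawler({
        requestList,
        // BasicCrawler gives you the bare request; fetching is up to you.
        handleRequestFunction: async ({ request }) => {
            const { body } = await Apify.utils.requestAsBrowser({ url: request.url });
            await Apify.pushData({ url: request.url, html: body });
        },
    });

    await crawler.run();
});
```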

4 changes: 2 additions & 2 deletions website/versioned_docs/version-1.3/examples/call_actor.md
@@ -4,8 +4,8 @@ title: Call actor
id: call-actor
---

-This example demonstrates how to start an Apify actor using [`Apify.call()`](/docs/api/apify#call) and how to call the Apify API using
-[`Apify.client`](/docs/api/apify#client). The script gets a random weird word and its explanation from [randomword.com](https://randomword.com/) and
+This example demonstrates how to start an Apify actor using [`Apify.call()`](/docs/1.3/api/apify#call) and how to call the Apify API using
+[`Apify.client`](/docs/1.3/api/apify#client). The script gets a random weird word and its explanation from [randomword.com](https://randomword.com/) and
sends it to your email using the [`apify/send-mail`](https://apify.com/apify/send-mail) actor.

To make the example work, you'll need an [Apify account](https://my.apify.com/). Go to the
@@ -4,7 +4,7 @@ title: Cheerio crawler
id: cheerio-crawler
---

-This example demonstrates how to use [`CheerioCrawler`](/docs/api/cheerio-crawler) to crawl a list of URLs from an external file, load each URL using
+This example demonstrates how to use [`CheerioCrawler`](/docs/1.3/api/cheerio-crawler) to crawl a list of URLs from an external file, load each URL using
a plain HTTP request, parse the HTML using the [Cheerio library](https://www.npmjs.com/package/cheerio) and extract some data from it: the page title
and all `h1` tags.

@@ -4,7 +4,7 @@ title: Crawl a website with relative links
id: crawl-relative-links
---

-If a website uses relative links, [`CheerioCrawler`](/docs/api/cheerio-crawler) and `Apify.enqueueLinks()` may have trouble following them. This is
+If a website uses relative links, [`CheerioCrawler`](/docs/1.3/api/cheerio-crawler) and `Apify.enqueueLinks()` may have trouble following them. This is
why it is important to set the `baseUrl` property within `Apify.enqueueLinks()` to `request.loadedUrl`:

```javascript
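The original snippet is collapsed in this view; a minimal sketch of the technique, assuming SDK v1 and a placeholder start URL:

```javascript
const Apify = require('apify');

Apify.main(async () => {
    const requestQueue = await Apify.openRequestQueue();
    await requestQueue.addRequest({ url: 'https://www.example.com' }); // placeholder start URL

    const crawler = new Apify.CheerioCrawler({
        requestQueue,
        handlePageFunction: async ({ request, $ }) => {
            await Apify.utils.enqueueLinks({
                $,
                requestQueue,
                // Resolve relative hrefs against the URL the page was actually loaded from.
                baseUrl: request.loadedUrl,
            });
        },
    });

    await crawler.run();
});
```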
@@ -4,7 +4,7 @@ title: Crawl a single URL
id: crawl-single-url
---

-This example uses the [`Apify.utils.requestAsBrowser()`](/docs/api/utils#utilsrequestasbrowseroptions) function to grab the HTML of a web page.
+This example uses the [`Apify.utils.requestAsBrowser()`](/docs/1.3/api/utils#utilsrequestasbrowseroptions) function to grab the HTML of a web page.

```javascript
const Apify = require('apify');
@@ -4,7 +4,7 @@ title: Crawl some links on a website
id: crawl-some-links
---

-This [`CheerioCrawler`](/docs/api/cheerio-crawler) example uses the [`pseudoUrls`](/docs/api/pseudo-url) property in the `Apify.enqueueLinks()` method
+This [`CheerioCrawler`](/docs/1.3/api/cheerio-crawler) example uses the [`pseudoUrls`](/docs/1.3/api/pseudo-url) property in the `Apify.enqueueLinks()` method
to only add links to the `RequestList` queue if they match the specified regular expression.

```javascript
4 changes: 2 additions & 2 deletions website/versioned_docs/version-1.3/examples/forms.md
@@ -4,10 +4,10 @@ title: Forms
id: forms
---

-This example demonstrates how to use [`PuppeteerCrawler`](/docs/api/puppeteer-crawler) to automatically fill and submit a search form to look up
+This example demonstrates how to use [`PuppeteerCrawler`](/docs/1.3/api/puppeteer-crawler) to automatically fill and submit a search form to look up
repositories on [GitHub](https://github.com) using headless Chrome / Puppeteer. The actor first fills in the search term, repository owner, start date
and language of the repository, then submits the form and prints out the results. Finally, the results are saved either on the Apify platform to the
-default [`dataset`](/docs/api/dataset) or on the local machine as JSON files in `./apify_storage/datasets/default`.
+default [`dataset`](/docs/1.3/api/dataset) or on the local machine as JSON files in `./apify_storage/datasets/default`.

> To run this example on the Apify Platform, select the `apify/actor-node-puppeteer-chrome` image for your Dockerfile.
14 changes: 7 additions & 7 deletions website/versioned_docs/version-1.3/examples/map_and_reduce.md
@@ -4,9 +4,9 @@ title: Dataset Map and Reduce methods
id: map-and-reduce
---

-This example shows an easy use-case of the [Apify dataset](https://docs.apify.com/storage/dataset) [`map`](/docs/api/dataset#map) and
-[`reduce`](/docs/api/dataset#reduce) methods. Both methods can be used to simplify the dataset results workflow process. Both can be called on the
-[dataset](/docs/api/dataset) directly.
+This example shows an easy use-case of the [Apify dataset](https://docs.apify.com/storage/dataset) [`map`](/docs/1.3/api/dataset#map) and
+[`reduce`](/docs/1.3/api/dataset#reduce) methods. Both methods can be used to simplify the dataset results workflow process. Both can be called on the
+[dataset](/docs/1.3/api/dataset) directly.

Important to mention is that both methods return a new result (`map` returns a new array and `reduce` can return any type) - neither method updates
the dataset in any way.
Expand All @@ -15,7 +15,7 @@ Examples for both methods are demonstrated on a simple dataset containing the re
`h1` - `h3` header elements under the `headingCount` key.

This data structure is stored in the default dataset under `{PROJECT_FOLDER}/apify_storage/datasets/default/`. If you want to simulate the
-functionality, you can use the [`dataset.PushData()`](/docs/api/dataset#pushdata) method to save the example `JSON array` to your dataset.
+functionality, you can use the [`dataset.PushData()`](/docs/1.3/api/dataset#pushdata) method to save the example `JSON array` to your dataset.

```json
[
@@ -58,7 +58,7 @@ Apify.main(async () => {

The `moreThan5headers` variable is an array of `headingCount` attributes where the number of headers is greater than 5.

-The `map` method's result value saved to the [`key-value store`](/docs/api/key-value-store) should be:
+The `map` method's result value saved to the [`key-value store`](/docs/1.3/api/key-value-store) should be:

```javascript
[11, 8];
@@ -67,7 +67,7 @@ The `map` method's result value saved to the [`key-value store`](/docs/api/key-v
### Reduce

The dataset `reduce` method does not produce a new array of values - it reduces a list of values down to a single value. The method iterates through
-the items in the dataset using the [`memo` argument](/docs/api/dataset#datasetreduceiteratee-memo-options). After performing the necessary
+the items in the dataset using the [`memo` argument](/docs/1.3/api/dataset#datasetreduceiteratee-memo-options). After performing the necessary
calculation, the `memo` is sent to the next iteration, while the item just processed is reduced (removed).

Using the `reduce` method to get the total number of headers scraped (all items in the dataset):
@@ -93,7 +93,7 @@ Apify.main(async () => {
The original dataset will be reduced to a single value, `pagesHeadingCount`, which contains the count of all headers for all scraped pages (all
dataset items).

-The `reduce` method's result value saved to the [key-value store](/docs/api/key-value-store) should be:
+The `reduce` method's result value saved to the [key-value store](/docs/1.3/api/key-value-store) should be:

```javascript
23;
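The full listings are collapsed in this diff; a compact sketch of both calls, assuming each dataset item carries a numeric `headingCount` field as described above:

```javascript
const Apify = require('apify');

Apify.main(async () => {
    const dataset = await Apify.openDataset();

    // map: collect headingCount values, then keep only pages with more than 5 headers.
    const headingCounts = await dataset.map((item) => item.headingCount);
    const moreThan5headers = headingCounts.filter((count) => count > 5);
    await Apify.setValue('pages-with-more-than-5-headers', moreThan5headers);

    // reduce: fold all items down to a single total count of headers.
    const pagesHeadingCount = await dataset.reduce((memo, item) => memo + item.headingCount, 0);
    await Apify.setValue('pages-heading-count', pagesHeadingCount);
});
```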
@@ -4,7 +4,7 @@ title: Puppeteer crawler
id: puppeteer-crawler
---

-This example demonstrates how to use [`PuppeteerCrawler`](/docs/api/puppeteer-crawler) in combination with [`RequestQueue`](/docs/api/request-queue)
+This example demonstrates how to use [`PuppeteerCrawler`](/docs/1.3/api/puppeteer-crawler) in combination with [`RequestQueue`](/docs/1.3/api/request-queue)
to recursively scrape the [Hacker News website](https://news.ycombinator.com) using headless Chrome / Puppeteer. The crawler starts with a single URL,
finds links to next pages, enqueues them and continues until no more desired links are available. The results are stored to the default dataset. In
local configuration, the results are stored as JSON files in `./apify_storage/datasets/default`
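The listing is collapsed here; a rough sketch of the pattern, assuming SDK v1 (the start URL and link selector are illustrative, not taken from the original example):

```javascript
const Apify = require('apify');

Apify.main(async () => {
    const requestQueue = await Apify.openRequestQueue();
    await requestQueue.addRequest({ url: 'https://news.ycombinator.com/' });

    const crawler = new Apify.PuppeteerCrawler({
        requestQueue,
        handlePageFunction: async ({ request, page }) => {
            // Store some data from the page...
            await Apify.pushData({ url: request.url, title: await page.title() });
            // ...and enqueue links to the next pages until none are left.
            await Apify.utils.enqueueLinks({
                page,
                requestQueue,
                selector: 'a.morelink', // illustrative selector for the "More" link
            });
        },
    });

    await crawler.run();
});
```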
@@ -4,7 +4,7 @@ title: Puppeteer recursive crawl
id: puppeteer-recursive-crawl
---

-Run the following example to perform a recursive crawl of a website using [`PuppeteerCrawler`](/docs/api/puppeteer-crawler).
+Run the following example to perform a recursive crawl of a website using [`PuppeteerCrawler`](/docs/1.3/api/puppeteer-crawler).

> To run this example on the Apify Platform, select the `apify/actor-node-puppeteer-chrome` image for your Dockerfile.
@@ -4,7 +4,7 @@ title: Puppeteer sitemap
id: puppeteer-sitemap
---

-This example demonstrates how to use [`PuppeteerCrawler`](/docs/api/puppeteer-crawler) to crawl a list of web pages specified in a sitemap. The
+This example demonstrates how to use [`PuppeteerCrawler`](/docs/1.3/api/puppeteer-crawler) to crawl a list of web pages specified in a sitemap. The
crawler extracts the page title and URL from each page and stores them as a record in the default dataset. In local configuration, the results are
stored as JSON files in `./apify_storage/datasets/default`.

4 changes: 2 additions & 2 deletions website/versioned_docs/version-1.3/examples/screenshots.md
@@ -4,8 +4,8 @@ title: Screenshots
id: screenshots
---

-This example demonstrates how to read and write data to the default key-value store using [`Apify.getValue()`](/docs/api/apify#apifygetvaluekey) and
-[`Apify.setValue()`](/docs/api/apify#apifysetvaluekey-value-options).
+This example demonstrates how to read and write data to the default key-value store using [`Apify.getValue()`](/docs/1.3/api/apify#apifygetvaluekey) and
+[`Apify.setValue()`](/docs/1.3/api/apify#apifysetvaluekey-value-options).

The script crawls a list of URLs using Puppeteer, captures a screenshot of each page, and saves it to the store. The list of URLs is provided as actor
input that is also read from the store.
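The full script is collapsed in this view; a minimal sketch of the read/write pattern, assuming SDK v1 and a made-up input shape of `{ "urls": [...] }`:

```javascript
const Apify = require('apify');

Apify.main(async () => {
    // Actor input is read from the default key-value store under the 'INPUT' key.
    const { urls } = await Apify.getValue('INPUT');

    const browser = await Apify.launchPuppeteer();
    for (const url of urls) {
        const page = await browser.newPage();
        await page.goto(url);
        const screenshot = await page.screenshot();
        // Key-value store keys allow a limited character set, so sanitize the URL.
        const key = url.replace(/[^a-zA-Z0-9!\-_.'()]/g, '-');
        await Apify.setValue(key, screenshot, { contentType: 'image/png' });
        await page.close();
    }
    await browser.close();
});
```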
2 changes: 1 addition & 1 deletion website/versioned_docs/version-1.3/guides/quick_start.md
@@ -51,7 +51,7 @@ Apify.main(async () => {
});
```

-> To read more about what pseudo-URL is, check the [getting-started](getting_started#introduction-to-pseudo-urls).
+> To read more about what pseudo-URL is, check the [getting-started](./getting_started.md#introduction-to-pseudo-urls).
When you run the example, you should see Apify SDK automating a Chrome browser.

6 changes: 3 additions & 3 deletions website/versioned_docs/version-1.3/guides/request_storage.md
@@ -21,7 +21,7 @@ The request queue is a storage of URLs to crawl. The queue is used for the deep

Each actor run is associated with a **default request queue**, which is created exclusively for the actor run. Typically, it is used to store URLs to crawl in the specific actor run. Its usage is optional.

-In Apify SDK, the request queue is represented by the [`RequestQueue`](/docs/api/request-queue) class.
+In Apify SDK, the request queue is represented by the [`RequestQueue`](/docs/1.3/api/request-queue) class.

In local configuration, the request queue is emulated by [@apify/storage-local](https://github.com/apify/apify-storage-local-js) NPM package and its data is stored in SQLite database in the directory specified by the `APIFY_LOCAL_STORAGE_DIR` environment variable as follows:

@@ -55,11 +55,11 @@ To see more detailed example of how to use the request queue with a crawler, see

## Request list

-The request list is not a storage per se - it represents the list of URLs to crawl that is stored in a run memory (or optionally in default [Key-Value Store](../guides/results-storage#key-value-store) associated with the run, if specified). The list is used for the crawling of a large number of URLs, when you know all the URLs which should be visited by the crawler and no URLs would be added during the run. The URLs can be provided either in code or parsed from a text file hosted on the web.
+The request list is not a storage per se - it represents the list of URLs to crawl that is stored in a run memory (or optionally in default [Key-Value Store](./result_storage.md#key-value-store) associated with the run, if specified). The list is used for the crawling of a large number of URLs, when you know all the URLs which should be visited by the crawler and no URLs would be added during the run. The URLs can be provided either in code or parsed from a text file hosted on the web.

Request list is created exclusively for the actor run and only if its usage is explicitly specified in the code. Its usage is optional.

-In Apify SDK, the request list is represented by the [`RequestList`](/docs/api/request-list) class.
+In Apify SDK, the request list is represented by the [`RequestList`](/docs/1.3/api/request-list) class.

The following code demonstrates basic operations of the request list:

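The snippet itself is collapsed in this view; a minimal sketch of typical request list operations in SDK v1, with placeholder URLs:

```javascript
const Apify = require('apify');

Apify.main(async () => {
    // Sources can be listed inline or fetched from a remote text file via { requestsFromUrl: '...' }.
    const requestList = await Apify.openRequestList('my-list', [
        { url: 'https://www.apify.com' },
        { url: 'https://www.example.com' },
    ]);

    // Fetch and process requests until the list is exhausted.
    let request = await requestList.fetchNextRequest();
    while (request) {
        console.log(`Processing ${request.url}...`);
        await requestList.markRequestHandled(request);
        request = await requestList.fetchNextRequest();
    }
});
```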
@@ -11,7 +11,7 @@ User-function used in the `Dataset.forEach()` API.
**Parameters**:

- **`item`**: `object` - Current [`Dataset`](../api/dataset) entry being processed.
-- **`index`**: `number` - Position of current {Dataset} entry.
+- **`index`**: `number` - Position of current [`Dataset`](../api/dataset) entry.

**Returns**:

@@ -11,7 +11,7 @@ User-function used in the `Dataset.map()` API.
**Parameters**:

- **`item`**: `object` - Currect [`Dataset`](../api/dataset) entry being processed.
-- **`index`**: `number` - Position of current {Dataset} entry.
+- **`index`**: `number` - Position of current [`Dataset`](../api/dataset) entry.

**Returns**:

@@ -12,7 +12,7 @@ User-function used in the `Dataset.reduce()` API.

- **`memo`**: `object` - Previous state of the reduction.
- **`item`**: `object` - Currect [`Dataset`](../api/dataset) entry being processed.
-- **`index`**: `number` - Position of current {Dataset} entry.
+- **`index`**: `number` - Position of current [`Dataset`](../api/dataset) entry.

**Returns**:

2 changes: 1 addition & 1 deletion website/versioned_docs/version-1.3/typedefs/KeyConsumer.md
@@ -10,7 +10,7 @@ User-function used in the [`KeyValueStore.forEachKey()`](../api/key-value-store#

**Parameters**:

-- **`key`**: `string` - Current {KeyValue} key being processed.
+- **`key`**: `string` - Current key being processed.
- **`index`**: `number` - Position of the current key in [`KeyValueStore`](../api/key-value-store).
- **`info`**: `*` - Information about the current [`KeyValueStore`](../api/key-value-store) entry.
- **`size`**: `number` - Size of the value associated with the current key in bytes.
@@ -6,7 +6,7 @@ id: request-transform

<a name="requesttransform"></a>

-Takes an Apify {RequestOptions} object and changes it's attributes in a desired way. This user-function is used
+Takes an Apify [`RequestOptions`](./request-options) object and changes it's attributes in a desired way. This user-function is used
[`utils.enqueueLinks()`](../api/utils#enqueuelinks) to modify requests before enqueuing them.

**Parameters**: