Commit
Fixed all remaining broken links (Unstructured-IO#49)
MKhalusova authored May 28, 2024
1 parent 7b2a591 commit 0033a95
Showing 6 changed files with 16 additions and 16 deletions.
2 changes: 1 addition & 1 deletion faq/faq.mdx
@@ -52,7 +52,7 @@ curl -X 'POST' \
-F 'languages=kor' \
| jq -C . | less -R
```
- For comprehensive language support, refer to the [Tesseract documentation]("https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html"), which provides a detailed list of supported languages and installation guidelines.
+ For comprehensive language support, refer to the [Tesseract documentation](https://tesseract-ocr.github.io/tessdoc/Data-Files-in-different-versions.html), which provides a detailed list of supported languages and installation guidelines.

<Note>You can still use ``ocr_languages`` kwarg, but this parameter is being deprecated in favor of ``languages`` kwarg.</Note>

2 changes: 1 addition & 1 deletion open-source/core-functionality/staging.mdx
@@ -4,7 +4,7 @@ title: Staging

<Warning>

- The `Staging` brick is being deprecated in favor of the new and more comprehensive `Destination Connectors`. To explore the complete list and usage, please refer to [Destination Connectors documentation](/ingest/destination-connectors/overview).
+ The `Staging` brick is being deprecated in favor of the new and more comprehensive `Destination Connectors`. To explore the complete list and usage, please refer to [Destination Connectors documentation](../ingest/destination-connectors/overview).

Note: We are constantly expanding our collection of destination connectors. If you wish to request a specific Destination Connector, you’re encouraged to submit a Feature Request on the [Unstructured GitHub repository](https://github.com/Unstructured-IO/unstructured/issues/new/choose).
</Warning>
6 changes: 3 additions & 3 deletions open-source/introduction/overview.mdx
@@ -19,7 +19,7 @@ sidebarTitle: Overview
* and more!
</Note>

- The [`unstructured` library]((https://github.com/Unstructured-IO/unstructured)) offers an open-source toolkit
+ The [`unstructured` library](https://github.com/Unstructured-IO/unstructured) offers an open-source toolkit
designed to simplify the ingestion and pre-processing of diverse data formats, including images and text-based documents
such as PDFs, HTML files, Word documents, and more. With a focus on optimizing data workflows for Large Language Models (LLMs),
`unstructured` provides modular functions and connectors that work seamlessly together. This cohesive system ensures
@@ -28,7 +28,7 @@ and use cases.

## Key functionality

- * **Precise Document Extraction**: Unstructured offers advanced capabilities in extracting elements and metadata from documents. This includes a variety of document element types and metadata. Learn more about [Document elements and metadata](/open-source/introduction/document-elements).
+ * **Precise Document Extraction**: Unstructured offers advanced capabilities in extracting elements and metadata from documents. This includes a variety of document element types and metadata. Learn more about [Document elements and metadata](../concepts/document-elements).

* **Extensive File Support**: The platform supports a wide array of file types, ensuring versatility in handling different document formats from PDF, Images, HTML, and many more. Detailed information on supported file types can be found [here](/api-reference/api-services/overview#supported-file-types).

@@ -46,7 +46,7 @@ and use cases.

* [Embedding](/open-source/core-functionality/embedding): The embedding encoder classes in Unstructured leverage document elements detected through partitioning or grouped via chunking to obtain embeddings for each element. This is particularly useful for applications like Retrieval Augmented Generation (RAG), where precise and contextually relevant embeddings are crucial.

- * **High-performant Connectors**: The platform includes optimized connectors for efficient data ingestion and output. These comprise [Source Connectors](/ingest/destination-connectors/overview) for data input and [Destination Connectors](/ingest/destination-connectors/overview) for data export.
+ * **High-performant Connectors**: The platform includes optimized connectors for efficient data ingestion and output. These comprise [Source Connectors](../ingest/source-connectors/overview) for data input and [Destination Connectors](../ingest/destination-connectors/overview) for data export.


## Common Use Cases
2 changes: 1 addition & 1 deletion open-source/introduction/quick-start.mdx
@@ -57,7 +57,7 @@ The following section will cover basic concepts and usage patterns in `unstructu

The example documents in this section come from the [example-docs](https://github.com/Unstructured-IO/unstructured/tree/main/example-docs) directory in the `unstructured` repo.

- Before running the code in this make sure you’ve installed the `unstructured` library and all dependencies using the instructions in the [Quick Start](/installation/overview#quick-start) section.
+ Before running the code in this make sure you’ve installed the `unstructured` library and all dependencies using the instructions in the [Quick Start](../installation/overview#quick-start) section.

## Partitioning a document

12 changes: 6 additions & 6 deletions platform/overview.mdx
@@ -20,7 +20,7 @@ To **get your data RAG-ready** our platform moves it through the following proce
```
<Steps>
<Step title="Connect">
- We offer multiple [Source Connectors](/content/text). We can connect to your data in its existing location.
+ We offer multiple [Source Connectors](../platform/platform-source-connectors/overview). We can connect to your data in its existing location.
</Step>
<Step title="Route">
**Routing** determines which strategy we will employ in **transforming your document to our canonical JSON schema**. There are three [Partioning Strategies](/api-reference/api-services/partitioning "partioning strategies") for document transformation, ```fast```, ```hires```, or ```ocr_only```. ```fast``` is great for when there is extractable text available, like in HTML files or in the Microsoft Office Document format. ```hires``` is best for PDFs and tables and where accurate classification of document elements is critical. ```ocr_only``` is useful when dealing with image-based files or PDFs that do not have extractable text. **If you're unsure, select ```auto``` and we'll handle the decision for you**.
@@ -35,7 +35,7 @@ To **get your data RAG-ready** our platform moves it through the following proce
Call out to third party embedding providers, ```Open AI```, ```AWS Bedrock```, and ```Octo ML```.
</Step>
<Step title="Persist">
- We have multiple [Destination Connectors](/content/text). Including **all major vector databases**.
+ We have multiple [Destination Connectors](../platform/platform-destination-connectors/overview). Including **all major vector databases**.
</Step>
</Steps>

@@ -53,10 +53,10 @@ To simplify this process and provide it as a no-code solution, platform consist
Workflow[3. Workflow] --> Jobs[4. Jobs]
```

- 1. [Source Connectors](platform-source-connectors/overview) to ingest your data.
- 2. [Destination Connectors](platform-destination-connectors/overview) tell our system where to write your transformed data too..
- 3. [Workflows](workflows-automation) connect sources to destinations and provide chunking, embedding, and scheduling options.
- 4. [Jobs](jobs-scheduling) allow you to monitor data transformation progress.
+ 1. [Source Connectors](../platform/platform-source-connectors/overview) to ingest your data.
+ 2. [Destination Connectors](../platform/platform-destination-connectors/overview) tell our system where to write your transformed data too..
+ 3. [Workflows](../platform/workflows-automation) connect sources to destinations and provide chunking, embedding, and scheduling options.
+ 4. [Jobs](../platform/jobs-scheduling) allow you to monitor data transformation progress.


### Compliance
8 changes: 4 additions & 4 deletions platform/saas-platform-guide.mdx
@@ -10,16 +10,16 @@ This page describes how to get started with the SaaS Unstructured Platform. Lear
You can [sign-up here](https://unstructured.io/platform) to get started.
</Step>
<Step title="Configure a Source Connector">
- Configure a ```Source Connector``` to [Amazon S3](platform-source-connectors/s3) (or a [source of your choice](platform-source-connectors/overview)).
+ Configure a ```Source Connector``` to [Amazon S3](../platform/platform-source-connectors/s3) (or a [source of your choice](../platform/platform-source-connectors/overview)).
</Step>
<Step title="Configure a Destination Connector">
- Configure a ```Destination Connector``` to [Mongo DB Atlas Vector DB](platform-destination-connectors/mongodb) (or a [destination of your choice](platform-destination-connectors/overview)).
+ Configure a ```Destination Connector``` to [Mongo DB Atlas Vector DB](../platform/platform-destination-connectors/mongodb) (or a [destination of your choice](../platform/platform-destination-connectors/overview)).
</Step>
<Step title="Create a Workflow">
- Connect your ```Source Connector``` and ```Destination Connector``` together, and set options for ```chunking```, ```embedding```, and ```scheduling``` in [Workflows](workflows-automation).
+ Connect your ```Source Connector``` and ```Destination Connector``` together, and set options for ```chunking```, ```embedding```, and ```scheduling``` in [Workflows](../platform/workflows-automation).
</Step>
<Step title="Monitor your Job">
- ```Jobs``` provide a location for determining [how your workflow is performing](jobs-scheduling).
+ ```Jobs``` provide a location for determining [how your workflow is performing](../platform/jobs-scheduling).
</Step>
<Step title="Done">
Your unstructured **data will continuously flow** into your destination as and when new files/updates become available.
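
Two of the fixes in this commit correct malformed link targets rather than wrong paths: a URL wrapped in quotes (faq/faq.mdx) and a URL wrapped in a second pair of parentheses (open-source/introduction/overview.mdx). A minimal Python sketch of how such targets might be detected follows. The regular expression, the `find_malformed_links` helper, and the reason labels are illustrative assumptions, not part of this commit or of any Unstructured tooling, and the pattern only covers simple inline links.

```python
import re

# Hypothetical pattern: matches simple inline Markdown links, [label](target).
# The target is captured up to the first closing parenthesis.
LINK_RE = re.compile(r"\[([^\]]*)\]\(([^)]*)\)")


def find_malformed_links(text):
    """Return (label, target, reason) tuples for suspicious link targets.

    Illustrative helper only: it flags the two malformed shapes fixed in
    this commit, a quoted target and a target with an extra parenthesis.
    """
    problems = []
    for match in LINK_RE.finditer(text):
        label, target = match.group(1), match.group(2)
        if target.startswith(('"', "'")):
            problems.append((label, target, "quoted target"))
        elif target.startswith("("):
            problems.append((label, target, "extra parenthesis"))
    return problems
```

Calling `find_malformed_links` on the text of an `.mdx` file returns an empty list when no link target starts with a quote or an opening parenthesis. It does not check whether a well-formed path actually resolves, which is what most of the other changes in this commit address.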