Fix capitalization in file titles (#1341)
burnash authored May 8, 2024
1 parent 756be4e commit e48da74
Showing 15 changed files with 20 additions and 30 deletions.
2 changes: 1 addition & 1 deletion docs/examples/qdrant_zendesk/qdrant_zendesk.py
@@ -1,6 +1,6 @@
"""
---
-title: Similarity Searching with Qdrant
+title: Similarity searching with Qdrant
description: Learn how to use the dlt source, Zendesk and dlt destination, Qdrant to conduct a similarity search on your tickets data.
keywords: [similarity search, example]
---
2 changes: 1 addition & 1 deletion docs/website/docs/build-a-pipeline-tutorial.md
@@ -1,5 +1,5 @@
---
-title: Pipeline Tutorial
+title: Pipeline tutorial
description: Build a data pipeline with dlt from scratch
keywords: [getting started, quick start, basics]
---
2 changes: 1 addition & 1 deletion docs/website/docs/dlt-ecosystem/file-formats/csv.md
@@ -4,7 +4,7 @@ description: The csv file format
keywords: [csv, file formats]
---

-# CSV File Format
+# CSV file format

**csv** is the most basic file format to store tabular data, where all the values are strings and are separated by a delimiter (typically comma).
`dlt` uses it for specific use cases - mostly for performance and compatibility reasons.
2 changes: 1 addition & 1 deletion docs/website/docs/dlt-ecosystem/file-formats/parquet.md
@@ -4,7 +4,7 @@ description: The parquet file format
keywords: [parquet, file formats]
---

-# Parquet File Format
+# Parquet file format

[Apache Parquet](https://en.wikipedia.org/wiki/Apache_Parquet) is a free and open-source column-oriented data storage format in the Apache Hadoop ecosystem. `dlt` is capable of storing data in this format when configured to do so.

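As an aside on the two file-format pages retitled above: a minimal sketch of how a load file format is typically requested in dlt, assuming a recent dlt release with the duckdb and parquet extras installed; the pipeline, dataset, and table names are invented. The parquet page says dlt stores data in this format "when configured to do so", and passing `loader_file_format` to `run()` is one documented way to do that.

```py
import dlt

# A tiny illustrative dataset; any iterable of dicts (or a dlt resource) works.
rows = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]

pipeline = dlt.pipeline(
    pipeline_name="file_format_demo",  # hypothetical names
    destination="duckdb",
    dataset_name="demo_data",
)

# Request parquet instead of the destination's default load file format.
# "csv" works the same way on destinations that support it (e.g. filesystem).
load_info = pipeline.run(rows, table_name="users", loader_file_format="parquet")
print(load_info)
```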
2 changes: 1 addition & 1 deletion docs/website/docs/dlt-ecosystem/verified-sources/index.md
@@ -1,5 +1,5 @@
---
-title: Verified Sources
+title: Verified sources
description: List of verified sources
keywords: ['verified source']
---
@@ -1,13 +1,10 @@
---
-title: Configuration Providers
+title: Configuration providers
description: Where dlt looks for config/secrets and in which order.
keywords: [credentials, secrets.toml, secrets, config, configuration, environment
variables, provider]
---

-# Configuration Providers
-
-
Configuration Providers in the context of the `dlt` library
refer to different sources from which configuration values
and secrets can be retrieved for a data pipeline.
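An aside on the configuration-providers page touched above, with hypothetical key names: the same secret can live in `.dlt/secrets.toml` or be exported as an environment variable, and the environment provider is queried with higher priority than the TOML files. The sketch assumes the bare key name is accepted as the least-specific lookup.

```py
import os
import dlt

# The same key could sit at the top of .dlt/secrets.toml:
#
#   api_token = "from-toml"
#
# Environment variables are checked before the TOML providers, so the value
# below wins when both are present (dlt also tries more specific,
# section-qualified variable names first).
os.environ["API_TOKEN"] = "from-env"

@dlt.resource
def tickets(api_token: str = dlt.secrets.value):
    # dlt resolves api_token by querying its providers in priority order.
    yield {"token_used": api_token}

print(list(tickets()))
```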
4 changes: 1 addition & 3 deletions docs/website/docs/general-usage/credentials/config_specs.md
@@ -1,12 +1,10 @@
---
-title: Configuration Specs
+title: Configuration specs
description: How to specify complex custom configurations
keywords: [credentials, secrets.toml, secrets, config, configuration, environment
variables, specs]
---

-# Configuration Specs
-
Configuration Specs in `dlt` are Python dataclasses that define how complex configuration values,
particularly credentials, should be handled.
They specify the types, defaults, and parsing methods for these values.
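To illustrate the configuration-specs page retitled above: a hedged sketch of a typed credentials spec. It assumes `ConnectionStringCredentials` is importable from `dlt.sources.credentials`, that a connection string supplied in native form is parsed into fields, and that the bare `CREDENTIALS` key resolves; the DSN is a placeholder.

```py
import os
import dlt
from dlt.sources.credentials import ConnectionStringCredentials

# Placeholder DSN supplied in "native" form; the spec parses it into typed,
# validated fields instead of handing the raw string to your code.
os.environ["CREDENTIALS"] = "postgresql://loader:pass@localhost:5432/dlt_data"

@dlt.resource
def query_table(credentials: ConnectionStringCredentials = dlt.secrets.value):
    yield {"host": credentials.host, "database": credentials.database}

print(list(query_table()))
```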
4 changes: 1 addition & 3 deletions docs/website/docs/general-usage/credentials/configuration.md
@@ -1,12 +1,10 @@
---
-title: Secrets and Configs
+title: Secrets and configs
description: What are secrets and configs and how sources and destinations read them.
keywords: [credentials, secrets.toml, secrets, config, configuration, environment
variables]
---

-# Secrets and Configs
-
Use secret and config values to pass access credentials and configure or fine-tune your pipelines without the need to modify your code.
When done right you'll be able to run the same pipeline script during development and in production.

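A short sketch of the secrets/configs split described on the page above, with made-up parameter names: `dlt.config.value` marks a plain setting, `dlt.secrets.value` marks a secret, and explicitly passed arguments (convenient in development) take precedence over injected ones.

```py
import dlt

@dlt.resource
def tickets(
    page_size: int = dlt.config.value,   # plain, non-secret setting
    api_token: str = dlt.secrets.value,  # secret; keep it out of config.toml
):
    yield {"page_size": page_size, "authorized": bool(api_token)}

# Development: pass values explicitly. In production, call tickets() with no
# arguments and let dlt inject both from its configuration providers.
print(list(tickets(page_size=10, api_token="dev-token")))
```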
2 changes: 1 addition & 1 deletion docs/website/docs/general-usage/glossary.md
@@ -38,7 +38,7 @@ The data store where data from the source is loaded (e.g. Google BigQuery).
Moves the data from the source to the destination, according to instructions provided in the schema
(i.e. extracting, normalizing, and loading the data).

-## [Verified Source](../walkthroughs/add-a-verified-source)
+## [Verified source](../walkthroughs/add-a-verified-source)

A Python module distributed with `dlt init` that allows creating pipelines that extract data from a
particular **Source**. Such module is intended to be published in order for others to use it to
4 changes: 1 addition & 3 deletions docs/website/docs/general-usage/schema-contracts.md
@@ -1,11 +1,9 @@
---
-title: 🧪 Schema and Data Contracts
+title: 🧪 Schema and data contracts
description: Controlling schema evolution and validating data
keywords: [data contracts, schema, dlt schema, pydantic]
---

-## Schema and Data Contracts
-
`dlt` will evolve the schema at the destination by following the structure and data types of the extracted data. There are several modes
that you can use to control this automatic schema evolution, from the default modes where all changes to the schema are accepted to
a frozen schema that does not change at all.
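A sketch of the contract settings the schema-contracts page above describes, assuming the `schema_contract` argument and the documented mode names (`evolve`, `freeze`, `discard_row`, `discard_value`); the resource, pipeline, and dataset names are invented.

```py
import dlt

# Let new tables appear, but reject rows that would add columns to an
# existing table; with "freeze", violations raise instead of evolving.
@dlt.resource(schema_contract={"tables": "evolve", "columns": "freeze", "data_type": "evolve"})
def users():
    yield {"id": 1, "name": "alice"}

pipeline = dlt.pipeline(
    pipeline_name="contracts_demo", destination="duckdb", dataset_name="demo"
)
pipeline.run(users())
```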
7 changes: 4 additions & 3 deletions docs/website/docs/general-usage/schema-evolution.md
@@ -1,24 +1,25 @@
---
-title: Schema Evolution
+title: Schema evolution
description: A small guide to elaborate on how schema evolution works
keywords: [schema evolution, schema, dlt schema]
---

# Schema evolution

## When to use schema evolution?

Schema evolution is a best practice when ingesting most data. It’s simply a way to get data across a format barrier.

It separates the technical challenge of “loading” data, from the business challenge of “curating” data. This enables us to have pipelines that are maintainable by different individuals at different stages.

However, for cases where schema evolution might be triggered by malicious events, such as in web tracking, data contracts are advised. Read more about how to implement data contracts [here](https://dlthub.com/docs/general-usage/schema-contracts).

## Schema evolution with `dlt`

`dlt` automatically infers the initial schema for your first pipeline run. However, in most cases, the schema tends to change over time, which makes it critical for downstream consumers to adapt to schema changes.

As the structure of data changes, such as the addition of new columns, changing data types, etc., `dlt` handles these schema changes, enabling you to adapt to changes without losing velocity.

## Inferring a schema from nested data

The first run of a pipeline will scan the data that goes through it and generate a schema. To convert nested data into relational format, `dlt` flattens dictionaries and unpacks nested lists into sub-tables.

We’ll review some examples here and figure out how `dlt` creates initial schema and how normalisation works. Consider a pipeline that loads the following schema:
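To make the flattening and unpacking behaviour described above concrete: a minimal sketch, assuming the duckdb extra is installed and using invented table names. Nested dictionaries become `parent__field` columns and nested lists become child tables such as `issues__comments`.

```py
import dlt

data = [
    {
        "id": 1,
        "address": {"city": "Berlin", "zip": "10115"},        # flattened to columns
        "comments": [{"text": "first"}, {"text": "second"}],  # unpacked into a child table
    }
]

pipeline = dlt.pipeline(
    pipeline_name="evolution_demo", destination="duckdb", dataset_name="demo"
)
pipeline.run(data, table_name="issues")

# The inferred schema has an `issues` table with address__city / address__zip
# columns and an `issues__comments` child table linked to the parent rows.
print(pipeline.default_schema.to_pretty_yaml())
```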
6 changes: 2 additions & 4 deletions docs/website/docs/reference/command-line-interface.md
@@ -1,11 +1,9 @@
---
-title: Command Line Interface
-description: Command Line Interface (CLI) of dlt
+title: Command line interface
+description: Command line interface (CLI) of dlt
keywords: [command line interface, cli, dlt init]
---

-# Command Line Interface
-
## `dlt init`

```sh
2 changes: 1 addition & 1 deletion docs/website/docs/reference/frequently-asked-questions.md
@@ -1,5 +1,5 @@
---
-title: Frequently Asked Questions
+title: Frequently asked questions
description: Questions asked frequently by users in technical help or github issues
keywords: [faq, usage information, technical help]
---
2 changes: 1 addition & 1 deletion docs/website/docs/tutorial/grouping-resources.md
@@ -1,5 +1,5 @@
---
-title: Resource Grouping and Secrets
+title: Resource grouping and secrets
description: Advanced tutorial on loading data from an API
keywords: [api, source, decorator, dynamic resource, github, tutorial]
---
4 changes: 2 additions & 2 deletions docs/website/docs/walkthroughs/zendesk-weaviate.md
@@ -1,10 +1,10 @@
---
-title: 'Import Ticket Data from Zendesk API to Weaviate'
+title: 'Import ticket data from Zendesk API to Weaviate'
description: How to Import Ticket Data from Zendesk API to Weaviate
keywords: [how to, zendesk, weaviate, vector database, vector search]
---

-# How to Import Ticket Data from Zendesk API to Weaviate
+# How to import ticket data from Zendesk API to Weaviate

Zendesk is a cloud-based customer service and support platform. Zendesk Support API, which is also known as the Ticketing API, lets you access support ticket data. By analyzing this data, businesses can gain insights into customer needs, behavior, trends, and make data-driven decisions. The newest type of databases, vector databases, can help in advanced analysis of tickets data such as identifying common issues and sentiment analysis.
