merge: 20231228 #469

Merged
merged 3 commits into from
Dec 28, 2023
2 changes: 1 addition & 1 deletion .github/workflows/regression-test.yml
@@ -44,7 +44,7 @@ jobs:
 -t \
 --name polardb_${{ matrix.container_image }} \
 -v `pwd`:/home/postgres/PolarDB-for-PostgreSQL \
-mrdrivingduck/polardb_pg_devel:${{ matrix.container_image }} \
+polardb/polardb_pg_devel:${{ matrix.container_image }} \
 bash && \
 docker start polardb_${{ matrix.container_image }}

10 changes: 5 additions & 5 deletions README-CN.md
@@ -1,14 +1,14 @@
 <div align="center">

-[![logo](docs/.vuepress/public/images/polardb.png)](https://developer.aliyun.com/topic/polardb-for-pg)
+[![logo](docs/.vuepress/public/images/polardb.png)](https://www.polardbpg.com/home)

 # PolarDB for PostgreSQL

 **A cloud-native database developed in-house by Alibaba Cloud**

 #### [English](README.md) | 简体中文

-[![official](https://img.shields.io/badge/官方网站-blueviolet?style=for-the-badge&logo=alibabacloud)](https://developer.aliyun.com/topic/polardb-for-pg)
+[![official](https://img.shields.io/badge/官方网站-blueviolet?style=for-the-badge&logo=alibabacloud)](https://www.polardbpg.com/home)

 [![cirrus-ci-stable](https://img.shields.io/cirrus/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_STABLE?style=for-the-badge&logo=cirrusci)](https://cirrus-ci.com/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_STABLE)
 [![cirrus-ci-dev](https://img.shields.io/cirrus/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_DEV?style=for-the-badge&logo=cirrusci)](https://cirrus-ci.com/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_DEV)
@@ -58,11 +58,11 @@ PolarDB uses a Shared-Storage-based architecture that separates storage from compute. The database

 ```bash
 # Pull the single-node PolarDB image
-docker pull polardb/polardb_pg_local_instance:single
+docker pull polardb/polardb_pg_local_instance
 # Create, run, and enter the container
-docker run -it --cap-add=SYS_PTRACE --privileged=true --name polardb_pg_single polardb/polardb_pg_local_instance:single bash
+docker run -it --rm polardb/polardb_pg_local_instance psql
 # Check that the instance is available
-psql -h 127.0.0.1 -c 'select version();'
+postgres=# SELECT version();
 version
 --------------------------------
 PostgreSQL 11.9 (POLARDB 11.9)

10 changes: 5 additions & 5 deletions README.md
@@ -1,14 +1,14 @@
 <div align="center">

-[![logo](docs/.vuepress/public/images/polardb.png)](https://developer.aliyun.com/topic/polardb-for-pg)
+[![logo](docs/.vuepress/public/images/polardb.png)](https://www.polardbpg.com/home)

 # PolarDB for PostgreSQL

 **A cloud-native database developed by Alibaba Cloud**

 #### English | [简体中文](README-CN.md)

-[![official](https://img.shields.io/badge/official%20site-blueviolet?style=for-the-badge&logo=alibabacloud)](https://developer.aliyun.com/topic/polardb-for-pg)
+[![official](https://img.shields.io/badge/official%20site-blueviolet?style=for-the-badge&logo=alibabacloud)](https://www.polardbpg.com/home)

 [![cirrus-ci-stable](https://img.shields.io/cirrus/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_STABLE?style=for-the-badge&logo=cirrusci)](https://cirrus-ci.com/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_STABLE)
 [![cirrus-ci-dev](https://img.shields.io/cirrus/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_DEV?style=for-the-badge&logo=cirrusci)](https://cirrus-ci.com/github/ApsaraDB/PolarDB-for-PostgreSQL/POLARDB_11_DEV)
@@ -58,11 +58,11 @@ If you have Docker installed already, then you can pull the instance image of P

 ```bash
 # pull the instance image from DockerHub
-docker pull polardb/polardb_pg_local_instance:single
+docker pull polardb/polardb_pg_local_instance
 # create, run and enter the container
-docker run -it --cap-add=SYS_PTRACE --privileged=true --name polardb_pg_single polardb/polardb_pg_local_instance:single bash
+docker run -it --rm polardb/polardb_pg_local_instance psql
 # check
-psql -h 127.0.0.1 -c 'select version();'
+postgres=# SELECT version();
 version
 --------------------------------
 PostgreSQL 11.9 (POLARDB 11.9)

5 changes: 3 additions & 2 deletions docs/.vuepress/configs/navbar/zh.ts
@@ -67,10 +67,10 @@ export const zh: NavbarConfig = [
 ],
 },
 {
-text: "Kernel enhancements",
+text: "Self-developed features",
 children: [
 {
-text: "Documentation entry",
+text: "Feature overview",
 link: "/zh/features/",
 },
 {
@@ -81,6 +81,7 @@ export const zh: NavbarConfig = [
 "/zh/features/v11/availability/",
 "/zh/features/v11/security/",
 "/zh/features/v11/epq/",
+"/zh/features/v11/extensions/",
 ],
 },
 ],

10 changes: 9 additions & 1 deletion docs/.vuepress/configs/sidebar/zh.ts
@@ -76,7 +76,7 @@ export const zh: SidebarConfig = {
 ],
 "/zh/features": [
 {
-text: "Kernel enhancements",
+text: "Self-developed features",
 link: "/zh/features/",
 children: [
 {
@@ -122,6 +122,14 @@ export const zh: SidebarConfig = {
 "/zh/features/v11/epq/epq-ctas-mtview-bulk-insert.md",
 ],
 },
+{
+text: "Third-party extensions",
+link: "/zh/features/v11/extensions/",
+children: [
+"/zh/features/v11/extensions/pgvector.md",
+"/zh/features/v11/extensions/smlar.md",
+],
+},
 ],
 },
 ],

4 changes: 2 additions & 2 deletions docs/deploying/fs-pfs.md
@@ -17,13 +17,13 @@ PolarDB File System (PFS or PolarFS for short) is an Alibaba Cloud in-house high
 We recommend the PolarDB for PostgreSQL [executable image](https://hub.docker.com/r/polardb/polardb_pg_binary/tags) on [DockerHub](https://hub.docker.com/u/polardb). It currently supports the `linux/amd64` and `linux/arm64` architectures and already contains the compiled PFS tools, so no manual build or installation is needed. Enter the container with the following commands:

 ```shell:no-line-numbers
-docker pull polardb/polardb_pg_binary:pfs
+docker pull polardb/polardb_pg_binary
 docker run -it \
 --cap-add=SYS_PTRACE \
 --privileged=true \
 --name polardb_pg \
 --shm-size=512m \
-polardb/polardb_pg_binary:pfs \
+polardb/polardb_pg_binary \
 bash
 ```

4 changes: 2 additions & 2 deletions docs/operation/ro-online-promote.md
@@ -19,13 +19,13 @@ PolarDB for PostgreSQL is a cloud-native database that separates storage from compute,
 For convenience, this example uses an instance based on local disk. Pull the following image and start the container to get a local-disk-based HTAP instance:

 ```shell:no-line-numbers
-docker pull polardb/polardb_pg_local_instance:htap
+docker pull polardb/polardb_pg_local_instance
 docker run -it \
 --cap-add=SYS_PTRACE \
 --privileged=true \
 --name polardb_pg_htap \
 --shm-size=512m \
-polardb/polardb_pg_local_instance:htap \
+polardb/polardb_pg_local_instance \
 bash
 ```

8 changes: 4 additions & 4 deletions docs/operation/scale-out.md
@@ -17,13 +17,13 @@ PolarDB for PostgreSQL is a database that separates storage from compute; all compute
 First, on the shared-storage cluster that has already been set up, initialize and start the first compute node, the read-write node, which can read from and write to the shared storage. The image below provides prebuilt executables of the PolarDB for PostgreSQL kernel and its companion tools:

 ```shell:no-line-numbers
-$ docker pull polardb/polardb_pg_binary:pfs
+$ docker pull polardb/polardb_pg_binary
 $ docker run -it \
 --cap-add=SYS_PTRACE \
 --privileged=true \
 --name polardb_pg \
 --shm-size=512m \
-polardb/polardb_pg_binary:pfs \
+polardb/polardb_pg_binary \
 bash

 $ ls ~/tmp_basedir_polardb_pg_1100_bld/bin/
@@ -130,13 +130,13 @@ $HOME/tmp_basedir_polardb_pg_1100_bld/bin/psql \
 Similarly, on the machine where the new compute node will be deployed, pull the image and start a container that contains the executables:

 ```shell:no-line-numbers
-docker pull polardb/polardb_pg_binary:pfs
+docker pull polardb/polardb_pg_binary
 docker run -it \
 --cap-add=SYS_PTRACE \
 --privileged=true \
 --name polardb_pg \
 --shm-size=512m \
-polardb/polardb_pg_binary:pfs \
+polardb/polardb_pg_binary \
 bash
 ```

4 changes: 2 additions & 2 deletions docs/operation/tpch-test.md
@@ -23,13 +23,13 @@ minute: 20
 Use Docker to quickly bring up a PolarDB for PostgreSQL cluster based on local storage:

 ```shell:no-line-numbers
-docker pull polardb/polardb_pg_local_instance:htap
+docker pull polardb/polardb_pg_local_instance
 docker run -it \
 --cap-add=SYS_PTRACE \
 --privileged=true \
 --name polardb_pg_htap \
 --shm-size=512m \
-polardb/polardb_pg_local_instance:htap \
+polardb/polardb_pg_local_instance \
 bash
 ```

3 changes: 2 additions & 1 deletion docs/zh/README.md
@@ -49,12 +49,13 @@ postgres=# SELECT version();
 </div>

 <div class="feature">
-<h3>Kernel enhancements</h3>
+<h3>Self-developed features</h3>
 <ul style="position: relative;z-index: 10;">
 <li><a href="./features/v11/performance/">High performance</a></li>
 <li><a href="./features/v11/availability/">High availability</a></li>
 <li><a href="./features/v11/security/">Security</a></li>
 <li><a href="./features/v11/epq/">Elastic cross-node parallel query (ePQ)</a></li>
+<li><a href="./features/v11/extensions/">Third-party extensions</a></li>
 </ul>
 </div>

4 changes: 2 additions & 2 deletions docs/zh/deploying/fs-pfs.md
@@ -17,13 +17,13 @@ PolarDB File System (PFS or PolarFS for short) is an Alibaba Cloud in-house high
 We recommend the PolarDB for PostgreSQL [executable image](https://hub.docker.com/r/polardb/polardb_pg_binary/tags) on [DockerHub](https://hub.docker.com/u/polardb). It currently supports the `linux/amd64` and `linux/arm64` architectures and already contains the compiled PFS tools, so no manual build or installation is needed. Enter the container with the following commands:

 ```shell:no-line-numbers
-docker pull polardb/polardb_pg_binary:pfs
+docker pull polardb/polardb_pg_binary
 docker run -it \
 --cap-add=SYS_PTRACE \
 --privileged=true \
 --name polardb_pg \
 --shm-size=512m \
-polardb/polardb_pg_binary:pfs \
+polardb/polardb_pg_binary \
 bash
 ```

17 changes: 16 additions & 1 deletion docs/zh/features/README.md
@@ -1,4 +1,4 @@
-# Kernel enhancements
+# Self-developed features

 - [PolarDB for PostgreSQL 11](./v11/README.md)

@@ -118,5 +118,20 @@
 <td style="text-align:center">/</td>
 <td style="text-align:center"><a href="./v11/epq/epq-ctas-mtview-bulk-insert.html"><Badge type="tip" text="V11 / v1.1.30-" vertical="top" /></a></td>
 </tr>
+<tr>
+<td><strong>Third-party extensions</strong></td>
+<td style="text-align:center">...</td>
+<td style="text-align:center"><a href="./v11/extensions/">...</a></td>
+</tr>
+<tr>
+<td>pgvector</td>
+<td style="text-align:center">/</td>
+<td style="text-align:center"><a href="./v11/extensions/pgvector.html"><Badge type="tip" text="V11 / v1.1.35-" vertical="top" /></a></td>
+</tr>
+<tr>
+<td>smlar</td>
+<td style="text-align:center">/</td>
+<td style="text-align:center"><a href="./v11/extensions/smlar.html"><Badge type="tip" text="V11 / v1.1.35-" vertical="top" /></a></td>
+</tr>
 </tbody>
 </table>
3 changes: 2 additions & 1 deletion docs/zh/features/v11/README.md
@@ -1,6 +1,7 @@
-# Kernel enhancements
+# Self-developed features

 - [High performance](./performance/README.md)
 - [High availability](./availability/README.md)
 - [Security](./security/README.md)
 - [Elastic cross-node parallel query (ePQ)](./epq/README.md)
+- [Third-party extensions](./extensions/README.md)
4 changes: 4 additions & 0 deletions docs/zh/features/v11/extensions/README.md
@@ -0,0 +1,4 @@
+# Third-party extensions
+
+- [pgvector](./pgvector.md) <Badge type="tip" text="V11 / v1.1.35-" vertical="top" />
+- [smlar](./smlar.md) <Badge type="tip" text="V11 / v1.1.28-" vertical="top" />
81 changes: 81 additions & 0 deletions docs/zh/features/v11/extensions/pgvector.md
@@ -0,0 +1,81 @@
---
author: 山现
date: 2023/12/25
minute: 10
---

# pgvector

<Badge type="tip" text="V11 / v1.1.35-" vertical="top" />

<ArticleInfo :frontmatter=$frontmatter></ArticleInfo>

[[toc]]

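## Background

[`pgvector`](https://github.com/pgvector/pgvector) is an efficient vector database extension. Built on PostgreSQL's extension mechanism and implemented in C, it provides several vector data types and operations, and it efficiently stores and queries AI embeddings represented as vectors.

`pgvector` supports IVFFlat indexes. An IVFFlat index partitions the vector space into regions, each containing some of the vectors, and builds an inverted index over them so that vectors similar to a given vector can be found quickly. IVFFlat is a simplified version of the IVFADC index. It suits scenarios that require high recall precision but tolerate query latency on the order of 100 ms. Compared with other index types, IVFFlat offers high recall, high precision, a simple algorithm with few parameters, and a small storage footprint.

The IVFFlat algorithm in `pgvector` works as follows:

1. Points in high-dimensional space have an implicit clustering structure, so the vectors are clustered with an algorithm such as K-Means, which gives each cluster a center point.
2. To search, first compute the distance from the target vector to every cluster center and find the n nearest centers.
3. Then compute the distance to every element in those n clusters and sort globally to obtain the k nearest vectors.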
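
The three steps above can be sketched in a few lines of Python. This is a minimal illustration of the IVFFlat idea, not pgvector's C implementation: the function names `build_index` and `search` are made up for this example, the parameters `lists` and `probes` only borrow pgvector's terminology, the distance is plain Euclidean, and the K-Means loop is deliberately naive.

```python
import math
import random


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(vectors, centers):
    """Put every vector into the list of its nearest center."""
    clusters = [[] for _ in centers]
    for v in vectors:
        nearest = min(range(len(centers)), key=lambda i: l2(v, centers[i]))
        clusters[nearest].append(v)
    return clusters


def build_index(vectors, lists, iterations=10, seed=0):
    """Step 1: cluster the vectors with a naive K-Means (illustrative only)."""
    rng = random.Random(seed)
    centers = [list(v) for v in rng.sample(vectors, lists)]
    for _ in range(iterations):
        clusters = assign(vectors, centers)
        for i, members in enumerate(clusters):
            if members:  # keep the previous center if a cluster is empty
                centers[i] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assign(vectors, centers)


def search(index, query, probes, k):
    """Steps 2 and 3: probe the nearest clusters, then rank their members."""
    centers, clusters = index
    order = sorted(range(len(centers)), key=lambda i: l2(query, centers[i]))
    candidates = [v for i in order[:probes] for v in clusters[i]]
    return sorted(candidates, key=lambda v: l2(query, v))[:k]


data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
index = build_index(data, lists=2)
print(search(index, [9, 10], probes=1, k=2))
```

Probing fewer clusters makes the search faster but can miss true neighbors that fall in an unprobed cluster. Probing every cluster gives the same result as an exact scan.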

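## Usage

`pgvector` can search high-dimensional vectors by sequential scan or by index scan. For index types and more parameters, see the [README](https://github.com/pgvector/pgvector/blob/master/README.md) in the extension's source repository.

### Install the extension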

```sql:no-line-numbers
CREATE EXTENSION vector;
```

### Vector operations

Run the following command to create a table with a vector column:

```sql:no-line-numbers
CREATE TABLE t (val vector(3));
```

Run the following command to insert vector data:

```sql:no-line-numbers
INSERT INTO t (val) VALUES ('[0,0,0]'), ('[1,2,3]'), ('[1,1,1]'), (NULL);
```

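Create an IVFFlat index:

1. `val vector_ip_ops` means the index is built on column `val` with the operator class `vector_ip_ops`, which measures similarity by inner product. `pgvector` provides separate operator classes for other metrics: `vector_l2_ops` for Euclidean distance and `vector_cosine_ops` for cosine distance.
2. `WITH (lists = 1)` sets the number of partitions to 1, so every vector is assigned to the same partition. In practice, tune the number of partitions to the data size and the query performance you need.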

```sql:no-line-numbers
CREATE INDEX ON t USING ivfflat (val vector_ip_ops) WITH (lists = 1);
```
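
As a rough starting point for tuning, the upstream pgvector README suggests `lists = rows / 1000` for tables of up to 1M rows and `sqrt(rows)` above that, with the number of probes near `sqrt(lists)` at query time. The Python helpers below only encode that rule of thumb. The function names are hypothetical, and suitable values for a given PolarDB workload should be confirmed by measuring recall and latency.

```python
import math


def suggested_lists(rows):
    """Starting value for the IVFFlat `lists` parameter (rule of thumb only)."""
    if rows <= 1_000_000:
        return max(1, rows // 1000)
    return max(1, math.isqrt(rows))


def suggested_probes(lists):
    """Starting value for the number of lists to probe per query."""
    return max(1, math.isqrt(lists))


for rows in (3, 500_000, 4_000_000):
    lists = suggested_lists(rows)
    print(rows, lists, suggested_probes(lists))
```

For the three-row table in this example the helper yields `lists = 1`, which matches the index definition above.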

Query for similar vectors:

```sql:no-line-numbers
=> SELECT * FROM t ORDER BY val <#> '[3,3,3]';
val
---------
[1,2,3]
[1,1,1]
[0,0,0]

(4 rows)
```
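
The ordering in this result follows from how `<#>` is defined: in pgvector it returns the negative inner product, so an ascending `ORDER BY` puts the largest inner product first. The Python sketch below recomputes the ranking for the four rows inserted earlier. The helper name is illustrative, and placing `None` last mirrors PostgreSQL's default of sorting NULLs last in ascending order.

```python
def neg_inner_product(a, b):
    """Value of pgvector's <#> operator: the inner product with its sign flipped."""
    return -sum(x * y for x, y in zip(a, b))


rows = [[0, 0, 0], [1, 2, 3], [1, 1, 1], None]
query = [3, 3, 3]

# Sort ascending by the operator value; None (SQL NULL) goes to the end.
ranked = sorted(
    rows,
    key=lambda v: (v is None, 0 if v is None else neg_inner_product(v, query)),
)

for v in ranked:
    score = None if v is None else neg_inner_product(v, query)
    print(v, score)
```

Here `[1,2,3]` scores -18, `[1,1,1]` scores -9, and `[0,0,0]` scores 0, which is the order shown in the query result.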

### Uninstall the extension

```sql:no-line-numbers
DROP EXTENSION vector;
```

## Notes

- [ePQ](../epq/README.md) can search high-dimensional vectors by scanning and sorting, but it does not support querying vector types through an index.