-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow to create unlogged IMMVs #1
base: main
Are you sure you want to change the base?
Conversation
il y a aussi une typo là Line 113 in 89a36db
|
19f91db
to
4a1124d
Compare
Merci ! Corrigée et squashée ! Mais je crois que je vais m'arrêter là ... je veux empêcher maître Capello de passer sur tous les fichiers du projet pour corriger leurs anciennes typos 🙂 |
4a1124d
to
726a766
Compare
pg_ivm--1.0.sql
Outdated
@@ -17,7 +17,7 @@ SELECT pg_catalog.pg_extension_config_dump('pg_catalog.pg_ivm_immv', ''); | |||
|
|||
-- functions | |||
|
|||
CREATE FUNCTION create_immv(text, text) | |||
CREATE FUNCTION create_immv(text, text, boolean default true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Je pense que ce fichier est immuable !
La logique de migration voudrait que ta fonction avec la signature modifiée soit entièrement redéclarée dans un fichier pg_ivm--1.8--1.9.sql
, avec :
CREATE OR REPLACE FUNCTION create_immv(text, text, boolean default true)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah mais oui...
Je n'avais pas fait attention, mais ces fichiers correspondent effectivement aux releases de pg_ivm...
Il est aussi à référencer dans https://github.com/mediapart/pg_ivm/blob/main/Makefile#L17
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bien vu @coox !!! ... et j'ai honte de ne pas y avoir fait attention car, comme je vous en parlais sur Mattermost, un ou deux jours avant d'ouvrir la PR (que j'avais en fait préparée depuis assez longtemps), j'avais aussi identifié le côté immuable de ces fichiers en cherchant le fameux event trigger de nettoyage de la table pg_catalog.pg_ivm_immv
qui avait été remplacé dans la version 1.3 par des object_access_hook ayant nécessité cette suppression.
README.md
Outdated
@@ -84,12 +84,18 @@ When `pg_ivm` is installed, the following objects are created. | |||
|
|||
Use `create_immv` function to create IMMV. | |||
``` | |||
create_immv(immv_name text, view_definition text) RETURNS bigint | |||
create_immv(immv_name text, view_definition text [, is_immv_logged bool]) RETURNS bigint |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C'est drôle parce qu'en tant qu'utilisateur de pg_ivm, et connaissant ton intention, je me serais attendu à une fonction munie de la logique inverse (car reprenant la terminologie PostgreSQL).
Et aussi, je pense que c'est important de référencer dans les docs la valeur par défaut.
Ça donnerait quelque chose comme :
create_immv(immv_name text, view_definition text [, unlogged bool DEFAULT false]) RETURNS bigint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
C'est drôle parce qu'en tant qu'utilisateur de pg_ivm, et connaissant ton intention, je me serais attendu à une fonction munie de la logique inverse (car reprenant la terminologie PostgreSQL).
Et aussi, je pense que c'est important de référencer dans les docs la valeur par défaut.
Ça donnerait quelque chose comme :
create_immv(immv_name text, view_definition text [, unlogged bool DEFAULT false]) RETURNS bigint
Très bonne remarque de l'amiral !!!
Pour l’historique, cette version contraire à la logique et sans la valeur par défaut dans la doc est due à ce que j'expliquais à David dans ce commentaire. Et ayant cru dans un premier temps à tord que je devais ajouter la nouvelle fonction à côté de la précédente (et non à la place), et estimant qu'elle était spécifiquement là pour la création d'IMMVs unlogged, ça m'avait paru naturel que true
soit synonyme de "je veux que mon IMMV soit unlogged" (même si en effet ça se discute). Mais c'est vrai qu'au moment de ne garder qu'une seule fonction avec le troisième paramètre non obligatoire, j'aurais dû avoir la rigueur de revoir ça (valeur par défaut à false + l'indiquer dans la doc).
Merci pour ta vigilance 👍
Outre mes deux commentaires ci-dessus, je trouve le reste nickel, les docs sont excellentes, et je suis globalement hyper heureux que cette PR existe (…bientôt dans le repo officiel 😉) 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
je retire mon ✔️ pour le coup de la release ;-)
Merci 😄 |
J'ai poussé un fixup qui prend en compte vos remarques et j'ai mis à jour la description de la PR avec les tests refaits avec la nouvelle version (le comportement du false devient celui du true et réciproquement). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🔥
Je pense qu'il faut faire un commit à part qui se nomme |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quelques NIP de typos à appliquer ou pas (rien de méchant).
README.md
Outdated
``` | ||
`create_immv` defines a new IMMV of a query. A table of the name `immv_name` is created and a query specified by `view_definition` is executed and used to populate the IMMV. The query is stored in `pg_ivm_immv`, so that it can be refreshed later upon incremental view maintenance. `create_immv` returns the number of rows in the created IMMV. | ||
|
||
When an IMMV is created, some triggers are automatically created so that the view's contents are immediately updated when its base tables are modified. In addition, a unique index is created on the IMMV automatically if possible. If the view definition query has a GROUP BY clause, a unique index is created on the columns of GROUP BY expressions. Also, if the view has DISTINCT clause, a unique index is created on all columns in the target list. Otherwise, if the IMMV contains all primary key attributes of its base tables in the target list, a unique index is created on these attributes. In other cases, no index is created. | ||
|
||
#### refresh_imm | ||
In some cases (e.g. for very large IMMVs containing volatile data), and only if you really know what you are doing, it may be useful to create an IMMV without writing to the [PostgreSQL Write-Ahead Logs](https://www.postgresql.org/docs/current/wal-intro.html) (the underlying table is created with the `UNLOGGED` parameter). Unlogged tables allow to improve write performance, reduce vacuum impact and produce less WALs (thus reducing network usage, backup size and restoration time). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
allow to improve > improve
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIP :
less > fewer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Celles-ci sont les miennes 😄 Corrigées ! Merci !
@@ -261,7 +267,7 @@ If some base tables have row level security policy, rows that are not visible to | |||
|
|||
IVM is effective when we want to keep an IMMV up-to-date and small fraction of a base table is modified infrequently. Due to the overhead of immediate maintenance, IVM is not effective when a base table is modified frequently. Also, when a large part of a base table is modified or large data is inserted into a base table, IVM is not effective and the cost of maintenance can be larger than refresh from scratch. | |||
|
|||
In such situation, we can use `refesh_immv` function with `with_data = false` to disable immediate maintenance before modifying a base table. After a base table modification, call `refresh_immv`with `with_data = true` to refresh the view data and enable immediate maintenance. | |||
In such situation, we can use `refresh_immv` function with `with_data = false` to disable immediate maintenance before modifying a base table. After a base table modification, call `refresh_immv`with `with_data = true` to refresh the view data and enable immediate maintenance. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In such situation > In such a situation
we can use refresh_immv
> we can use the refresh_immv
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Merci pour la remarque mais je préfère laisser tel quel car ce ne sont pas des erreurs importantes/gênantes/structurantes et qu'elles ne proviennent pas de mes commits. Dans le 1er commit j'ai bien corrigé des typos qui étaient déjà dans le projet, mais uniquement car elles concernaient des noms de fonctions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dit autrement, je ne voudrais pas trop distraire le(s) mainteneur(s) avec autre chose que le but principal de cette PR, dont je ne suis pas du tout sûr qu'elle soit acceptée 🙂
createas.c
Outdated
* aiming to make it easier to follow up the original code. | ||
*/ | ||
const bool is_ivm = true; | ||
|
||
/* must be a CREATE MATERIALIZED VIEW statement */ | ||
Assert(is_matview); | ||
|
||
/* | ||
* For volatile data, it can be useful not to write to WALs. | ||
* SEE THE POSTGRESQL DOCUMENTATION FOR DRAWBACKS (not replicated or included in physical backups, truncated in case of crash, ...): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of crash > of a crash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idem, prise en compte dans le dernier fixup.
e67f3aa
to
abc7d1f
Compare
Le 🧐 d'or a encore frappé ! Et là encore j'ai honte, parce qu'évidemment quand j'avais commencé à bosser sur le sujet j'avais bien vu tous les commits |
d0d4a91
to
0cd7c28
Compare
❤️ Merci pour vos reviews !
|
J'ai complété les tests qui se trouvent dans la description de la PR (ici et celle du repo d'origine). |
0cd7c28
to
434ff8f
Compare
434ff8f
to
e053a9e
Compare
Description
In some cases (e.g. for very large IMMVs containing volatile data), and provided you are well aware of the risks or drawbacks involved, it may be useful to create an IMMV without writing to PostgreSQL Write-Ahead Logs.
PostgreSQL unlogged tables drawbacks are multiple (not replicated, not backuped, ...) but the main one to keep in mind is that they are automatically truncated after a crash or unclean shutdown. If these conditions meet your requirements, this, as with simple unlogged tables, allows to improve write performance, reduce vacuum impact, and produce fewer WALs (thus reducing network usage, backup size and restoration time).
I hesitated to propose this PR because allowing unlogged IMMVs goes a bit against the fact that PostgreSQL does not allow the creation of unlogged materialized views. But since this feature (based on the fact that IMMVs are, under the hood, classical PostgreSQL tables feeded by triggers) is very useful in our case, I thought that it could also be the same for some other users of this superb extension that is pg_ivm.
Please note that I am not a C developer, and I would obviously be more than happy to consider your comments related to the code... or, of course, the feature itself.
Additionally, I organized the commits based on the project's git history, particularly the Prepare 1.10 commit, without being sure that the PR, if accepted, would merit a version bump. I'll let you judge what's best to do.
Tests
refresh_immv()
usingwith_data = true
refresh_immv()
usingwith_data = false