Skip to content
schmiddy edited this page Oct 16, 2012 · 3 revisions

pg_reorg -- Reorganize tables in a PostgreSQL database without holding locks.

Synopsis

pg_reorg [OPTIONS]

The following options can be specified in OPTIONS. See also "Options" for details.

Reorg Options

  • -o [--order-by] columns [,...]
  • -n [--no-order]
  • -t [--table] table
  • -T [--wait-timeout] seconds
  • -Z [--no-analyze]

Connection Options

  • -a, --all : reorganize all databases
  • -d, --dbname=DBNAME : database to connect
  • -h, --host=HOSTNAME : database server host or socket directory
  • -p, --port=PORT : database server port
  • -U, --username=USERNAME : user name to connect as
  • -w, --no-password : never prompt for password
  • -W, --password : force password prompt

Generic Options

  • -e, --echo : echo queries
  • -E, --elevel=LEVEL : set output message level
  • --help : show the help, then exit
  • --version : output version information, then exit

Description

pg_reorg is an utility program to reorganize tables in PostgreSQL databases. Unlike clusterdb, it doesn't block concurrent DML (i.e. SELECTs or UPDATEs) of the table while it is being compacted. You can choose one of the following methods to reorganize.

  • Online CLUSTER (ordered by cluster index)
  • Ordered by specified columns
  • Online VACUUM FULL (packing rows only)

NOTE:

  • Only superusers can use the utility.
  • Target table must have PRIMARY KEY.

Examples

The following invocation performs an online CLUSTER of all tables in the test database.

$ pg_reorg test

The following invocation performs an online VACUUM FULL of table foo in the test database.

$ pg_reorg --no-order --table foo -d test

Options

pg_reorg has the following command line options:

Reorg Options

Options to order rows. Only one such option may be specified. If not specified, pg_reorg performs an online CLUSTER using the table's declared CLUSTER ON index. Also, options to specify target tables or databases.

-n
--no-order
Perform online VACUUM FULL.
-o columns [,...]
--order-by=columns [,...]
Perform online CLUSTER ordered by specified columns.
-t table
--table=table
Reorganize the specified table only. If you don't specify this option, all tables in the specified databases are reorganized.
-T seconds
--wait-timeout=seconds
pg_reorg needs to take an exclusive lock at the end of the reorganization. This setting controls how many seconds it will wait to acquire this lock. If the lock cannot be taken even after this duration, pg_reorg resorts to canceling conflicting queries using pg_cancel_backend(). On Postgres versions 8.4 or newer, pg_reorg forcibly kills any remaining conflicted backends using pg_terminate_backend() after twice this timeout passed. The default is 60 seconds.
-Z
--no-analyze
Disable an ANALYZE of the table after the reorganization. If not specified, ANALYZE is performed by default.

Connection Options

Options to connect to servers. You cannot use --all together with --dbname or --table.

-a
--all
Reorganize all databases.
-d dbname
--dbname dbname
Specifies the name of the database to be reorganized. If this is not specified and -a (or --all) is not used, the database name is read from the environment variable PGDATABASE. If that is not set, the user name specified for the connection is used.
-h host
--host host
Specifies the host name of the machine on which the server is running. If the value begins with a slash, it is used as the directory for the Unix domain socket.
-p port
--port port
Specifies the TCP port or local Unix domain socket file extension on which the server is listening for connections.
-U username
--username username
User name to connect as.
-w
--no-password
Never issue a password prompt. If the server requires password authentication and a password is not available by other means such as a .pgpass file, the connection attempt will fail. This option can be useful in batch jobs and scripts where no user is present to enter a password.
-W
--password
Force the program to prompt for a password before connecting to a database.
This option is never essential, since the program will automatically prompt for a password if the server demands password authentication. However, pg_reorg will waste a connection attempt finding out that the server wants a password. In some cases it is worth typing -W to avoid the extra connection attempt.

Generic Options

-e
--echo
Echo commands sent to server.
-E
--elevel
Choose the output message level from DEBUG, INFO, NOTICE, WARNING, ERROR, LOG, FATAL, and PANIC. The default is INFO.
--help
Show usage of the program.
--version
Show the version number of the program.

Environment

PGDATABASE
PGHOST
PGPORT
PGUSER
Default connection parameters

This utility, like most other PostgreSQL utilities, also uses the environment variables supported by libpq (see Environment Variables).

Diagnostics

Error messages are reported when pg_reorg fails. The following list shows the cause of errors.

You may need to uninstall and reinstall pg_reorg by hand after fatal errors, in order to clean up transient data left behind. On versions 1.1.7 or earlier, execute $PGHOME/share/contrib/uninstall_pg_reorg.sql in the database where the error occured and then execute $PGHOME/share/contrib/pg_reorg.sql. On later versions, a simple

DROP EXTENSION pg_reorg;
CREATE EXTENSION pg_reorg;

should suffice.

pg_reorg : reorg database "template1" ... skipped

  pg_reorg is not installed in the database when option --all is specified.
  Register pg_reorg in the database.

ERROR: pg_reorg is not installed

   pg_reorg is not installed in the database specified by --dbname.
  Register pg_reorg in the database.

ERROR: relation "table" has no primary key

  The target table doesn't have PRIMARY KEY.
Define a PRIMARY KEY on the table, i.e. ALTER TABLE ADD PRIMARY KEY)

ERROR: relation "table" has no cluster key   The target table doesn't have CLUSTER KEY.
  Define a CLUSTER key on the table. (ALTER TABLE ... CLUSTER ON ... )

pg_reorg : query failed: ERROR: column "col" does not exist   The target table doesn't have columns specified by --order-by option.   Specify existing columns.

ERROR: permission denied for schema reorg   Permission error.
  pg_reorg must be executed by superusers.

pg_reorg : query failed: ERROR: trigger "z_reorg_trigger" for relation "tbl" already exists

  The target table already has a trigger named "z_reorg_trigger", perhaps left over from a previous failed pg_reorg run.
  Delete or rename the trigger.

pg_reorg : trigger conflicted for tbl

  The target table already has a trigger which follows by "z_reorg_trigger" in alphabetical order, and pg_reorg expects its trigger to be last in alphabetical order so that it executes last.
  Delete or rename the trigger.

Restrictions

pg_reorg has the following restrictions. Be careful of the warnings about DDL which may cause data corruption.

Temp tables

pg_reorg cannot reorganize temp tables.

GiST indexes

pg_reorg cannot cluster a table based on a GiST index

DDL commands

You cannot safely perform DDL (except VACUUM or ANALYZE) on the table while pg_reorg is working. In many cases pg_reorg will fail and rollback correctly, but there are some cases where corruption of your table may occur.

TRUNCATE

    TRUNCATE is lost after pg_reorg completes, and the deleted rows will be restored.

CREATE INDEX

    Causes index corruption.

ALTER TABLE ... ADD COLUMN

    Causes lost data. Newly added columns are initialized with NULLs.

ALTER TABLE ... ALTER COLUMN TYPE

    Causes data corruption.

ALTER TABLE ... SET TABLESPACE

    Causes data corruption with wrong relfilenode.

Details

pg_reorg creates a working table in the reorg schema and sorts rows in this table. After it has finished, it updates the system catalogs directly to swap the working table and the original.

Installation

pg_reorg can be built with "make" on UNIX, Linux, or OS X. The pgxs build framework is used automatically. Before building, you might need to install the Postgres developer packages (e.g. postgresql-devel) and ensure pg_config is in your $PATH.

$ cd pg_reorg
$ make
$ su
$ make install

You can also use Microsoft Visual C++ 2010 to build the program on Windows. There are project files in the msvc folder.

Start PostgreSQL and execute the script to register functions to your database.

$ pg_ctl start
$ psql -f $PGSHARE/contrib/pg_reorg.sql -d your_database

NOTE: CREATE EXTENSION is supported only in versions 1.1.8 or later.

Requirements

PostgreSQL versions

   PostgreSQL 8.3 or later

OS

   RHEL 5.2, Windows XP SP3, OS X 10.6+

Disks

   Requires free disk space twice as large as the target table(s) and indexes. For example, if the total size of the tables and indexes to be reorganized is 1GB, an additional 2GB of disk space is required.

Releases

  • 1.1.7 (2011-08-07)
  • Bugfix: VIEWs and FUNCTIONs could be corrupted that used a reorganized table which has a dropped column.
  • Supports PostgreSQL 9.1 and 9.2dev, but not CREATE EXTENSION, yet.