Skip to content

Data Mapping in PL Python

Manas Sivakumar edited this page Aug 27, 2022 · 4 revisions

When a PL/Python function is called:

  • PostgreSQL boolean is converted to Python bool.

  • PostgreSQL smallint and int are converted to Python int. PostgreSQL bigint and oid are converted to long in Python 2 and to int in Python 3.

  • PostgreSQL real and double are converted to Python float.

  • PostgreSQL numeric is converted to Python Decimal. This type is imported from the cdecimal package if that is available. Otherwise, decimal.Decimal from the standard library will be used. cdecimal is significantly faster than decimal. In Python 3.3 and up, however, cdecimal has been integrated into the standard library under the name decimal, so there is no longer any difference.

  • PostgreSQL bytea is converted to Python str in Python 2 and to bytes in Python 3. In Python 2, the string should be treated as a byte sequence without any character encoding.

  • All other data types, including the PostgreSQL character string types, are converted to a Python str. In Python 2, this string will be in the PostgreSQL server encoding; in Python 3, it will be a Unicode string like all strings.

When a PL/Python function returns:

  • When the PostgreSQL return type is boolean, the return value will be evaluated for truth according to the Python rules. That is, 0 and empty string are false, but notably 'f' is true.

  • When the PostgreSQL return type is bytea, the return value will be converted to a string (Python 2) or bytes (Python 3) using the respective Python built-ins, with the result being converted to bytea.

  • For all other PostgreSQL return types, the return value is converted to a string using the Python built-in str, and the result is passed to the input function of the PostgreSQL data type. (If the Python value is a float, it is converted using the repr built-in instead of str, to avoid loss of precision.)

  • Strings in Python 2 are required to be in the PostgreSQL server encoding when they are passed to PostgreSQL. Strings that are not valid in the current server encoding will raise an error, but not all encoding mismatches can be detected, so garbage data can still result when this is not done correctly. Unicode strings are converted to the correct encoding automatically, so it can be safer and more convenient to use those. In Python 3, all strings are Unicode strings.

Note that logical mismatches between the declared PostgreSQL return type and the Python data type of the actual return object are not flagged; the value will be converted in any case.

Null, None

  • postgreSQL NULL is converted to python None.
  • for more reasonable behaviour we could add STRICT variable: if a null value is passed, the function will not be called at all.