[Proposal] Refactor ORCA to support user-defined access method in pg #113

wfnuser · 2023-08-02T15:23:07Z

wfnuser
Aug 2, 2023

Proposers

@wfnuser [Qinghao Huang]

Proposal Status

Under discussion

Abstract

Introduce a novel infrastructure to enable ORCA's support for any user-defined access method.

Motivation

Just as Postgres supports user-defined access methods through the creation of extensions, it is imperative for ORCA to follow suit.

Background

In Postgres, we possess a well-structured abstraction of an access method that enables us to accommodate any user-defined access approach. For introducing novel index types like the 'bloom index', developers simply need to implement the API structure 'IndexAmRoutine' and establish the new index access method as an extension. Subsequently, Postgres can automatically produce associated index scan paths within the optimizer and invoke the method for the fresh index type implemented by the extension provider if the path is selected in the executor.
ORCA is a different story.
First, all index types supported by ORCA are predefined in its implementation. Whenever relcache entries are translated to DXL or the applicability of an index is evaluated, the set of index types is hardcoded into the related logic:

		/* Definition */
		enum EmdindexType
		{
		    EmdindBtree,   // btree
		    EmdindBitmap,  // bitmap
		    EmdindGist,    // gist using btree or bitmap
		    EmdindGin,     // gin using btree or bitmap
		    EmdindBrin,    // brin
		    EmdindHash,    // hash
		    EmdindSentinel
		};
		
		/* Example 1 */
		switch (index_rel->rd_rel->relam)
		{
			case BTREE_AM_OID:
				index_type = IMDIndex::EmdindBtree;
				break;
			case BITMAP_AM_OID:
				index_type = IMDIndex::EmdindBitmap;
				break;
			case BRIN_AM_OID:
				index_type = IMDIndex::EmdindBrin;
				break;
			case GIN_AM_OID:
				index_type = IMDIndex::EmdindGin;
				break;
			case GIST_AM_OID:
				index_type = IMDIndex::EmdindGist;
				break;
			case HASH_AM_OID:
				index_type = IMDIndex::EmdindHash;
				break;
			default:
				GPOS_RAISE(gpdxl::ExmaMD, gpdxl::ExmiMDObjUnsupported,
						   GPOS_WSZ_LIT("Index access method"));
		}
		
		/* Example 2 */
		// index expressions and index constraints not supported
		return gpdb::HeapAttIsNull(tup, Anum_pg_index_indexprs) &&
			   gpdb::HeapAttIsNull(tup, Anum_pg_index_indpred) &&
			   index_rel->rd_index->indisvalid &&
			   (BTREE_AM_OID == index_rel->rd_rel->relam ||
				BITMAP_AM_OID == index_rel->rd_rel->relam ||
				GIST_AM_OID == index_rel->rd_rel->relam ||
				GIN_AM_OID == index_rel->rd_rel->relam ||
				BRIN_AM_OID == index_rel->rd_rel->relam ||
				HASH_AM_OID == index_rel->rd_rel->relam);
		
		/* Example 3 */
		BOOL gin_or_gist_or_brin = (pmdindex->IndexType() == IMDIndex::EmdindGist ||
		                            pmdindex->IndexType() == IMDIndex::EmdindGin ||
		                            pmdindex->IndexType() == IMDIndex::EmdindBrin);
		if (cmptype == IMDType::EcmptNEq || cmptype == IMDType::EcmptIDF ||
		    (cmptype == IMDType::EcmptOther &&
		     !gin_or_gist_or_brin) ||  // only GIN/GiST/BRIN indexes with a comparison type other are ok
		    (gin_or_gist_or_brin &&
		     pexprScalar->Arity() < 2))  // we do not support unary index expressions for GIN/GiST/BRIN indexes
		{
		    return nullptr;
		}

Secondly, ORCA operates on a distinct cost model, which implies that the costs computed in the Postgres Optimizer cannot be directly compared with those calculated in ORCA. Therefore, devising a method to compute costs in ORCA for user-defined indexes stands as another challenge that needs to be resolved to facilitate user-defined indexes within ORCA.

Implementation (WIP)

The solution involves multiple components:

PG_AM

To extend support for ORCA and other third-party optimizers, we introduce a new cost estimate function within IndexAmRoutine. This function should be optional. If extension providers wish to enable it for ORCA, they must be responsible for implementing the cost estimate logic. If not provided, there should be no adverse effects on the Postgres Optimizer.

		  /* declared first so that IndexAmRoutine can use it as a member type */
		  typedef double (*AmOrcaCostEstimateFunc) (CostInfo* ci);

		  typedef struct IndexAmRoutine
		  {
		  	NodeTag		type;
		  	...
		  
		  	/* third-party am cost estimate functions */
		  	AmOrcaCostEstimateFunc amorcacostestimate; /* can be NULL */
		  } IndexAmRoutine;

IndexAmRoutine may need further changes to expose additional information required by ORCA, but the specifics have not been identified yet.
For CostInfo, please refer to the Cost Model (SPIKING) section.
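A minimal sketch of how an optimizer might consume the optional callback, under stated assumptions: `IndexAmRoutineSketch`, `ExampleCostEstimate`, and `TryOrcaCost` are hypothetical names, and `CostInfo` is reduced to two fields for illustration. The only point is that a NULL `amorcacostestimate` is a valid state that callers have to check for.

```cpp
#include <cassert>

// Stand-in for the proposed CostInfo struct, reduced to two fields.
struct CostInfo
{
	double dRowsIndex;
	double dIndexScanTupCostUnit;
};

typedef double (*AmOrcaCostEstimateFunc)(CostInfo *ci);

// Hypothetical, trimmed-down stand-in for IndexAmRoutine.
struct IndexAmRoutineSketch
{
	AmOrcaCostEstimateFunc amorcacostestimate;  // may be nullptr
};

// Example callback an extension might supply (illustrative formula only).
double
ExampleCostEstimate(CostInfo *ci)
{
	return ci->dRowsIndex * ci->dIndexScanTupCostUnit;
}

// Returns true and fills *cost when the AM provides a callback;
// returns false otherwise, so the caller can skip this index.
bool
TryOrcaCost(const IndexAmRoutineSketch *routine, CostInfo *ci, double *cost)
{
	assert(routine != nullptr && ci != nullptr && cost != nullptr);
	if (routine->amorcacostestimate == nullptr)
	{
		return false;
	}
	*cost = routine->amorcacostestimate(ci);
	return true;
}
```

With this shape, an access method that does not set the member behaves exactly as before from the Postgres planner's point of view, and the third-party optimizer simply does not consider it.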

DXL translation parts (WIP)

Introduce a new index type called UserDefinedIndex for non-predefined indexes.

			      /* Definition */
			      enum EmdindexType
			      {
			          EmdindBtree,   // btree
			          EmdindBitmap,  // bitmap
			          EmdindGist,    // gist using btree or bitmap
			          EmdindGin,     // gin using btree or bitmap
			          EmdindBrin,    // brin
			          EmdindHash,    // hash
			          EmdindUserDefined, // user defined
			          EmdindSentinel
			      };

Remove hardcoded index type checks such as IsIndexSupported. All indexes are supported if they provide the implementation of amorcacostestimate.
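The two changes above can be sketched as follows. This is an illustration under assumptions, not the actual translator code: the `AmOidSketch` values are placeholders rather than real catalog OIDs, and `IndexTypeFromAm` and `IsIndexSupportedSketch` are hypothetical names.

```cpp
#include <cassert>

// Placeholder access-method identifiers. These are NOT the real catalog
// OIDs; they only stand in for BTREE_AM_OID, HASH_AM_OID, and so on.
enum AmOidSketch
{
	kBtreeAm = 1,
	kBitmapAm,
	kGistAm,
	kGinAm,
	kBrinAm,
	kHashAm
};

enum EmdindexType
{
	EmdindBtree,
	EmdindBitmap,
	EmdindGist,
	EmdindGin,
	EmdindBrin,
	EmdindHash,
	EmdindUserDefined,
	EmdindSentinel
};

typedef double (*AmOrcaCostEstimateFunc)(void *ci);

// Unknown access methods no longer raise an error; they map to
// EmdindUserDefined instead.
EmdindexType
IndexTypeFromAm(unsigned am_oid)
{
	switch (am_oid)
	{
		case kBtreeAm:  return EmdindBtree;
		case kBitmapAm: return EmdindBitmap;
		case kGistAm:   return EmdindGist;
		case kGinAm:    return EmdindGin;
		case kBrinAm:   return EmdindBrin;
		case kHashAm:   return EmdindHash;
		default:        return EmdindUserDefined;
	}
}

// Support is decided by the presence of a cost callback, not by a
// hardcoded list of access methods.
bool
IsIndexSupportedSketch(unsigned am_oid, AmOrcaCostEstimateFunc cost_fn)
{
	assert(IndexTypeFromAm(am_oid) != EmdindSentinel);
	return cost_fn != nullptr;
}
```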

Metadata (SPIKING)

FIndexApplicable:

During the evaluation of FIndexApplicable, decide whether the new index can be treated as a Btree or a Bitmap index.
Determine which comparison types each index supports.
Add fields to record the decisions above.
Fall back to a safe plan when a decision has not been made yet.

On the ORCA side, a suitable location needs to be identified for attaching the user-defined cost estimate function. CMDIndexGPDB appears to be a natural choice, since the access method is linked to index relations in Postgres:

			  	CMDIndexGPDB *index = GPOS_NEW(mp) CMDIndexGPDB(
			  		mp, mdid_index, mdname, index_clustered, index_partitioned, index_type,
			  		mdid_item_type, index_key_cols_array, included_cols, op_families_mdids,
			  		nullptr,  // mdpart_constraint
			  		child_index_oids,
			  		index_rel->rd_indam->amorcacostestimate);

This enables us to access the function by using the RetrieveIndex method at any point where we have the relevant MDId.
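As a rough sketch of that lookup path, assuming simplified stand-ins: `MDIndexSketch` models only the new member of CMDIndexGPDB, `MDAccessorSketch` replaces the metadata accessor, and a plain integer replaces the MDId. All of these names are hypothetical.

```cpp
#include <cassert>
#include <unordered_map>

typedef double (*AmOrcaCostEstimateFunc)(void *ci);

// Hypothetical stand-in for CMDIndexGPDB: only the new member is modelled.
class MDIndexSketch
{
public:
	explicit MDIndexSketch(AmOrcaCostEstimateFunc fn) : m_cost_fn(fn) {}

	AmOrcaCostEstimateFunc OrcaCostEstimate() const { return m_cost_fn; }

private:
	AmOrcaCostEstimateFunc m_cost_fn;  // nullptr when the AM provides none
};

// Hypothetical stand-in for the metadata accessor, keyed by an integer.
class MDAccessorSketch
{
public:
	void Add(unsigned mdid, const MDIndexSketch &index)
	{
		assert(m_indexes.count(mdid) == 0);  // each id is added once
		m_indexes.emplace(mdid, index);
	}

	// Returns nullptr when no index is known under this id.
	const MDIndexSketch *RetrieveIndex(unsigned mdid) const
	{
		auto it = m_indexes.find(mdid);
		return it == m_indexes.end() ? nullptr : &it->second;
	}

private:
	std::unordered_map<unsigned, MDIndexSketch> m_indexes;
};

// Illustrative callback returning a constant cost.
double FlatCost(void *) { return 42.0; }
```

Any code that holds the id can then reach the callback through `RetrieveIndex(id)->OrcaCostEstimate()`, after checking both results for null.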

Cost Model (SPIKING)

Currently, ORCA relies on a switch/case implementation to select distinct cost models for its operators. Associating cost models directly with operators is the direction we're heading. A potential long-term objective might look like the following:

			  CCost
			  CCostModelGPDB::Cost(
			  	CExpressionHandle &exprhdl,	 // handle gives access to expression properties
			  	const SCostingInfo *pci) const
			  {
			  	GPOS_ASSERT(nullptr != pci);
			  	COperator::EOperatorId op_id = exprhdl.Pop()->Eopid();
			  	// All information the customized cost estimate function needs to know
			  	CostInfo ci;
			  	// Cost() is the proposed per-operator method; it does not exist in ORCA yet
			  	return exprhdl.Pop()->Cost(&ci);
			    
			  	/* CURRENT IMPLEMENTATION */
			  	// switch (op_id)
			  	// {
			  	// 	default:
			  	// 	{
			  	// 		// FIXME: macro this?
			  	// 		__builtin_unreachable();
			  	// 	}
			  	// 	case COperator::EopPhysicalTableScan:
			  	// 	case COperator::EopPhysicalDynamicTableScan:
			  	// 	case COperator::EopPhysicalExternalScan:
			  	// 	{
			  	// 		return CostScan(m_mp, exprhdl, this, pci);
			  	// 	}
			  
			  	// 	case COperator::EopPhysicalFilter:
			  	// 	{
			  	// 		return CostFilter(m_mp, exprhdl, this, pci);
			  	// 	}
			  
			  	// 	case COperator::EopPhysicalIndexOnlyScan:
			  	// 	{
			  	// 		return CostIndexOnlyScan(m_mp, exprhdl, this, pci);
			  	// 	}
			  	// } 
			  }
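To make the intended direction concrete, here is a self-contained sketch of per-operator costing through a virtual method. It is an assumption about the eventual shape, not ORCA code: `OperatorSketch`, `TableScanSketch`, `IndexScanSketch`, `CostModelCost`, and `HalfCost` are hypothetical, and `CostInfo` is reduced to two fields.

```cpp
#include <cassert>

// Reduced stand-in for the proposed CostInfo.
struct CostInfo
{
	double dRows;
	double dTupCostUnit;
};

typedef double (*AmOrcaCostEstimateFunc)(CostInfo *ci);

// Hypothetical operator base class: each operator costs itself, so the
// cost model needs no switch over operator ids.
class OperatorSketch
{
public:
	virtual ~OperatorSketch() = default;
	virtual double Cost(const CostInfo &ci) const = 0;
};

class TableScanSketch : public OperatorSketch
{
public:
	double Cost(const CostInfo &ci) const override
	{
		return ci.dRows * ci.dTupCostUnit;
	}
};

// An index scan delegates to the callback supplied by the access method.
class IndexScanSketch : public OperatorSketch
{
public:
	explicit IndexScanSketch(AmOrcaCostEstimateFunc fn) : m_fn(fn)
	{
		assert(fn != nullptr);
	}

	double Cost(const CostInfo &ci) const override
	{
		CostInfo copy = ci;  // the C-style callback takes a mutable pointer
		return m_fn(&copy);
	}

private:
	AmOrcaCostEstimateFunc m_fn;
};

// The cost model simply dispatches through the virtual call.
double CostModelCost(const OperatorSketch &op, const CostInfo &ci)
{
	return op.Cost(ci);
}

// Illustrative callback: half the per-row cost of the table scan above.
double HalfCost(CostInfo *ci)
{
	return 0.5 * ci->dRows * ci->dTupCostUnit;
}
```

Adding a new operator then means adding a class, and the cost model's dispatch function stays untouched.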

However, this is a substantial change that cannot be implemented in a single commit. The initial step would be to support two operators (PhysicalIndexScan and PhysicalTableScan) in a polymorphic style rather than a switch/case style.
An initial implementation could look like:

			  CCost
			  CCostModelGPDB::Cost(
			  	CExpressionHandle &exprhdl,	 // handle gives access to expression properties
			  	const SCostingInfo *pci) const
			  {
			  	GPOS_ASSERT(nullptr != pci);
			  
			  	COperator::EOperatorId op_id = exprhdl.Pop()->Eopid();
			  
			  	if (op_id == COperator::EopPhysicalIndexScan)
			  	{
			  		CPhysicalIndexScan *pop = (CPhysicalIndexScan*) exprhdl.Pop();
			  		CMDAccessor *md_accessor = COptCtxt::PoctxtFromTLS()->Pmda();
			  		const IMDIndex *pmdindex = md_accessor->RetrieveIndex(pop->Pindexdesc()->MDId());
			  		if (pmdindex->OrcaCostEstimate() != nullptr)
			  		{
			  			return CostUserDefinedIndex(m_mp, exprhdl, this, pci, pmdindex->OrcaCostEstimate());
			  		}
			  	}
			  
			  	// all other operators presumably keep using the existing switch (op_id) for now
			  }
			  
			  // main job for this function is to create CostInfo for this particular operator
			  // the info for different operator seems to be different
			  CCost
			  CCostModelGPDB::CostUserDefinedIndex(CMemoryPool *,  // mp
			  							  CExpressionHandle &exprhdl,
			  							  const CCostModelGPDB *pcmgpdb,
			  							  const SCostingInfo *pci,
			  							  AmOrcaCostEstimateFunc cef)
			  {
			  	COperator *pop = exprhdl.Pop();
			  	CostInfo ci;
			  
			  	/* ORCA_AM_TODO: the way to get table width is currently related to pop. */
			  	ci.dTableWidth =
			  		CPhysicalScan::PopConvert(pop)->PstatsBaseTable()->Width().Get();
			  
			  	ci.dIndexFilterCostUnit =
			  		pcmgpdb->GetCostModelParams()
			  			->PcpLookup(CCostModelParamsGPDB::EcpIndexFilterCostUnit)
			  			->Get().Get();
			  	ci.dIndexScanTupCostUnit =
			  		pcmgpdb->GetCostModelParams()
			  			->PcpLookup(CCostModelParamsGPDB::EcpIndexScanTupCostUnit)
			  			->Get().Get();
			  	ci.dIndexScanTupRandomFactor =
			  		pcmgpdb->GetCostModelParams()
			  			->PcpLookup(CCostModelParamsGPDB::EcpIndexScanTupRandomFactor)
			  			->Get().Get();
			  
			  	GPOS_ASSERT(0 < ci.dIndexFilterCostUnit);
			  	GPOS_ASSERT(0 < ci.dIndexScanTupCostUnit);
			  	GPOS_ASSERT(0 < ci.dIndexScanTupRandomFactor);
			  
			  	ci.dRowsIndex = pci->Rows();
			  	ci.dNumRebinds = pci->NumRebinds();
			  
			  	ci.ulIndexKeys = CPhysicalIndexScan::PopConvert(pop)->Pindexdesc()->Keys();
			  
			  	return CCost(cef(&ci));
			  }

It's also worth noting that the CostInfo structure should be meticulously designed to ensure that it's universally applicable across different operators. We intend to maintain this structure in the PostgreSQL codebase using the C language. By doing so, ORCA and other third-party optimizers can rely on it as a dependable reference. An illustrative sample structure is as follows:

			  typedef struct CostInfo
			  {
			      double dTableWidth;
			      double dIndexFilterCostUnit;
			      double dIndexScanTupCostUnit;
			      double dIndexScanTupRandomFactor;
			      double dRowsIndex;
			      double dNumRebinds;
			      uint32_t ulIndexKeys;
			  } CostInfo;
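Because the struct is meant to live in C code and be consumed from C++, it may be worth checking at compile time that it stays C-compatible. The following is a minimal sketch under that assumption; `MakeCostInfo` and `HasValidUnits` are hypothetical helpers, not part of the proposal.

```cpp
#include <cassert>
#include <stdint.h>
#include <type_traits>

extern "C" {
typedef struct CostInfo
{
	double dTableWidth;
	double dIndexFilterCostUnit;
	double dIndexScanTupCostUnit;
	double dIndexScanTupRandomFactor;
	double dRowsIndex;
	double dNumRebinds;
	uint32_t ulIndexKeys;
} CostInfo;
}

// Compile-time checks that the struct can cross a C/C++ boundary safely.
static_assert(std::is_standard_layout<CostInfo>::value,
			  "CostInfo must keep a C-compatible layout");
static_assert(std::is_trivially_copyable<CostInfo>::value,
			  "CostInfo must be copyable byte by byte");

// Value-initialise every field to zero, so that fields added later
// start from a defined state.
CostInfo MakeCostInfo()
{
	CostInfo ci = CostInfo();
	return ci;
}

// Mirrors the GPOS_ASSERT checks on the unit parameters in
// CostUserDefinedIndex, but returns the result instead of aborting.
bool HasValidUnits(const CostInfo *ci)
{
	assert(ci != nullptr);
	return 0 < ci->dIndexFilterCostUnit &&
		   0 < ci->dIndexScanTupCostUnit &&
		   0 < ci->dIndexScanTupRandomFactor;
}
```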

Cost Weight (Unspecified)

How to determine the cost weight for a specific operator remains uncertain. Decisions should be based on existing methodologies, but we don't have enough information about them yet.

		  double
		  hashorcacostestimate(CostInfo* ci)
		  {
		  	// We don't need random IO cost. But we have some other costs. How to decide it.
		  	double dCostPerIndexRow = ci->dTableWidth * ci->dIndexScanTupCostUnit;
		  	return ci->dNumRebinds *
		  				 (ci->dRowsIndex * dCostPerIndexRow);
		  }
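To see what the formula above produces, here is a worked example with made-up inputs. The numbers are chosen only to keep the arithmetic easy to follow; they are not real ORCA cost parameters, `WorkedExample` is a hypothetical helper, and `CostInfo` is reduced to the four fields the function reads.

```cpp
#include <cassert>

// Reduced stand-in for CostInfo with only the fields used below.
struct CostInfo
{
	double dTableWidth;
	double dIndexScanTupCostUnit;
	double dRowsIndex;
	double dNumRebinds;
};

// Same formula as in the proposal: rebinds * rows * (width * unit).
double
hashorcacostestimate(CostInfo *ci)
{
	assert(ci != nullptr);
	double dCostPerIndexRow = ci->dTableWidth * ci->dIndexScanTupCostUnit;
	return ci->dNumRebinds * (ci->dRowsIndex * dCostPerIndexRow);
}

// width 8 * unit 0.25 = 2 per row; 100 rows = 200; 3 rebinds = 600.
double
WorkedExample()
{
	CostInfo ci = {8.0, 0.25, 100.0, 3.0};
	return hashorcacostestimate(&ci);
}
```

The result scales linearly in every input, so the open question is only which unit weights make it comparable to the costs ORCA assigns to other operators.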

This paper, "Counting, Enumerating, and Sampling of Execution Plans in a Cost-Based Query Optimizer", might give us some clues.

Property Enforcement (Unspecified)

Currently, only btscan has the order property among indexes. For other index scans, we can assume that they don't differ in properties from table scans.

Bitmap Index Scan & Index Only Scan (TODO)

To facilitate the integration of bitmap index scans and index-only scans, it's imperative to introduce specific flags that allow customized index providers to determine whether they are capable of supporting these types of scans.
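One option, offered here as an assumption rather than as part of the proposal, is to derive such flags from members that IndexAmRoutine already has: `amgettuple` and `amgetbitmap` may each be NULL, and `amcanreturn` may be NULL when index-only scans are not supported. The sketch below uses a trimmed stand-in struct with opaque pointers; `IndexAmRoutineSketch`, `ScanCapabilities`, and `DeriveCapabilities` are hypothetical names.

```cpp
#include <cassert>

// Trimmed stand-in for IndexAmRoutine. The callbacks are reduced to
// opaque pointers because only their presence matters here.
struct IndexAmRoutineSketch
{
	const void *amgettuple;   // plain index scan entry point, may be nullptr
	const void *amgetbitmap;  // bitmap scan entry point, may be nullptr
	const void *amcanreturn;  // index-only scan check, may be nullptr
};

struct ScanCapabilities
{
	bool index_scan;
	bool bitmap_scan;
	bool index_only_scan;
};

// Derives which scan kinds the optimizer may plan for this access method.
ScanCapabilities
DeriveCapabilities(const IndexAmRoutineSketch *routine)
{
	assert(routine != nullptr);
	ScanCapabilities caps;
	caps.index_scan = routine->amgettuple != nullptr;
	caps.bitmap_scan = routine->amgetbitmap != nullptr;
	// An index-only scan still fetches tuples through amgettuple.
	caps.index_only_scan =
		caps.index_scan && routine->amcanreturn != nullptr;
	return caps;
}
```

Whether presence of the callbacks is enough, or whether explicit ORCA-specific flags are still needed, is left open by the text above.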

Presently, the plan generated by ORCA appears to include information only for higher-level operators such as the bitmap heap scan, while the cost information pertaining to lower-level bitmap index scans seems to be absent. This discrepancy needs to be addressed in order to provide a comprehensive view of the plan's cost estimation, especially concerning the individual index scans and their associated costs.

postgres=# explain select * from t1 where a = 97 or a = 98;
                                    QUERY PLAN
-----------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..431.51 rows=1898 width=4)
   ->  Bitmap Heap Scan on t1  (cost=0.00..431.48 rows=633 width=4)
         Recheck Cond: ((a = 97) OR (a = 98))
         ->  BitmapOr  (cost=0.00..0.00 rows=0 width=0)
               ->  Bitmap Index Scan on t1_a_idx  (cost=0.00..0.00 rows=0 width=0)
                     Index Cond: (a = 97)
               ->  Bitmap Index Scan on t1_a_idx  (cost=0.00..0.00 rows=0 width=0)
                     Index Cond: (a = 98)
 Optimizer: Pivotal Optimizer (GPORCA)
(9 rows)

postgres=# drop index t1_a_idx;
DROP INDEX
postgres=# explain select * from t1 where a = 97 or a = 98;
                                       QUERY PLAN
----------------------------------------------------------------------------------------
 Gather Motion 3:1  (slice1; segments: 3)  (cost=0.00..431.51 rows=1898 width=4)
   ->  Bitmap Heap Scan on t1  (cost=0.00..431.48 rows=633 width=4)
         Recheck Cond: ((a = 97) OR (a = 98))
         ->  BitmapOr  (cost=0.00..0.00 rows=0 width=0)
               ->  Bitmap Index Scan on hash_index_t1  (cost=0.00..0.00 rows=0 width=0)
                     Index Cond: (a = 97)
               ->  Bitmap Index Scan on hash_index_t1  (cost=0.00..0.00 rows=0 width=0)
                     Index Cond: (a = 98)
 Optimizer: Pivotal Optimizer (GPORCA)
(9 rows)

TESTs (WIP)

Relevant tests must be developed and added to validate the changes.

Alternatives

Utilizing user-defined functions within SQL as opposed to linking functions to IndexAmRoutine.
Initially retrieving the Index, then associating ORCACostEstimateFunc with CExpression and CGroupExpression, and triggering it from expressions during cost computation, rather than re-retrieving the index.
Introducing new operators for UserDefinedIndexScan and UserDefinedTableScan, deviating from the use of the same operators as previously.
Developing an extension by informing developers about ORCA information and making it a dependency. This could allow direct injection of serialized cost models written in C++ to Postgres. If this approach proves unfeasible, referencing certain header files from ORCA might still be possible (so that we can use cost weights from orca directly).

Related issues

#80

Are you willing to submit a PR?

Yes I am willing to submit a PR!


wfnuser · 2023-08-15T10:02:39Z

wfnuser
Aug 15, 2023
Author

Following our discussions, it is imperative to devise a method of registering the cost estimation function for ORCA without establishing any dependence of PostgreSQL on ORCA, given that ORCA functions as a plugin within the PostgreSQL framework.

wfnuser Aug 17, 2023
Author

1. Register in a customized _PG_init.
2. Register by providing a register function (in gp_optimizer_function.c) and calling it from the extension SQL (like bloom--1.0.sql).

wfnuser Aug 22, 2023
Author

After a bit of exploration, I found that for a custom index it's quite challenging to get hold of the access method within _PG_init. Here's an illustrative example of how a custom index access method is created.

/* contrib/bloom/bloom--1.0.sql */

-- complain if script is sourced in psql, rather than via CREATE EXTENSION
\echo Use "CREATE EXTENSION bloom" to load this file. \quit

CREATE FUNCTION blhandler(internal)
RETURNS index_am_handler
AS 'MODULE_PATHNAME'
LANGUAGE C;

-- Access method
CREATE ACCESS METHOD bloom TYPE INDEX HANDLER blhandler;
COMMENT ON ACCESS METHOD bloom IS 'bloom index access method';

-- Opclasses

CREATE OPERATOR CLASS int4_ops
DEFAULT FOR TYPE int4 USING bloom AS
	OPERATOR	1	=(int4, int4),
	FUNCTION	1	hashint4(int4);

CREATE OPERATOR CLASS text_ops
DEFAULT FOR TYPE text USING bloom AS
	OPERATOR	1	=(text, text),
	FUNCTION	1	hashtext(text);

As the script shows, the access method can only be created after its amhandler function exists. However, the shared library is loaded, and _PG_init invoked, when the amhandler is created, so at that point the access method does not exist yet and cannot be looked up from within _PG_init.

While we can utilize the OID of blhandler to link the index access method with the supplied Orca cost estimate function, this approach is not only cumbersome but also counterintuitive.

At the moment, I lean more towards the second approach.

hw118118 Aug 22, 2023

Indeed, this approach is cumbersome, but for a user-defined AM you can follow these steps:

1. Establish the amhandler with the supplied ORCA cost estimate function (at the same time, you can get the OID of the new user-defined AM with the get_index_am_oid method).
2. Create the extension (like bloom--1.0.sql, including opclass, opgroup, etc.).
3. Use it like a normal internal AM.

So I don't think it's counterintuitive. Establishing the opclasses and opgroups may be difficult, but you can write a Perl script to do that.

wfnuser Aug 22, 2023
Author

"establish the amhandler with the supplied Orca cost estimate function"
Do you mean binding the ORCA cost estimate function to IndexAmRoutine (which is what we want to avoid)? I didn't quite get it. Can you explain it more specifically?

I also just discussed this with @my-ship-it, and he thinks it's okay to bind the amhandler to an ORCA cost estimate function (maybe an ORCA wrapper).
Let me give it a try first.

Answer selected by wfnuser

hw118118 Aug 23, 2023

I mean you can define an interface: typedef double( * cost_fuc)(void * ) ;

Then you can either assign the interface to IndexAmRoutine or to an existing member of IndexAmRoutine (method 1),
or export the interface with PGDLLIMPORT (method 2, like a hook).

In ORCA, you can then cast the void * to the struct (like the CostInfo* that you mentioned) and compute the cost.

I'm not sure whether I made that clear.
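The opaque-pointer interface described in this reply could be sketched as follows. It is an illustration under assumptions: `CostInfo` is reduced to two fields, and `OrcaSideCost`, `orca_cost_hook`, and `CallCostHook` are hypothetical names. In PostgreSQL the hook variable would additionally be declared with PGDLLIMPORT; that marker is left out so the sketch compiles on its own.

```cpp
#include <cassert>

// The interface from the reply: the PostgreSQL side only sees an opaque
// pointer, so it needs no ORCA headers.
typedef double (*cost_fuc)(void *);

// Reduced stand-in for CostInfo.
struct CostInfo
{
	double dRowsIndex;
	double dIndexScanTupCostUnit;
};

// ORCA-side implementation: cast the opaque pointer back to the struct.
double
OrcaSideCost(void *opaque)
{
	assert(opaque != nullptr);
	CostInfo *ci = static_cast<CostInfo *>(opaque);
	return ci->dRowsIndex * ci->dIndexScanTupCostUnit;
}

// Method 2 (hook style): a global pointer that an extension may set.
cost_fuc orca_cost_hook = nullptr;

// Calls the hook when one is installed, otherwise returns the fallback.
double
CallCostHook(CostInfo *ci, double fallback)
{
	return orca_cost_hook != nullptr ? orca_cost_hook(ci) : fallback;
}
```

The cast is only safe as long as both sides agree on the struct definition, which is one reason to keep `CostInfo` in a single shared C header.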

wfnuser Aug 23, 2023
Author

Agreed. In fact, that's what I did; the documentation may just be unclear. If you're interested, you can take a look at my draft implementation (method 1).

I'll look into PGDLLIMPORT for more details. I'm not fully familiar with PostgreSQL's extension mechanism yet. My intention is to have extension developers register the function in ORCA's in-memory hashmap from their _PG_init. Does this match what you have in mind?

wfnuser · 2023-08-22T07:34:01Z

wfnuser
Aug 22, 2023
Author

Additionally, I've come across another challenge that needs addressing if we intend to use _PG_init or extension SQL to invoke a yet-to-be-implemented registration function in ORCA. For the native access methods, there is no _PG_init or extension SQL at all. This means that uniform support for both native and user-defined index access methods might be hard to achieve, which somewhat favors the original proposal.

Need some advice.

wfnuser · 2023-08-22T11:11:15Z

wfnuser
Aug 22, 2023
Author

Quick update.

Using a static struct to register the native index AM works, and the decoupling succeeded. @my-ship-it @hw118118

#include "gpopt/utils/CAccessMethodRegistry.h"

std::unordered_map<gpos::ULONG, AmOrcaCostEstimateFunc> gpos::CAccessMethodRegistry::costFunctionRegistry;

void gpos::CAccessMethodRegistry::RegisterCostEstimateFunction(ULONG oid, AmOrcaCostEstimateFunc func) {
    costFunctionRegistry[oid] = func;
}

AmOrcaCostEstimateFunc gpos::CAccessMethodRegistry::GetCostEstimateFunction(ULONG oid) {
    auto it = costFunctionRegistry.find(oid);
    if (it != costFunctionRegistry.end()) {
        return it->second;
    }
    return nullptr;
}

namespace {
    double hashorcacostestimate(CostInfo* ci)
    {
        double dCostPerIndexRow = ci->dTableWidth * ci->dIndexScanTupCostUnit;
        return ci->dNumRebinds *
                    (ci->dRowsIndex * dCostPerIndexRow);
    }

    struct CostFunctionRegistration {
        CostFunctionRegistration() {
            // 331 appears to be the pg_proc OID of hashhandler, the handler of the native hash AM
            gpos::CAccessMethodRegistry::RegisterCostEstimateFunction(331, hashorcacostestimate);
        }
    };

    static CostFunctionRegistration registration;
}

The next step is to use an extension to register the user-defined index access method when the extension is created. Another open point is how to handle the removal of an extension; the _PG_fini function might be a viable solution here.
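A registry that supports both registration and removal could look roughly like the sketch below. It is a self-contained illustration, not the draft implementation: `AccessMethodRegistrySketch`, `BloomCostSketch`, `ExtensionLoadSketch`, `ExtensionUnloadSketch`, and the OID value are all hypothetical, and the load and unload functions only stand in for whatever hooks the extension ends up using.

```cpp
#include <cassert>
#include <unordered_map>

typedef double (*AmOrcaCostEstimateFunc)(void *ci);

// Hypothetical registry with an unregister operation for extension removal.
class AccessMethodRegistrySketch
{
public:
	static void Register(unsigned oid, AmOrcaCostEstimateFunc fn)
	{
		assert(fn != nullptr);
		Map()[oid] = fn;
	}

	// Returns true when an entry was actually removed.
	static bool Unregister(unsigned oid) { return Map().erase(oid) > 0; }

	// Returns nullptr when nothing is registered under this OID.
	static AmOrcaCostEstimateFunc Lookup(unsigned oid)
	{
		auto it = Map().find(oid);
		return it == Map().end() ? nullptr : it->second;
	}

private:
	// A function-local static avoids depending on the initialisation
	// order of static objects across translation units.
	static std::unordered_map<unsigned, AmOrcaCostEstimateFunc> &Map()
	{
		static std::unordered_map<unsigned, AmOrcaCostEstimateFunc> map;
		return map;
	}
};

// Illustrative cost callback returning a constant.
double BloomCostSketch(void *) { return 1.0; }

// Made-up OID used only for this example.
const unsigned kExampleAmOid = 99999;

// What an extension's load and unload paths might call.
void ExtensionLoadSketch()
{
	AccessMethodRegistrySketch::Register(kExampleAmOid, BloomCostSketch);
}

void ExtensionUnloadSketch()
{
	AccessMethodRegistrySketch::Unregister(kExampleAmOid);
}
```

After the unload path runs, `Lookup` returns nullptr again, so the optimizer would stop considering indexes of the removed access method.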


wfnuser · 2023-08-23T10:30:46Z

wfnuser
Aug 23, 2023
Author

Thanks @my-ship-it and @hw118118 for the code review and guidance. I'm currently in the midst of transitioning out and may also be taking a break for a while. Nevertheless, if anyone is interested, feel free to reach out to me for discussions. Once I have more time in the future, I might continue exploring this issue in my spare time.


tuhaihe · 2023-09-07T08:49:18Z

tuhaihe
Sep 7, 2023
Collaborator

Hi @wfnuser thanks for your proposal! Your proposal is named No.2 and you can access it through the following URL: https://github.com/cloudberrydb/community/blob/main/proposals/cp-2/cp-2.md.
