Skip to content

pengrongbo/pg_index

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

21 Commits
 
 
 
 
 
 

Repository files navigation

Optimizing LIKE and ILIKE Queries and Index Usage in PostgreSQL

In PostgreSQL database design, effective optimization of LIKE and ILIKE queries requires careful index planning, especially when dealing with large datasets and complex queries. This guide provides best practices based on field type and data volume to help you design efficient queries and index structures in different scenarios.

1. Text Fields Suitable for LIKE and ILIKE Queries

Different lengths of text fields have varying performance characteristics, and the appropriate query method should be chosen based on the actual scenario:

  • Short text fields (< 255 characters): Fields like usernames, product names, and tags are well-suited for LIKE and ILIKE queries. With shorter fields, the query overhead is lower, providing better performance.
  • Medium-length text fields (255-1000 characters): For fields like descriptions or summaries, adding indexes significantly improves query performance, especially with larger datasets.
  • Long text fields (> 1000 characters): For long text fields such as article content or product details, LIKE and ILIKE queries are not recommended as performance will degrade significantly. In these cases, specialized full-text search tools like PostgreSQL's built-in full-text search or external services like Elasticsearch should be considered.

2. When to Add Indexes: Based on Row Count and Query Type

To optimize the performance of LIKE and ILIKE queries, adding appropriate indexes is essential. The timing for adding indexes should depend on the number of rows in the table and the query patterns:

  • Small datasets (< 1000 rows): For small datasets, LIKE and ILIKE queries generally do not need special optimization as the performance is usually acceptable.
  • Medium-sized datasets (1000 to 10,000 rows): When the table grows to thousands of rows, frequent LIKE and ILIKE queries can lead to performance degradation. At this stage, you should choose appropriate indexes based on the query pattern:
    • Prefix matching queries (LIKE 'term%'): Use B-tree indexes for optimization. B-tree indexes are suitable for prefix matching queries but cannot handle complex fuzzy searches or ILIKE queries.
    • Complex fuzzy queries (%term% or _term%): Since B-tree indexes are ineffective for searches with wildcards, enabling the pg_trgm (trigram) extension and using GIN or GiST indexes can significantly improve performance. pg_trgm breaks down strings into trigrams and can accelerate fuzzy matching queries effectively.
  • Large datasets (> 10,000 rows): For larger datasets, the performance of LIKE and ILIKE queries may degrade significantly, especially with fuzzy matching. In these cases, it is recommended to use the pg_trgm extension and leverage GIN or GiST indexes to optimize the query speed for complex pattern matching queries like %term%.

3. Choosing the Right Index Type

  • B-tree Indexes: Suitable for prefix matching LIKE queries (such as LIKE 'term%'), but ineffective for %term% queries with wildcards. Additionally, B-tree indexes do not support ILIKE queries, as ILIKE ignores case, and B-tree does not handle case-insensitive searches.
  • GIN Indexes with pg_trgm Extension: Using GIN indexes with the pg_trgm extension can significantly accelerate complex fuzzy queries, especially when wildcards appear in the middle or at the beginning of the string.
  • GiST Indexes with pg_trgm Extension: Similar to GIN indexes, GiST indexes can handle complex fuzzy queries, but they generally perform better for range queries and full-text search scenarios.

Examples and Tests:

For detailed examples of enabling the pg_trgm extension, creating indexes, and optimizing queries, please refer to the examples and test documentation in the relevant subdirectories of the project.

Conclusion:

In PostgreSQL, the optimization of LIKE and ILIKE queries depends on data volume, query patterns, and the length of text fields. B-tree indexes are suitable for simple prefix matching queries, while complex fuzzy queries and case-insensitive ILIKE searches require the pg_trgm extension combined with GIN or GiST indexes. For large datasets, effectively utilizing indexes and extensions will significantly improve query performance and ensure stability and efficiency in high-concurrency, complex query scenarios.

About

Fuzzy match/full text search performance

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published