hidden_items_v2 performance optimization #766

Closed
dreamer20150420 opened this issue Sep 2, 2023 · 3 comments

Comments

@dreamer20150420 commented Sep 2, 2023

Is your feature request related to a problem? Please describe.
All hidden items are stored in a single zset, which makes the zset too large and causes Redis to time out.

Describe the solution you'd like
Maybe a Bloom filter could be used, but garbage collection in the cache is an unavoidable problem.
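A minimal sketch of the Bloom-filter idea, assuming the github.com/bits-and-blooms/bloom/v3 package and hypothetical item IDs (neither is part of gorse):

```go
package main

import (
	"fmt"

	"github.com/bits-and-blooms/bloom/v3"
)

func main() {
	// Size the filter for an expected number of hidden items and a
	// tolerable false-positive rate; both numbers are assumptions.
	filter := bloom.NewWithEstimates(1_000_000, 0.01)

	// Record hidden item IDs (hypothetical IDs for illustration).
	filter.AddString("item-42")
	filter.AddString("item-7")

	// Membership test: false negatives are impossible, but false positives
	// can occasionally report an item as hidden when it never was.
	fmt.Println(filter.TestString("item-42")) // true
	fmt.Println(filter.TestString("item-99")) // very likely false

	// A plain Bloom filter cannot delete members, which is why garbage
	// collection stays a problem: the filter has to be rebuilt periodically.
}
```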

@dreamer20150420 (Author)


If the number of deleted post IDs is large, the Sorted Set stored in Redis becomes a big key. The following optimizations are worth considering:

Sharding: Split the large Sorted Set into multiple smaller ones. Distribute the post IDs to different Sorted Sets based on a certain rule, such as the hash value or range of the post ID. This way, the data will be scattered across multiple smaller Sorted Sets, avoiding the issue of a single Sorted Set becoming too large.
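A minimal sketch of the sharding idea, assuming the github.com/redis/go-redis/v9 client and a hypothetical hidden_items:{shard} key pattern (not the actual gorse cache layout):

```go
package main

import (
	"context"
	"fmt"
	"hash/fnv"
	"time"

	"github.com/redis/go-redis/v9"
)

const numShards = 16 // assumed shard count

// shardKey maps an item ID to one of numShards smaller Sorted Sets.
func shardKey(itemID string) string {
	h := fnv.New32a()
	h.Write([]byte(itemID))
	return fmt.Sprintf("hidden_items:%d", h.Sum32()%numShards)
}

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	// Hide an item: add it to its shard, scored by the hide timestamp.
	itemID := "item-42" // hypothetical ID
	rdb.ZAdd(ctx, shardKey(itemID), redis.Z{
		Score:  float64(time.Now().Unix()),
		Member: itemID,
	})

	// Check whether an item is hidden: only its shard is touched,
	// so no single Sorted Set grows without bound.
	if err := rdb.ZScore(ctx, shardKey(itemID), itemID).Err(); err == redis.Nil {
		fmt.Println("not hidden")
	} else if err == nil {
		fmt.Println("hidden")
	}
}
```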

Regular archiving: Move expired post IDs from the current Sorted Set to an archived Sorted Set. This can be done through a scheduled task or a background process. The archived Sorted Set can be stored elsewhere (such as another cache storage, a database, etc.), or you can archive older post IDs into different Sorted Sets based on your needs.
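A sketch of the archiving idea under the same assumptions (go-redis/v9, hypothetical key names); a scheduled job would move entries whose hide timestamp is older than a cutoff into an archive Sorted Set:

```go
package hidden

import (
	"context"
	"strconv"

	"github.com/redis/go-redis/v9"
)

// archiveOldHiddenItems moves members scored before cutoff (a Unix
// timestamp) from the live Sorted Set to an archive Sorted Set.
func archiveOldHiddenItems(ctx context.Context, rdb *redis.Client, cutoff int64) error {
	maxScore := strconv.FormatInt(cutoff, 10)

	// Fetch the expired members together with their scores.
	old, err := rdb.ZRangeByScoreWithScores(ctx, "hidden_items", &redis.ZRangeBy{
		Min: "-inf",
		Max: maxScore,
	}).Result()
	if err != nil || len(old) == 0 {
		return err
	}

	// Copy them into the archive set, then drop them from the live set.
	// (For strict atomicity this could be wrapped in a MULTI/EXEC pipeline.)
	if err := rdb.ZAdd(ctx, "hidden_items:archive", old...).Err(); err != nil {
		return err
	}
	return rdb.ZRemRangeByScore(ctx, "hidden_items", "-inf", maxScore).Err()
}
```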

Use alternative data structures: If the large number of elements in the Sorted Set causes performance degradation, you can consider using other more suitable data structures to store the deleted post IDs. For example, you can use Redis’ HyperLogLog data structure to estimate the cardinality of the post IDs and perform deletions based on certain rules.
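For completeness, a sketch of the HyperLogLog idea with go-redis/v9 and a hypothetical key name. Note that a HyperLogLog only yields an approximate distinct count; it cannot list or test individual members, so it complements rather than replaces the Sorted Set:

```go
package hidden

import (
	"context"

	"github.com/redis/go-redis/v9"
)

// trackHidden records an item ID in a HyperLogLog (constant memory,
// roughly 0.81% standard error) and returns the approximate number of
// distinct hidden IDs seen so far.
func trackHidden(ctx context.Context, rdb *redis.Client, itemID string) (int64, error) {
	if err := rdb.PFAdd(ctx, "hidden_items:hll", itemID).Err(); err != nil {
		return 0, err
	}
	return rdb.PFCount(ctx, "hidden_items:hll").Result()
}
```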

Distributed caching: If you need to handle very large-scale post ID data and a single Redis instance cannot meet the performance requirements, you may consider a distributed caching system such as Memcached or Redis Cluster. These systems scale horizontally and provide higher capacity and throughput.
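A minimal sketch of pointing the same client code at a Redis Cluster with go-redis/v9 (node addresses are placeholders); the per-shard keys from the sharding sketch then spread across cluster hash slots automatically:

```go
package main

import (
	"context"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()

	// Placeholder node addresses; go-redis routes each key to the node
	// owning its hash slot, so capacity scales horizontally with nodes.
	rdb := redis.NewClusterClient(&redis.ClusterOptions{
		Addrs: []string{"redis-0:6379", "redis-1:6379", "redis-2:6379"},
	})
	defer rdb.Close()

	// The per-shard keys used earlier work unchanged on a cluster.
	rdb.ZAdd(ctx, "hidden_items:3", redis.Z{Score: 1, Member: "item-42"})
	_ = rdb.Ping(ctx).Err()
}
```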

The choice of optimization method should be based on your specific business requirements and system environment, weighing factors such as data scale, performance needs, implementation complexity, and feasibility.

@zhenghaoz (Collaborator)

0.5 will optimize the storage of the recommendation cache; the trade-off is that Redis will no longer be supported and RediSearch will be required.

