hidden_items_v2 performance optimization #766

Closed
dreamer20150420 opened this issue Sep 2, 2023 · 3 comments

Comments

@dreamer20150420 commented Sep 2, 2023

Is your feature request related to a problem? Please describe.
All hidden items are stored in a single zset, which makes the zset too large and causes Redis to time out.

Describe the solution you'd like
Maybe a Bloom filter could be used, but garbage collection in the cache is an unavoidable problem.
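A minimal sketch of the Bloom-filter idea, assuming the github.com/bits-and-blooms/bloom/v3 package and hypothetical item IDs (neither is part of gorse):

```go
package main

import (
	"fmt"

	"github.com/bits-and-blooms/bloom/v3"
)

func main() {
	// Size the filter for an expected number of hidden items and a
	// tolerable false-positive rate; both numbers are assumptions.
	filter := bloom.NewWithEstimates(1_000_000, 0.01)

	// Record hidden item IDs (hypothetical IDs for illustration).
	filter.AddString("item-42")
	filter.AddString("item-7")

	// Membership test: false negatives are impossible, but false positives
	// can occasionally report an item as hidden when it never was.
	fmt.Println(filter.TestString("item-42")) // true
	fmt.Println(filter.TestString("item-99")) // very likely false

	// A plain Bloom filter cannot delete members, which is why garbage
	// collection stays a problem: the filter has to be rebuilt periodically.
}
```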

@dreamer20150420 (Author)


If the number of deleted post IDs is large, the Sorted Set stored in Redis becomes a big key. The following optimizations are worth considering:

Sharding: Split the large Sorted Set into multiple smaller ones. Distribute the post IDs to different Sorted Sets based on a certain rule, such as the hash value or range of the post ID. This way, the data will be scattered across multiple smaller Sorted Sets, avoiding the issue of a single Sorted Set becoming too large.
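A minimal sketch of the sharding idea, assuming the github.com/redis/go-redis/v9 client and a hypothetical hidden_items:{shard} key pattern (not the actual gorse cache layout):

```go
package main

import (
	"context"
	"fmt"
	"hash/fnv"
	"time"

	"github.com/redis/go-redis/v9"
)

const numShards = 16 // assumed shard count

// shardKey maps an item ID to one of numShards smaller Sorted Sets.
func shardKey(itemID string) string {
	h := fnv.New32a()
	h.Write([]byte(itemID))
	return fmt.Sprintf("hidden_items:%d", h.Sum32()%numShards)
}

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})

	// Hide an item: add it to its shard, scored by the hide timestamp.
	itemID := "item-42" // hypothetical ID
	rdb.ZAdd(ctx, shardKey(itemID), redis.Z{
		Score:  float64(time.Now().Unix()),
		Member: itemID,
	})

	// Check whether an item is hidden: only its shard is touched,
	// so no single Sorted Set grows without bound.
	if err := rdb.ZScore(ctx, shardKey(itemID), itemID).Err(); err == redis.Nil {
		fmt.Println("not hidden")
	} else if err == nil {
		fmt.Println("hidden")
	}
}
```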

Regular archiving: Move expired post IDs from the current Sorted Set to an archived Sorted Set. This can be done through a scheduled task or a background process. The archived Sorted Set can be stored elsewhere (such as another cache storage, a database, etc.), or you can archive older post IDs into different Sorted Sets based on your needs.
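A sketch of the archiving idea under the same assumptions (go-redis/v9, hypothetical key names); a scheduled job would move entries whose hide timestamp is older than a cutoff into an archive Sorted Set:

```go
package hidden

import (
	"context"
	"strconv"

	"github.com/redis/go-redis/v9"
)

// archiveOldHiddenItems moves members scored before cutoff (a Unix
// timestamp) from the live Sorted Set to an archive Sorted Set.
func archiveOldHiddenItems(ctx context.Context, rdb *redis.Client, cutoff int64) error {
	maxScore := strconv.FormatInt(cutoff, 10)

	// Fetch the expired members together with their scores.
	old, err := rdb.ZRangeByScoreWithScores(ctx, "hidden_items", &redis.ZRangeBy{
		Min: "-inf",
		Max: maxScore,
	}).Result()
	if err != nil || len(old) == 0 {
		return err
	}

	// Copy them into the archive set, then drop them from the live set.
	// (For strict atomicity this could be wrapped in a MULTI/EXEC pipeline.)
	if err := rdb.ZAdd(ctx, "hidden_items:archive", old...).Err(); err != nil {
		return err
	}
	return rdb.ZRemRangeByScore(ctx, "hidden_items", "-inf", maxScore).Err()
}
```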

Use alternative data structures: If the large number of elements in the Sorted Set causes performance degradation, you can consider using other more suitable data structures to store the deleted post IDs. For example, you can use Redis’ HyperLogLog data structure to estimate the cardinality of the post IDs and perform deletions based on certain rules.
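For completeness, a sketch of the HyperLogLog idea with go-redis/v9 and a hypothetical key name. Note that a HyperLogLog only yields an approximate distinct count; it cannot list or test individual members, so it complements rather than replaces the Sorted Set:

```go
package hidden

import (
	"context"

	"github.com/redis/go-redis/v9"
)

// trackHidden records an item ID in a HyperLogLog (constant memory,
// roughly 0.81% standard error) and returns the approximate number of
// distinct hidden IDs seen so far.
func trackHidden(ctx context.Context, rdb *redis.Client, itemID string) (int64, error) {
	if err := rdb.PFAdd(ctx, "hidden_items:hll", itemID).Err(); err != nil {
		return 0, err
	}
	return rdb.PFCount(ctx, "hidden_items:hll").Result()
}
```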

Distributed caching: If you need to handle very large-scale post ID data and a single Redis instance cannot meet the performance requirements, you may consider a distributed caching system such as Memcached or Redis Cluster. These systems scale horizontally and provide higher capacity and throughput.
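A minimal sketch of pointing the same client code at a Redis Cluster with go-redis/v9 (node addresses are placeholders); the per-shard keys from the sharding sketch then spread across cluster hash slots automatically:

```go
package main

import (
	"context"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()

	// Placeholder node addresses; go-redis routes each key to the node
	// owning its hash slot, so capacity scales horizontally with nodes.
	rdb := redis.NewClusterClient(&redis.ClusterOptions{
		Addrs: []string{"redis-0:6379", "redis-1:6379", "redis-2:6379"},
	})
	defer rdb.Close()

	// The per-shard keys used earlier work unchanged on a cluster.
	rdb.ZAdd(ctx, "hidden_items:3", redis.Z{Score: 1, Member: "item-42"})
	_ = rdb.Ping(ctx).Err()
}
```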

The choice of optimization method should be based on your specific business requirements and system environment, weighing factors such as data scale, performance needs, implementation complexity, and feasibility.

@zhenghaoz (Collaborator)

0.5 will optimize the storage of the recommendation cache; the trade-off is that Redis will no longer be supported and RediSearch will be required.

