
perf: accelerate the creation of the consumer cache #11840

Open · wants to merge 6 commits into master

Conversation

@xuruidong (Contributor)

Description

In my deployment there are approximately 17,000+ consumers. Any update to a consumer triggers a long-tail request; I found that the functions `consumer.plugin` and `consumer.consumers_kv` cost too much time.

Test steps

Update one consumer, send a request. Repeat 5 times.

Test result:

Before optimization (unit: seconds):

| consumer.plugin | consumer.consumers_kv |
|----------------:|----------------------:|
| 0.192824 | 3.209864 |
| 0.178377 | 2.837157 |
| 0.156748 | 3.573214 |
| 0.152296 | 2.720143 |
| 0.161006 | 2.530867 |

After optimization (unit: seconds):

| consumer.plugin | consumer.consumers_kv |
|----------------:|----------------------:|
| 0.050382 | 0.036338 |
| 0.021982 | 0.016851 |
| 0.023093 | 0.017966 |
| 0.022090 | 0.016934 |

Fixes # (issue)

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

@dosubot added the `size:M` (This PR changes 30-99 lines, ignoring generated files) and `performance` (generate flamegraph for the current PR) labels on Dec 18, 2024.
@praswicaksono

Does purging the local cache require restarting APISIX?

@membphis (Member)

This is the core reason why the new way performs better: using the cached data avoids repeatedly calling `secret.fetch_secrets`.

[flamegraph image]

I think we can use `lrucache`, which is an easier way. Here is a code example:

```lua
local lrucache = core.lrucache.new({
    -- we can change `count` as we need; in your case, with 17,000+
    -- consumers, it should likely be larger than that number
    ttl = 300, count = 5000
})

local new_consumer, err = lrucache(consumer, nil,
                                   secret.fetch_secrets, consumer.auth_conf, false)
```

I cannot 100% confirm that it works; welcome to test this approach, many thanks.

@xuruidong (Contributor · Author)

> I think we can use `lrucache`, which is an easier way. [...] I cannot 100% confirm that it works; welcome to test this approach, many thanks.

Let me make the CI happy first, and then I'll try your suggestion.

Signed-off-by: xuruidong <[email protected]>
@xuruidong (Contributor · Author)

ping @membphis

Review thread on `apisix/consumer.lua`:

```lua
            conf_version = consumers.conf_version
        }
    end

    local cached_consumer = consumers_id_cache[val.value.id]
```
Member:

I do not know why we have to add `consumers_id_cache`; it seems useless.

Contributor (Author):

The purpose of `consumers_id_cache` is to avoid unnecessary table copies. I will refactor this part of the code to replace the table with an LRU cache.
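For illustration, a minimal sketch of that idea; the helper `get_or_build` and the use of etcd's `modifiedIndex` here are assumptions for the example, not the PR's exact code:

```lua
local core = require("apisix.core")

-- Sketch only: index consumers by id so that an unchanged consumer reuses
-- the table built for it previously, instead of being cloned and
-- re-processed on every rebuild of the consumer cache.
local consumers_id_cache = {}

local function get_or_build(val)
    local cached = consumers_id_cache[val.value.id]
    -- `modifiedIndex` is etcd's per-key revision; if it is unchanged,
    -- the consumer itself is unchanged and the old table can be reused
    if cached and cached.modifiedIndex == val.modifiedIndex then
        return cached
    end

    local consumer = core.table.clone(val.value)
    consumer.modifiedIndex = val.modifiedIndex
    consumers_id_cache[val.value.id] = consumer
    return consumer
end
```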

(Two further review threads on `apisix/consumer.lua` were marked outdated and resolved.)
@xuruidong (Contributor · Author)

> Does purging the local cache require restarting APISIX?

Yes.

@dosubot added the `size:L` (This PR changes 100-499 lines, ignoring generated files) label and removed `size:M` on Dec 27, 2024.
```diff
@@ -101,46 +152,21 @@ local function plugin_consumer()
         if not plugins[name] then
             plugins[name] = {
                 nodes = {},
                 len = 0,
```
Contributor (Author):

The purpose of introducing the `len` variable here is to avoid the redundant table-length computation that `table.insert` performs on every call. Refer to the `table.insert` section of https://api7.ai/learning-center/openresty/lua-table-and-metatable.
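As a plain-Lua illustration of that point (not the PR's code), compare appending with an explicit counter against `table.insert`, which has to determine the current table length on every call:

```lua
local nodes = {}
local len = 0

for i = 1, 100000 do
    -- O(1) append: the length is tracked in `len`, never recomputed
    len = len + 1
    nodes[len] = { id = i }

    -- equivalent, but determines the table length on every call:
    -- table.insert(nodes, { id = i })
end
```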

@xuruidong (Contributor · Author)

The code introduces a variable, `consumers_count_for_lrucache`, which sets the size of the LRU cache. In practice this value must be configured to be greater than the actual number of consumers. Additionally, if a consumer has multiple authentication plugins, each plugin's view of that consumer is treated as a separate consumer and cached individually in the LRU cache.

When the number of consumers exceeds twice the size of the LRU cache, building the cache in the `consumer.plugin` function may miss on every lookup, degrading performance. The LRU cache can only hold one window of consumers, and each time a consumer is updated, the order of consumer IDs in the `consumers.values` array most likely stays the same or changes very little. As a result, new consumer objects are created continuously while the old cache entries are evicted; see the simulation of this eviction pattern below.


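For illustration, a small plain-Lua simulation of that eviction pattern (hypothetical code, not part of the PR). FIFO eviction is used for simplicity; on a workload where every lookup misses, it evicts the same entries an LRU cache would:

```lua
-- Repeatedly scan `n` keys in a stable order through a cache of `size`
-- entries and count the hits across three passes.
local function simulate(n, size)
    local cache, queue, hits = {}, {}, 0
    for _ = 1, 3 do
        for key = 1, n do
            if cache[key] then
                hits = hits + 1
            else
                queue[#queue + 1] = key
                cache[key] = true
                if #queue > size then
                    -- evict the oldest entry; with zero hits this matches
                    -- what an LRU cache would evict as well
                    local evicted = table.remove(queue, 1)
                    cache[evicted] = nil
                end
            end
        end
    end
    return hits
end

print(simulate(100, 200))  -- cache larger than the key set: 200 hits
print(simulate(100, 50))   -- cache smaller than the key set: 0 hits
```

Every key is evicted before the scan comes back around to it, so the hit rate collapses to zero as soon as the key set no longer fits in the cache window.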

@xuruidong (Contributor · Author)

ping @membphis

Review thread on `apisix/consumer.lua`:

```lua
})

local function fill_consumer_secret(consumer)
    local new_consumer = core.table.clone(consumer)
```
Member:

Do you think we should use `deep_clone`? Is it safer?

Contributor (Author):

I think a deep copy is not necessary. The `consumer` passed to `fill_consumer_secret(consumer)` is created in the `construct_consumer_data` function and is already a copy of an element of the `consumers.values` list.
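To illustrate the trade-off, here is a plain-Lua sketch (a stand-in for `core.table.clone`, not APISIX code): a shallow clone copies only top-level fields, so nested tables such as `auth_conf` stay shared with the original. That is safe here precisely because the input table is already a fresh copy:

```lua
-- Minimal shallow clone: top-level fields are copied, nested tables shared.
local function shallow_clone(t)
    local copy = {}
    for k, v in pairs(t) do
        copy[k] = v
    end
    return copy
end

local consumer = { username = "jack", auth_conf = { key = "secret-ref" } }
local new_consumer = shallow_clone(consumer)

new_consumer.username = "rose"            -- independent: top-level field
print(consumer.username)                  --> jack

new_consumer.auth_conf.key = "resolved"   -- shared: mutates the original too
print(consumer.auth_conf.key)             --> resolved
```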

@xuruidong requested a review from @membphis on January 7, 2025.
@membphis (Member) left a comment:

I will check it carefully next week; I am busy this week.
