Replace MEM_MD5_DIGEST with generic MEM_16B_BUF #1874
base: master
Conversation
... with generic 16 byte buffer.
```diff
@@ -138,7 +138,7 @@ storeKeyPublicByRequestMethod(HttpRequest * request, const HttpRequestMethod& me
 cache_key *
 storeKeyDup(const cache_key * key)
 {
-    cache_key *dup = (cache_key *)memAllocate(MEM_MD5_DIGEST);
+    cache_key *dup = (cache_key *)memAllocBuf(SQUID_MD5_DIGEST_LENGTH, nullptr);
```
MEM_MD5_DIGEST pool should be removed, but it should not be replaced with another pool. Instead, cache_key should not be dynamically allocated at all! The key should become a cheap-to-create/copy/compare/destroy class based on something like two uint64_t integer data members. We may even have an old TODO about this long-awaited improvement somewhere...
I am not blocking this PR on these "wrong MEM_MD5_DIGEST replacement" grounds because there may be performance value in introducing a generic 16-byte pool (as discussed elsewhere).
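To make the suggested direction concrete, here is a minimal sketch of what such a value-type key might look like, assuming two `uint64_t` halves as proposed above. The class name `CacheKey` and all member names are hypothetical illustrations, not existing Squid code:

```cpp
#include <cstdint>
#include <cstring>
#include <functional>

// Hypothetical sketch: a cheap-to-create/copy/compare/destroy cache key
// built from two 64-bit halves, replacing dynamically allocated cache_key
// buffers. Not Squid code; names are illustrative only.
class CacheKey {
public:
    CacheKey() : hi_(0), lo_(0) {}

    // Build from a 16-byte MD5 digest buffer.
    explicit CacheKey(const unsigned char *digest16) {
        std::memcpy(&hi_, digest16, 8);
        std::memcpy(&lo_, digest16 + 8, 8);
    }

    // Trivial comparisons: two integer compares, no memcmp over heap memory.
    bool operator==(const CacheKey &other) const {
        return hi_ == other.hi_ && lo_ == other.lo_;
    }
    bool operator!=(const CacheKey &other) const { return !(*this == other); }

    // Cheap hash for use in hash tables.
    std::size_t hash() const {
        return std::hash<std::uint64_t>{}(hi_) ^
               (std::hash<std::uint64_t>{}(lo_) << 1);
    }

private:
    std::uint64_t hi_;
    std::uint64_t lo_;
};
```

Copying such a key is a plain 16-byte struct copy, so `storeKeyDup()` and the associated pool allocation could disappear entirely.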
```diff
@@ -310,8 +315,6 @@ Mem::Init(void)
     // TODO: Carefully stop zeroing these objects memory and drop the doZero parameter
     memDataInit(MEM_DREAD_CTRL, "dread_ctrl", sizeof(dread_ctrl), 0, true);
     memDataInit(MEM_DWRITE_Q, "dwrite_q", sizeof(dwrite_q), 0, true);
-    memDataInit(MEM_MD5_DIGEST, "MD5 digest", SQUID_MD5_DIGEST_LENGTH, 0, true);
```
Introducing smaller generic pools may improve or harm performance. It may harm performance if we happen to have some frequently used context (e.g., in HTTP header parsing) where a short SBuf often grows from below 16 bytes to below 32 bytes -- introducing one extra memory allocation/copy. We should not make such performance-sensitive/focused changes without a pressing need and without performance testing!
I recommend keeping MEM_MD5_DIGEST pool until cache_key is upgraded (as discussed elsewhere). Instead, let's just set its doZero parameter to false so that we can make progress towards removing that pool parameter/functionality as described in the above TODO (line 310/315).
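Concretely, that recommendation amounts to keeping the `memDataInit()` call this PR removes, but flipping its final `doZero` argument. A sketch of that alternative, not a tested patch:

```diff
-    memDataInit(MEM_MD5_DIGEST, "MD5 digest", SQUID_MD5_DIGEST_LENGTH, 0, true);
+    memDataInit(MEM_MD5_DIGEST, "MD5 digest", SQUID_MD5_DIGEST_LENGTH, 0, false);
```

This keeps the dedicated pool (avoiding the generic-pool performance questions below) while still advancing the TODO about removing the `doZero` parameter.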
Initial testing indicates that almost all dynamic buffers start with >1KB allocations (as expected for I/O buffers), even those which initialize with a 0-byte/missing buffer.
The largest use of these 16B buffers is Store MD5 objects, as expected, plus a few hundred global string constants (e.g., header names and other constants) that are stored in SBuf or MemBuf and never reallocate to anything larger.
> Initial testing indicates that almost all dynamic buffers start with >1KB allocations
Data from two busy production Squids suggests the opposite conclusion (and supports concerns behind this change request): Short strings (i.e. 40 bytes or shorter) are responsible for the vast majority of relevant allocations and exceed, say, 4KB buffer allocations by two orders of magnitude (e.g., 17,267,013 vs. 142,112 allocations).
Here is a sample from one worker showing the dominance of small buffer allocations -- allocations that may be sensitive to this PR's changes, as detailed in this change request:
| Pool | Obj Size | Allocated (#) | Allocated (%) |
|---|---|---|---|
| MD5 digest | 16 | 6,618 | 0% |
| Short Strings | 40 | 17,267,013 | 90% |
| Medium Strings | 128 | 736,958 | 4% |
| Long Strings | 512 | 389,430 | 2% |
| 1KB Strings | 1024 | 2,376 | 0% |
| 2K Buffer | 2048 | 2,945 | 0% |
| 4KB Strings | 4096 | 142,112 | 1% |
| 4K Buffer | 4096 | 371 | 0% |
| 8K Buffer | 8192 | 151 | 0% |
| 16KB Strings | 16384 | 454,272 | 2% |
| 16K Buffer | 16384 | 98 | 0% |
| 32K Buffer | 32768 | 267 | 0% |
| 64K Buffer | 65536 | 87,965 | 0% |
Please do not change buffers used for short strings in this PR. We can make progress without such risky and effectively untested performance-sensitive changes (as detailed in this change request).
It is worth noting that the "Short Strings" stats you are looking at were merged into the "64B Buffer" pool (not even the 32B pool) in current Squid, where my analysis was done. Many of them also were `String` data-copies made by header parsers. Those disappeared when `SBuf` started to share a larger underlying `MemBlob` I/O buffer and/or reference the `RegisteredHeader` global list.
Would the admin of that busy cache you have access to be willing to patch in a counter of how many times `memAllocBuf()` is called with size under 17?
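Such a counter could be as small as the sketch below. The `countingAllocBuf()` wrapper and its `malloc`-based body are hypothetical stand-ins used only to model the size parameter; the real patch would increment the counter inside Squid's actual `memAllocBuf()`:

```cpp
#include <atomic>
#include <cstddef>
#include <cstdlib>

// Hypothetical instrumentation sketch: count tiny (sub-17-byte) requests
// passing through a memAllocBuf-style entry point. This is NOT Squid's
// memAllocBuf(); it is an illustrative stand-in with a similar signature.
static std::atomic<unsigned long> tinyAllocCount{0};

void *countingAllocBuf(std::size_t netSize, std::size_t *grossSize) {
    if (netSize < 17)            // the "size under 17" case asked about
        ++tinyAllocCount;
    if (grossSize)
        *grossSize = netSize;    // real pools would round this up
    return std::malloc(netSize); // stand-in for the pooled allocation
}
```

Dumping `tinyAllocCount` at shutdown (or via the cache manager) would directly answer how many allocations the proposed 16B pool would actually serve.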
> It is worth noting that the "Short Strings" stats you are looking at were merged into the "64B Buffer" pool (not even 32B pool) in current Squid where my analysis was done.
I am aware that master/v7 code has a different set of pools than v6 and earlier code (due to 2023 commit 250fd42, at least), but those differences do not affect this change request analysis AFAICT.
> Many of them also were `String` data-copies by header parsers. Those disappeared when `SBuf` started to share a larger underlying `MemBlob` I/O buffer and/or reference the `RegisteredHeader` global list.
I do not have enough information to validate the implication that header parsing improvements were enough to make the potential negative side effects of this PR's changes negligible.
> Would the admin of that busy cache you have access to be willing to patch in a counter of how many times `memAllocBuf()` is called with size under 17?
Patching those Squids to investigate this PR's side effects is not an option right now. Fortunately, we can make very good progress (including merging this PR!) by focusing on changes leading to the removal of the doZero parameter (and its associated unwanted functionality).
In fact, even if we had enough anecdotal evidence suggesting that current PR changes do not hurt performance, it would still be prudent to extract/isolate those performance-sensitive changes into a dedicated PR!
FWIW, I do not know why that label was added, but I recommend resolving PR merge conflicts as the next step towards all-green CI tests.