@wumeibanfa
Contributor

@wumeibanfa wumeibanfa commented Oct 21, 2025

What problem does this PR solve?

Modify to_base64, from_base64, and unhex to avoid an extra copy.
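The description does not say where the extra copy happened, so the following is only a hypothetical illustration of the general pattern, not the code in this PR. It assumes a column-style layout (one shared byte buffer plus end offsets per row) and valid, even-length hex input. The function names `unhex_with_temp` and `unhex_direct` are invented for this sketch. In Python the direct version is not faster; the sketch only shows the difference in data movement.

```python
def unhex_with_temp(rows):
    """Decode each row into a temporary, then copy it into the shared buffer."""
    out = bytearray()
    offsets = []
    for row in rows:
        tmp = bytes.fromhex(row)  # first write: temporary result
        out += tmp                # second write: copy into the output buffer
        offsets.append(len(out))
    return out, offsets


def unhex_direct(rows):
    """Size the output once, then decode straight into its final position."""
    total = sum(len(row) // 2 for row in rows)  # two hex digits per byte
    out = bytearray(total)
    offsets = []
    pos = 0
    for row in rows:
        for i in range(0, len(row), 2):
            out[pos] = int(row[i:i + 2], 16)  # single write, no temporary
            pos += 1
        offsets.append(pos)
    return out, offsets


if __name__ == "__main__":
    sample = ["e4b8ad", "E69687", ""]
    print(unhex_with_temp(sample) == unhex_direct(sample))
```

Both functions return the same buffer and offsets; only the number of times each decoded byte is written differs.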

Issue Number: close #xxx

Related PR: #xxx

Problem Summary:

Benchmark names use the format method/a/b, where a is the number of input lines (rows) and b is the length of each line's string.

Run on (24 X 2395.45 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x12)
  L1 Instruction 32 KiB (x12)
  L2 Unified 1024 KiB (x12)
  L3 Unified 32768 KiB (x1)
Load Average: 3.32, 2.73, 2.74
--------------------------------------------------------------------------
Benchmark                                Time             CPU   Iterations
--------------------------------------------------------------------------
BM_ToBase64Impl_Old/1000/256        151122 ns       151123 ns         4623
BM_ToBase64Impl_Old/100/65536       801007 ns       800983 ns          785
BM_ToBase64Impl_Old/10/100000       115405 ns       115405 ns         6024
BM_ToBase64Impl_New/1000/256        126745 ns       126732 ns         5512
BM_ToBase64Impl_New/100/65536       449138 ns       449122 ns         1304
BM_ToBase64Impl_New/10/100000        54417 ns        54373 ns        12773
BM_FromBase64Impl_Old/1000/256       95666 ns        95666 ns         7315
BM_FromBase64Impl_Old/100/65536     750223 ns       750193 ns          867
BM_FromBase64Impl_Old/10/100000     113150 ns       113146 ns         6115
BM_FromBase64Impl_New/1000/256       79121 ns        79121 ns         8847
BM_FromBase64Impl_New/100/65536     522309 ns       521026 ns         1214
BM_FromBase64Impl_New/10/100000      78207 ns        78205 ns         8929
BM_UnhexImpl_Old/1000/256            13058 ns        13058 ns        53759
BM_UnhexImpl_Old/100/65536            1328 ns         1327 ns       484917
BM_UnhexImpl_Old/100/100000           1319 ns         1319 ns       529219
BM_UnhexImpl_New/1000/256             6045 ns         6023 ns       116043
BM_UnhexImpl_New/100/65536             656 ns          656 ns      1056606
BM_UnhexImpl_New/100/100000            710 ns          710 ns       984401
BM_UnhexNullImpl_Old/1000/256         6413 ns         6413 ns       109417
BM_UnhexNullImpl_Old/100/65536         686 ns          681 ns      1054879
BM_UnhexNullImpl_Old/100/100000        725 ns          725 ns       957391
BM_UnhexNullImpl_New/1000/256         6144 ns         6144 ns       114054
BM_UnhexNullImpl_New/100/65536         664 ns          664 ns       981229
BM_UnhexNullImpl_New/100/100000        710 ns          708 ns       982551
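The table lists raw times only. As a reading aid, here is a small sketch that derives the Old/New speedup ratio for each case from the Time column above. The dictionary layout and the helper names `speedup` and `summarize` are choices made for this sketch; the numbers are copied from the table.

```python
# Wall-clock times in ns from the Time column of the benchmark table.
# Key: (function, "lines/length"); value: (old_ns, new_ns).
TIMES_NS = {
    ("ToBase64", "1000/256"): (151122, 126745),
    ("ToBase64", "100/65536"): (801007, 449138),
    ("ToBase64", "10/100000"): (115405, 54417),
    ("FromBase64", "1000/256"): (95666, 79121),
    ("FromBase64", "100/65536"): (750223, 522309),
    ("FromBase64", "10/100000"): (113150, 78207),
    ("Unhex", "1000/256"): (13058, 6045),
    ("Unhex", "100/65536"): (1328, 656),
    ("Unhex", "100/100000"): (1319, 710),
    ("UnhexNull", "1000/256"): (6413, 6144),
    ("UnhexNull", "100/65536"): (686, 664),
    ("UnhexNull", "100/100000"): (725, 710),
}


def speedup(old_ns, new_ns):
    """Return how many times faster the new implementation is."""
    return old_ns / new_ns


def summarize(times):
    """Map each benchmark case to its Old/New speedup ratio."""
    return {case: speedup(old, new) for case, (old, new) in times.items()}


if __name__ == "__main__":
    for (func, shape), ratio in summarize(TIMES_NS).items():
        print(f"{func:<11} {shape:<11} {ratio:.2f}x")
```

By this calculation, Unhex improves by roughly 1.9x to 2.2x, ToBase64 by roughly 1.2x to 2.1x, FromBase64 by roughly 1.2x to 1.4x, and UnhexNull is close to unchanged (about 1.02x to 1.04x).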

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@Thearas
Contributor

Thearas commented Oct 21, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@zclllyybb zclllyybb self-assigned this Oct 21, 2025
@wumeibanfa
Contributor Author

run buildall

@doris-robot

ClickBench: Total hot run time: 28.33 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit e2b7d2cd229e6a07be6d9a3be41d3375e08de584, data reload: false

query1	0.05	0.05	0.04
query2	0.11	0.06	0.05
query3	0.26	0.08	0.08
query4	1.61	0.12	0.13
query5	0.28	0.26	0.25
query6	1.19	0.66	0.66
query7	0.03	0.04	0.03
query8	0.06	0.05	0.05
query9	0.65	0.54	0.52
query10	0.58	0.58	0.58
query11	0.16	0.12	0.12
query12	0.17	0.13	0.12
query13	0.65	0.63	0.62
query14	1.03	1.02	1.02
query15	0.87	0.86	0.89
query16	0.40	0.40	0.41
query17	1.08	1.05	1.06
query18	0.22	0.20	0.20
query19	1.91	1.87	1.81
query20	0.02	0.01	0.01
query21	15.42	0.16	0.13
query22	5.18	0.07	0.05
query23	15.66	0.26	0.11
query24	2.89	0.89	0.27
query25	0.08	0.06	0.06
query26	0.14	0.13	0.14
query27	0.07	0.06	0.06
query28	4.42	1.16	0.94
query29	12.62	3.99	3.24
query30	0.30	0.15	0.12
query31	2.84	0.61	0.38
query32	3.24	0.56	0.48
query33	3.17	3.06	3.10
query34	16.11	5.48	4.92
query35	4.98	5.01	4.93
query36	0.70	0.52	0.50
query37	0.10	0.07	0.07
query38	0.07	0.04	0.04
query39	0.03	0.03	0.03
query40	0.19	0.16	0.14
query41	0.09	0.03	0.03
query42	0.04	0.02	0.02
query43	0.04	0.03	0.04
Total cold run time: 99.71 s
Total hot run time: 28.33 s

@doris-robot

BE UT Coverage Report

Increment line coverage 97.10% (67/69) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.60% (17898/34026)
Line Coverage 37.85% (162391/429029)
Region Coverage 32.27% (123824/383764)
Branch Coverage 33.67% (54254/161125)

Contributor

@zclllyybb zclllyybb left a comment

Please add:

  1. Performance comparison results
  2. Test cases for non-ASCII input (Chinese and others) for hex and unhex
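As a rough illustration of what such non-ASCII cases could check, here is a sketch using Python's standard library rather than Doris itself. It assumes inputs are UTF-8 encoded and that hex output is upper case (MySQL-style); both are assumptions of this sketch, and the helper names are invented. The expected values it yields could serve as reference results when writing the actual regression cases.

```python
import base64
import binascii

# Sample inputs: ASCII, Chinese, Japanese, accented Latin, emoji, and empty.
SAMPLES = ["hello", "中文", "你好，世界", "日本語テキスト", "naïve café", "🎉", ""]


def to_hex(text):
    """Hex-encode the UTF-8 bytes of text, upper case (assumed convention)."""
    return binascii.hexlify(text.encode("utf-8")).decode("ascii").upper()


def from_hex(hex_text):
    """Decode hex digits (either case) back to a UTF-8 string."""
    return binascii.unhexlify(hex_text.encode("ascii")).decode("utf-8")


def to_b64(text):
    """Base64-encode the UTF-8 bytes of text."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


def from_b64(encoded):
    """Decode base64 back to a UTF-8 string."""
    return base64.b64decode(encoded).decode("utf-8")


def check_round_trips(samples):
    """Assert that every sample survives hex and base64 round trips."""
    for text in samples:
        assert from_hex(to_hex(text)) == text
        assert from_hex(to_hex(text).lower()) == text  # lower-case digits too
        assert from_b64(to_b64(text)) == text


if __name__ == "__main__":
    check_round_trips(SAMPLES)
    for text in SAMPLES:
        print(repr(text), to_hex(text), to_b64(text))
```

Multi-byte characters matter here because each one expands to several bytes, so output-length calculations that assume one byte per character would be wrong.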

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 97.10% (67/69) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.48% (23844/33359)
Line Coverage 57.87% (248146/428781)
Region Coverage 52.90% (205630/388722)
Branch Coverage 54.70% (88627/162020)

@wumeibanfa
Contributor Author

run buildall

@wumeibanfa
Contributor Author

run buildall

@doris-robot

TPC-DS: Total hot run time: 189841 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit edefa460665c1e748ba4d03b0ca0fdf8ba49fb09, data reload: false

query1	1033	459	407	407
query2	6618	1704	1675	1675
query3	6762	219	224	219
query4	26910	23838	22945	22945
query5	5111	656	481	481
query6	365	257	234	234
query7	4679	498	305	305
query8	336	272	281	272
query9	8735	2620	2558	2558
query10	555	376	301	301
query11	16285	15075	14902	14902
query12	193	123	118	118
query13	1696	564	450	450
query14	12730	9323	9209	9209
query15	253	187	175	175
query16	7767	672	498	498
query17	1578	795	736	736
query18	2954	450	346	346
query19	301	222	203	203
query20	149	134	138	134
query21	229	154	130	130
query22	4561	4642	4644	4642
query23	34689	34040	34030	34030
query24	8587	2512	2607	2512
query25	617	562	477	477
query26	1326	279	172	172
query27	2772	544	379	379
query28	4374	2264	2234	2234
query29	832	652	554	554
query30	352	241	226	226
query31	980	821	788	788
query32	86	80	78	78
query33	613	385	337	337
query34	818	856	540	540
query35	796	825	743	743
query36	937	1017	904	904
query37	129	115	90	90
query38	3517	3588	3520	3520
query39	1513	1397	1428	1397
query40	218	127	121	121
query41	65	58	61	58
query42	121	116	114	114
query43	483	504	468	468
query44	1259	747	737	737
query45	190	178	173	173
query46	901	1000	648	648
query47	1723	1745	1710	1710
query48	406	427	320	320
query49	771	492	418	418
query50	686	708	441	441
query51	3912	3975	3853	3853
query52	109	108	105	105
query53	239	283	195	195
query54	604	586	527	527
query55	102	86	88	86
query56	329	338	310	310
query57	1156	1184	1116	1116
query58	295	287	275	275
query59	2542	2603	2555	2555
query60	361	326	343	326
query61	197	155	151	151
query62	818	720	679	679
query63	236	203	205	203
query64	4452	1166	845	845
query65	4068	3979	3980	3979
query66	1049	433	331	331
query67	15219	14899	14856	14856
query68	7516	886	597	597
query69	480	344	334	334
query70	1312	1247	1257	1247
query71	407	346	318	318
query72	5980	4853	4842	4842
query73	532	607	363	363
query74	9168	8862	8674	8674
query75	3438	3347	2855	2855
query76	2773	1149	759	759
query77	532	427	316	316
query78	9597	9551	8848	8848
query79	2048	839	601	601
query80	747	590	513	513
query81	540	262	290	262
query82	413	166	137	137
query83	279	276	257	257
query84	259	120	106	106
query85	876	473	424	424
query86	382	325	301	301
query87	3678	3714	3670	3670
query88	3024	2284	2266	2266
query89	401	330	287	287
query90	2075	231	229	229
query91	159	164	131	131
query92	84	71	66	66
query93	2048	998	638	638
query94	721	463	294	294
query95	412	327	320	320
query96	495	572	282	282
query97	2929	2945	2866	2866
query98	238	210	218	210
query99	1374	1396	1307	1307
Total cold run time: 280166 ms
Total hot run time: 189841 ms

@doris-robot

ClickBench: Total hot run time: 28.65 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit edefa460665c1e748ba4d03b0ca0fdf8ba49fb09, data reload: false

query1	0.05	0.05	0.04
query2	0.10	0.05	0.06
query3	0.26	0.09	0.09
query4	1.61	0.11	0.12
query5	0.28	0.27	0.25
query6	1.20	0.66	0.65
query7	0.03	0.03	0.03
query8	0.05	0.04	0.04
query9	0.65	0.53	0.52
query10	0.59	0.58	0.58
query11	0.16	0.11	0.12
query12	0.15	0.12	0.12
query13	0.64	0.61	0.61
query14	1.02	1.04	1.03
query15	0.87	0.89	0.87
query16	0.40	0.39	0.41
query17	1.06	1.06	1.04
query18	0.21	0.20	0.20
query19	1.99	1.84	1.83
query20	0.02	0.01	0.01
query21	15.41	0.17	0.12
query22	5.16	0.08	0.05
query23	15.72	0.27	0.10
query24	3.16	1.16	0.73
query25	0.12	0.06	0.06
query26	0.13	0.14	0.13
query27	0.06	0.06	0.05
query28	5.12	1.13	0.94
query29	12.56	3.93	3.25
query30	0.30	0.14	0.11
query31	2.82	0.62	0.39
query32	3.25	0.55	0.48
query33	3.14	3.05	3.14
query34	16.23	5.44	4.84
query35	4.89	4.94	4.91
query36	0.70	0.52	0.50
query37	0.10	0.07	0.07
query38	0.06	0.05	0.04
query39	0.04	0.04	0.03
query40	0.19	0.16	0.14
query41	0.09	0.03	0.03
query42	0.04	0.04	0.03
query43	0.05	0.03	0.03
Total cold run time: 100.68 s
Total hot run time: 28.65 s

@wumeibanfa
Contributor Author

run beut

@wumeibanfa
Contributor Author

run buildall

@doris-robot

ClickBench: Total hot run time: 28.59 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 14d0790094dde766487483b7688aa295c791623d, data reload: false

query1	0.05	0.05	0.05
query2	0.09	0.05	0.05
query3	0.27	0.08	0.08
query4	1.61	0.12	0.12
query5	0.29	0.26	0.26
query6	1.19	0.67	0.65
query7	0.03	0.03	0.03
query8	0.06	0.04	0.05
query9	0.63	0.53	0.52
query10	0.58	0.58	0.58
query11	0.16	0.11	0.12
query12	0.16	0.13	0.12
query13	0.64	0.61	0.61
query14	1.04	1.05	1.02
query15	0.88	0.89	0.87
query16	0.42	0.40	0.40
query17	1.08	1.08	1.04
query18	0.23	0.21	0.22
query19	1.93	1.90	1.85
query20	0.02	0.02	0.02
query21	15.41	0.18	0.12
query22	5.07	0.08	0.05
query23	15.65	0.27	0.11
query24	2.69	0.69	0.39
query25	0.07	0.06	0.06
query26	0.15	0.13	0.14
query27	0.06	0.05	0.07
query28	4.39	1.18	0.93
query29	12.57	3.91	3.27
query30	0.31	0.14	0.13
query31	2.83	0.62	0.41
query32	3.24	0.57	0.48
query33	3.11	3.22	3.10
query34	16.28	5.48	4.90
query35	4.91	4.92	4.93
query36	0.72	0.52	0.51
query37	0.10	0.07	0.07
query38	0.06	0.05	0.04
query39	0.04	0.03	0.03
query40	0.18	0.15	0.15
query41	0.09	0.04	0.04
query42	0.04	0.03	0.03
query43	0.04	0.04	0.04
Total cold run time: 99.37 s
Total hot run time: 28.59 s

@hello-stephen
Contributor

BE UT Coverage Report

Increment line coverage 97.10% (67/69) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 52.60% (17898/34026)
Line Coverage 37.85% (162388/429040)
Region Coverage 32.27% (123839/383767)
Branch Coverage 33.67% (54248/161129)

@wumeibanfa
Contributor Author

run cloud_p0

Contributor

@zclllyybb zclllyybb left a comment

LGTM

@github-actions github-actions bot added the "approved" label (Indicates a PR has been approved by one committer.) Oct 22, 2025
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@hello-stephen
Contributor

BE Regression && UT Coverage Report

Increment line coverage 97.10% (67/69) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 71.43% (23830/33359)
Line Coverage 57.83% (247991/428792)
Region Coverage 52.96% (205867/388725)
Branch Coverage 54.65% (88542/162024)

@zclllyybb zclllyybb changed the title from "[fix](function) modify to_base64, from_base64, unhex to avoid a extra copy" to "[optimize](function) modify to_base64, from_base64, unhex to avoid a extra copy" Oct 22, 2025
@zclllyybb zclllyybb merged commit 5a9cb5a into apache:master Oct 22, 2025
27 of 29 checks passed
dwdwqfwe pushed a commit to dwdwqfwe/doris that referenced this pull request Oct 24, 2025
…extra copy (apache#57182)
Labels

approved (Indicates a PR has been approved by one committer.), reviewed

7 participants