[opt](memory) avoid retaining full segment key bounds buffers#63968

Merged
sollhui merged 2 commits into apache:master from sollhui:opt_key_bound_mem2 on Jun 4, 2026

@sollhui sollhui commented Jun 1, 2026

Contributor

What problem does this PR solve?

Related PR: #63469

Problem Summary:

#63469 truncates segment key bounds before storing segment statistics, but the current implementation first copies the full KeyBoundsPB and then calls resize() on the protobuf string fields.

For very long keys, resize() shrinks the string's size but may leave its capacity (the allocated buffer) unchanged. As a result, after the truncated SegmentStatistics is moved into _segid_statistics_map, the rowset writer can still hold buffers sized for the original, full key bounds.

This PR changes the write path to build the stored SegmentStatistics with freshly assigned truncated key bound strings, avoiding the full-copy-then-resize pattern. The segcompaction segment stats path is updated in the same way.
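
To illustrate the difference described above, here is a minimal sketch using plain std::string rather than the actual KeyBoundsPB fields. The function names copy_then_resize and assign_truncated are hypothetical and do not appear in the PR. The C++ standard does not require resize() to keep the old buffer, but common standard library implementations do, which is the behavior this sketch assumes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the old pattern: copy the full key, then resize().
// Shrinking with resize() changes size(), but on common standard library
// implementations the buffer allocated for the full key is kept.
std::string copy_then_resize(const std::string& full_key, std::size_t max_len) {
    std::string stored = full_key;  // allocates a buffer sized for the full key
    if (stored.size() > max_len) {
        stored.resize(max_len);     // size shrinks; capacity typically does not
    }
    return stored;
}

// Hypothetical stand-in for the new pattern: assign only the truncated prefix
// into a fresh string, so a buffer for the full key is never allocated.
std::string assign_truncated(const std::string& full_key, std::size_t max_len) {
    std::string stored;
    stored.assign(full_key.data(), std::min(full_key.size(), max_len));
    assert(stored.size() <= max_len);  // the stored key never exceeds the limit
    return stored;
}
```

Calling both functions with a 1 MiB key and a small limit yields equal strings, but comparing capacity() on the two results shows how much memory each one keeps alive once it is moved into a long-lived container.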

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

@sollhui sollhui requested a review from gavinchou as a code owner June 1, 2026 13:01
@hello-stephen

Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

### What problem does this PR solve?

Issue Number: None

Related PR: apache#63469

Problem Summary: Merge latest apache/doris master into PR apache#63968 and resolve the beta_rowset_writer.cpp conflict by preserving the PR change that avoids copy-then-resize for segment key bounds while keeping the upstream truncation marker behavior.

### Release note

None

### Check List (For Author)

- Test: Manual test
    - git diff --check
    - build-support/check-format.sh be/src/storage/rowset/beta_rowset_writer.cpp
- Behavior changed: No
- Does this need documentation: No
@sollhui

sollhui commented Jun 1, 2026

Contributor Author

run buildall

@liaoxin01

Contributor

/review

@github-actions github-actions Bot left a comment

Contributor

Review result: no blocking issues found in this PR.

Critical checkpoint conclusions:

  • Goal and proof: The change avoids retaining full key-bound buffers in the rowset writer by assigning truncated strings directly into stored segment statistics. Existing segment key-bound truncation tests cover the functional metadata behavior, but I did not find a dedicated memory-retention regression test.
  • Scope and clarity: The modification is small and focused on the two segment-statistics insertion paths in beta_rowset_writer.cpp.
  • Concurrency: The changed code builds the stored statistics before taking _segid_statistics_map_mutex, and preserves the existing mutex protection when updating _segid_statistics_map and _segments_key_bounds_truncated. No new lock-order or heavy-under-lock issue was introduced.
  • Lifecycle/static initialization: No new static/global lifecycle dependency was introduced.
  • Configuration: No new config item was added. Existing truncation config behavior is preserved, including the random-test path falling back to RowsetMeta::set_segments_key_bounds truncation.
  • Compatibility: No storage format, thrift/protobuf schema, or function-symbol compatibility change was introduced.
  • Parallel paths: Both normal add_segment and segcompaction flush_segment_writer_for_segcompaction paths in this writer were updated. The load-stream writer path only forwards segment stats to destination replicas and does not retain them locally in the same map.
  • Conditional checks: Existing truncation threshold semantics are preserved.
  • Test coverage: No new tests were added. Existing key-bound truncation tests cover visible behavior; this optimization mainly targets retained protobuf string capacity.
  • Test results: I ran git diff --check successfully. build-support/check-format.sh be/src/storage/rowset/beta_rowset_writer.cpp could not run in this runner because clang-format 16 is unavailable.
  • Observability: No new observability is needed for this narrow memory optimization.
  • Transactions/persistence/data writes: The persisted key-bound values and truncation marker semantics remain unchanged; no transaction or delete-bitmap path behavior appears changed.
  • FE/BE variable passing: Not applicable.
  • Performance: The change removes the full-copy-then-resize pattern on the retained statistics path and keeps the existing rowset-meta truncation behavior, so it addresses the intended memory-retention issue without adding hot-path scans.
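
The concurrency checkpoint above (build the stored statistics first, then update the shared map and truncation flag under the mutex) can be sketched roughly as follows. This is a simplified illustration, not the code in beta_rowset_writer.cpp: SegmentStatsSketch, StatsRegistrySketch, and all member names are hypothetical stand-ins, and the real SegmentStatistics type carries more fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for the stored per-segment statistics.
struct SegmentStatsSketch {
    int64_t row_num = 0;
    std::string min_key;
    std::string max_key;
};

// Hypothetical registry that mirrors the "build first, lock briefly" shape.
class StatsRegistrySketch {
public:
    void add_segment(uint32_t segment_id, int64_t row_num, const std::string& full_min,
                     const std::string& full_max, std::size_t max_len) {
        // Build the stored value before locking. assign(str, pos, n) copies at
        // most n characters, so only the truncated prefix is allocated.
        SegmentStatsSketch stored;
        stored.row_num = row_num;
        stored.min_key.assign(full_min, 0, max_len);
        stored.max_key.assign(full_max, 0, max_len);
        const bool truncated = full_min.size() > max_len || full_max.size() > max_len;

        // Hold the mutex only for the map insert and the flag update.
        std::lock_guard<std::mutex> lock(_mutex);
        _stats.emplace(segment_id, std::move(stored));
        _any_truncated = _any_truncated || truncated;
    }

    bool any_truncated() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _any_truncated;
    }

    std::size_t segment_count() const {
        std::lock_guard<std::mutex> lock(_mutex);
        return _stats.size();
    }

    // Returns a copy of the stored min key; the segment must already exist.
    std::string min_key(uint32_t segment_id) const {
        std::lock_guard<std::mutex> lock(_mutex);
        auto it = _stats.find(segment_id);
        assert(it != _stats.end());
        return it->second.min_key;
    }

private:
    mutable std::mutex _mutex;
    std::map<uint32_t, SegmentStatsSketch> _stats;
    bool _any_truncated = false;
};
```

Keeping the string construction outside the critical section means the lock covers only a map insert and a flag update, which matches the reviewer's note that no heavy work was added under the lock.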

User focus: no additional user-provided review focus was present.

@hello-stephen

Contributor
TPC-H: Total hot run time: 29446 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 0e7ed452e51f0e7b8be249db4cf6b5d120c40d60, data reload: false

------ Round 1 ----------------------------------
orders	Doris	NULL	NULL	0	0	0	NULL	0	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	17673	4091	4103	4091
q2	q3	10770	1400	814	814
q4	4683	484	337	337
q5	7539	876	579	579
q6	186	176	144	144
q7	775	859	631	631
q8	9391	1558	1618	1558
q9	6370	4494	4431	4431
q10	6827	1802	1501	1501
q11	439	280	247	247
q12	645	421	314	314
q13	18144	3503	2807	2807
q14	266	260	243	243
q15	q16	834	798	711	711
q17	973	975	995	975
q18	7048	5869	5542	5542
q19	1213	1353	1160	1160
q20	519	400	259	259
q21	6230	2908	2774	2774
q22	453	382	328	328
Total cold run time: 100978 ms
Total hot run time: 29446 ms

----- Round 2, with runtime_filter_mode=off -----
orders	Doris	NULL	NULL	150000000	42	6422171781	NULL	22778155	NULL	NULL	2023-12-26 18:27:23	2023-12-26 18:42:55	NULL	utf-8	NULL	NULL	
============================================
q1	5061	4949	4837	4837
q2	q3	5050	5330	4685	4685
q4	2180	2257	1399	1399
q5	4900	4731	4867	4731
q6	241	177	126	126
q7	1916	1796	1577	1577
q8	2559	2268	2118	2118
q9	7418	7427	7404	7404
q10	4788	4732	4258	4258
q11	543	416	372	372
q12	740	747	531	531
q13	3070	3481	2826	2826
q14	286	276	264	264
q15	q16	680	705	616	616
q17	1291	1272	1256	1256
q18	7399	6947	7065	6947
q19	1105	1097	1106	1097
q20	2248	2224	1967	1967
q21	5373	4693	4560	4560
q22	539	475	406	406
Total cold run time: 57387 ms
Total hot run time: 51977 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 170723 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 0e7ed452e51f0e7b8be249db4cf6b5d120c40d60, data reload: false

query5	4315	656	516	516
query6	334	227	200	200
query7	4292	563	322	322
query8	319	233	219	219
query9	8807	4018	4006	4006
query10	478	338	295	295
query11	5804	2332	2189	2189
query12	185	127	126	126
query13	1340	623	451	451
query14	6121	5445	5129	5129
query14_1	4466	4441	4447	4441
query15	210	199	179	179
query16	996	458	420	420
query17	1134	719	586	586
query18	2595	486	348	348
query19	219	203	176	176
query20	141	134	131	131
query21	216	138	115	115
query22	13749	13495	13405	13405
query23	17327	16544	16184	16184
query23_1	16295	16343	16185	16185
query24	7636	1766	1313	1313
query24_1	1323	1301	1313	1301
query25	571	483	433	433
query26	1330	326	171	171
query27	2688	571	346	346
query28	4422	2058	2037	2037
query29	1006	644	528	528
query30	311	241	200	200
query31	1132	1080	947	947
query32	91	78	72	72
query33	540	342	285	285
query34	1178	1141	647	647
query35	801	796	705	705
query36	1383	1449	1329	1329
query37	156	110	93	93
query38	3231	3167	3073	3073
query39	917	929	911	911
query39_1	879	884	886	884
query40	232	144	124	124
query41	65	63	61	61
query42	114	107	110	107
query43	325	333	294	294
query44	
query45	216	205	198	198
query46	1072	1170	766	766
query47	2345	2383	2235	2235
query48	412	397	293	293
query49	629	510	384	384
query50	971	347	263	263
query51	4373	4303	4240	4240
query52	107	105	93	93
query53	260	276	200	200
query54	308	269	266	266
query55	98	91	83	83
query56	297	322	301	301
query57	1428	1408	1328	1328
query58	304	278	268	268
query59	1594	1653	1447	1447
query60	326	331	309	309
query61	158	149	156	149
query62	709	641	589	589
query63	252	207	217	207
query64	2472	837	713	713
query65	
query66	1756	515	373	373
query67	29905	29760	29488	29488
query68	
query69	467	348	311	311
query70	1063	1021	987	987
query71	318	286	268	268
query72	3082	2836	2515	2515
query73	869	767	399	399
query74	5095	4940	4747	4747
query75	2649	2599	2268	2268
query76	2308	1131	783	783
query77	406	406	329	329
query78	12419	12573	11924	11924
query79	1445	1061	770	770
query80	1349	516	461	461
query81	509	283	240	240
query82	1317	153	123	123
query83	346	271	244	244
query84	256	143	112	112
query85	926	532	442	442
query86	442	357	313	313
query87	3429	3419	3194	3194
query88	3625	2760	2742	2742
query89	459	401	353	353
query90	1800	181	175	175
query91	174	162	148	148
query92	76	76	76	76
query93	1506	1575	834	834
query94	662	347	288	288
query95	710	371	446	371
query96	1054	833	341	341
query97	2713	2691	2610	2610
query98	239	230	226	226
query99	1160	1169	1026	1026
Total cold run time: 255687 ms
Total hot run time: 170723 ms

@hello-stephen

Contributor

BE Regression && UT Coverage Report

Increment line coverage 96.15% (25/26) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 73.90% (28169/38116)
Line Coverage 57.87% (306160/529077)
Region Coverage 55.08% (256482/465682)
Branch Coverage 56.54% (110663/195713)

@liaoxin01 liaoxin01 left a comment

Contributor

LGTM

@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label Jun 3, 2026
@github-actions

github-actions Bot commented Jun 3, 2026

Contributor

PR approved by at least one committer and no changes requested.

@github-actions

github-actions Bot commented Jun 3, 2026

Contributor

PR approved by anyone and no changes requested.

Labels

approved Indicates a PR has been approved by one committer. dev/4.0.7-merged dev/4.1.2-merged reviewed

6 participants