[fix](be) Protect tablet writer map lookup in load channel #64604

Merged
sollhui merged 5 commits into apache:master from 0AyanamiRei:fix/tablets-channel-writer-lock Jun 23, 2026

Conversation

@0AyanamiRei

Contributor

What problem does this PR solve?

Issue Number: None

Related PR: #57133

Problem Summary:

BaseTabletsChannel::_write_block_data can run concurrently with incremental_open for the same tablets channel. _tablet_writers is an std::unordered_map protected by _tablet_writers_lock when writers are inserted, but the tablet load rowset info lookup read the map without holding the lock.

A concurrent emplace may rehash _tablet_writers, so the unlocked lookup can race with bucket reallocation. This patch protects the lookup with _tablet_writers_lock and avoids using unordered_map iterators after the lock is released. The actual writer operations still run outside _tablet_writers_lock, so the lock remains scoped to the map access.
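
The locking pattern described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the patch: `Writer`, `Channel`, `add_writer`, `find_writer`, `write_rows`, and `demo_rows_after_rehash` are hypothetical names. Only the idea comes from the description: hold the lock for the map access, copy out a raw pointer, and run the writer operation after the lock is released.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for a tablet writer; not the Doris class.
struct Writer {
    int64_t rows_written = 0;
    void write(int64_t rows) { rows_written += rows; }
};

// Hypothetical stand-in for a tablets channel.
class Channel {
public:
    // Insert path (analogous to incremental_open): mutates the map under the lock.
    void add_writer(int64_t tablet_id) {
        std::lock_guard<std::mutex> guard(_lock);
        _writers.emplace(tablet_id, std::make_unique<Writer>());
    }

    // Lookup path: the lock covers the map access only. No iterator escapes;
    // the caller gets a raw pointer to the heap-allocated writer.
    Writer* find_writer(int64_t tablet_id) {
        std::lock_guard<std::mutex> guard(_lock);
        auto it = _writers.find(tablet_id);
        return it == _writers.end() ? nullptr : it->second.get();
    }

    // Write path (analogous to _write_block_data): the writer operation
    // itself runs after the map lock has been released.
    bool write_rows(int64_t tablet_id, int64_t rows) {
        Writer* writer = find_writer(tablet_id);
        if (writer == nullptr) {
            return false;
        }
        writer->write(rows);
        return true;
    }

private:
    std::mutex _lock;
    std::unordered_map<int64_t, std::unique_ptr<Writer>> _writers;
};

// Usage: writes reach the same writer before and after many inserts,
// which force the map to rehash several times.
int64_t demo_rows_after_rehash() {
    Channel channel;
    channel.add_writer(1);
    bool ok = channel.write_rows(1, 10);
    for (int64_t id = 100; id < 2000; ++id) {
        channel.add_writer(id);
    }
    ok = channel.write_rows(1, 5) && ok;
    Writer* writer = channel.find_writer(1);
    assert(ok && writer != nullptr);
    return writer->rows_written;  // 10 + 5
}
```

The sketch assumes entries are never erased while the channel is alive, so the raw pointer stays valid. Synchronization inside the writer itself is out of scope here.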

Release note

None

Check List (For Author)

  • Test

    • [ ] Regression test
    • [x] Unit Test
      • ./run-be-ut.sh --run --filter=LoadStreamMapPoolTest.*
    • [x] Manual test (add detailed scripts or steps below)
      • ./build-support/check-format.sh
      • git diff --check
      • Attempted ./build-support/run-clang-tidy.sh --build-dir be/build_Debug; local analysis failed before checking the code because clang-tidy reported a malformed or corrupted AST file ('malformed block record in AST file') from the local compile database/PCH.
    • [ ] No need to test or manual test. Explain why:
      • [ ] This is a refactor/code format and no logic has been changed.
      • [ ] Previous test can cover this change.
      • [ ] No code files have been changed.
      • [ ] Other reason
  • Behavior changed:

    • [x] No.
    • [ ] Yes.
  • Does this need documentation?

    • [x] No.
    • [ ] Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

### What problem does this PR solve?

Issue Number: None

Related PR: apache#57133

Problem Summary: In BaseTabletsChannel::_write_block_data, the add_batch path may run concurrently with incremental_open for the same tablets channel. The writer map is an std::unordered_map protected by _tablet_writers_lock when writers are inserted, but one lookup added for tablet load rowset info read the map without the lock. Concurrent emplace may rehash the map and invalidate the buckets being read. This patch protects that lookup with _tablet_writers_lock and avoids using unordered_map iterators after the lock is released.

### Release note

None

### Check List (For Author)

- Test:
    - Unit Test: ./run-be-ut.sh --run --filter=LoadStreamMapPoolTest.*
    - Manual test: ./build-support/check-format.sh
    - Manual test: git diff --check
    - Manual test: ./build-support/run-clang-tidy.sh --build-dir be/build_Debug (failed to analyze because the local compile database/PCH reported malformed or corrupted AST file)
- Behavior changed: No
- Does this need documentation: No
@hello-stephen

Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@0AyanamiRei

Contributor Author

/review

@0AyanamiRei

Contributor Author

run buildall

@hello-stephen

Contributor
TPC-H: Total hot run time: 29572 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 64cfc348938fa921cce5342e3c4e1fc4fb655029, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17735	4054	4060	4054
q2	2058	308	191	191
q3	10333	1475	821	821
q4	4686	471	360	360
q5	7561	854	569	569
q6	183	170	134	134
q7	758	854	623	623
q8	9336	1592	1614	1592
q9	5962	4539	4554	4539
q10	6763	1778	1540	1540
q11	430	272	239	239
q12	627	412	297	297
q13	18131	3481	2812	2812
q14	276	270	247	247
q15	q16	791	777	716	716
q17	1067	1065	977	977
q18	6978	5778	5659	5659
q19	1348	1341	1094	1094
q20	495	405	265	265
q21	5926	2597	2547	2547
q22	429	361	296	296
Total cold run time: 101873 ms
Total hot run time: 29572 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4344	4277	4239	4239
q2	339	350	235	235
q3	4581	4951	4466	4466
q4	2079	2160	1353	1353
q5	4450	4308	4318	4308
q6	230	179	126	126
q7	1747	1857	1818	1818
q8	2627	2263	2176	2176
q9	8153	8418	8051	8051
q10	4794	4750	4259	4259
q11	589	414	399	399
q12	749	768	534	534
q13	3375	3627	3037	3037
q14	292	300	275	275
q15	q16	700	722	633	633
q17	1372	1321	1462	1321
q18	7849	7361	7176	7176
q19	1188	1130	1111	1111
q20	2203	2233	1944	1944
q21	5274	4586	4470	4470
q22	536	447	410	410
Total cold run time: 57471 ms
Total hot run time: 52341 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 174975 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 64cfc348938fa921cce5342e3c4e1fc4fb655029, data reload: false

query5	4341	643	469	469
query6	452	194	178	178
query7	4814	591	290	290
query8	379	219	215	215
query9	8779	4045	4042	4042
query10	548	308	257	257
query11	5846	2400	2154	2154
query12	162	104	102	102
query13	1266	603	428	428
query14	6404	5435	5066	5066
query14_1	4448	4410	4393	4393
query15	207	197	177	177
query16	1038	487	444	444
query17	1134	716	583	583
query18	2562	482	354	354
query19	211	189	148	148
query20	118	107	107	107
query21	224	141	118	118
query22	13841	13582	13473	13473
query23	17478	16553	16161	16161
query23_1	16409	16409	16264	16264
query24	7592	1810	1295	1295
query24_1	1326	1282	1346	1282
query25	562	470	390	390
query26	1320	324	166	166
query27	2627	522	341	341
query28	4450	2063	2075	2063
query29	1106	649	506	506
query30	308	235	200	200
query31	1113	1074	962	962
query32	113	65	59	59
query33	557	313	262	262
query34	1203	1160	655	655
query35	787	769	666	666
query36	1382	1419	1213	1213
query37	146	109	88	88
query38	3191	3126	3039	3039
query39	937	928	888	888
query39_1	878	868	872	868
query40	235	123	99	99
query41	61	61	59	59
query42	95	91	94	91
query43	319	322	287	287
query44	1432	780	776	776
query45	199	200	176	176
query46	1077	1195	748	748
query47	2369	2330	2246	2246
query48	418	409	284	284
query49	628	466	354	354
query50	1015	352	263	263
query51	4374	4391	4190	4190
query52	88	88	77	77
query53	242	281	184	184
query54	266	213	190	190
query55	80	73	69	69
query56	226	223	216	216
query57	1444	1406	1287	1287
query58	231	216	203	203
query59	1589	1659	1447	1447
query60	285	243	225	225
query61	152	144	145	144
query62	695	654	589	589
query63	229	183	193	183
query64	2553	772	631	631
query65	4871	4779	4785	4779
query66	1803	452	332	332
query67	29819	29661	28937	28937
query68	3076	1651	905	905
query69	430	314	269	269
query70	1102	978	900	900
query71	286	234	219	219
query72	3117	2683	2355	2355
query73	853	764	432	432
query74	5158	5007	4768	4768
query75	2632	2602	2237	2237
query76	2342	1166	795	795
query77	368	386	280	280
query78	12497	12504	11751	11751
query79	1477	1200	775	775
query80	1276	471	406	406
query81	523	282	238	238
query82	640	156	118	118
query83	323	280	239	239
query84	277	145	115	115
query85	976	523	426	426
query86	464	301	268	268
query87	3379	3360	3211	3211
query88	3716	2822	2783	2783
query89	430	379	334	334
query90	1931	182	172	172
query91	167	174	135	135
query92	62	63	58	58
query93	1636	1590	882	882
query94	759	364	294	294
query95	678	472	347	347
query96	1069	804	358	358
query97	2701	2715	2540	2540
query98	214	208	202	202
query99	1181	1136	1030	1030
Total cold run time: 263231 ms
Total hot run time: 174975 ms

@hello-stephen

Contributor
ClickBench: Total hot run time: 25.28 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 64cfc348938fa921cce5342e3c4e1fc4fb655029, data reload: false

query1	0.01	0.01	0.01
query2	0.10	0.04	0.05
query3	0.26	0.14	0.14
query4	1.62	0.14	0.14
query5	0.25	0.23	0.22
query6	1.23	1.03	1.12
query7	0.04	0.01	0.00
query8	0.06	0.04	0.03
query9	0.38	0.31	0.31
query10	0.58	0.56	0.55
query11	0.19	0.14	0.14
query12	0.18	0.15	0.14
query13	0.48	0.47	0.47
query14	1.04	1.01	1.01
query15	0.61	0.60	0.60
query16	0.32	0.33	0.31
query17	1.09	1.11	1.13
query18	0.23	0.22	0.21
query19	2.05	1.99	1.94
query20	0.02	0.01	0.01
query21	15.43	0.23	0.14
query22	4.88	0.05	0.05
query23	16.09	0.32	0.11
query24	2.96	0.43	0.31
query25	0.12	0.04	0.08
query26	0.80	0.20	0.16
query27	0.06	0.04	0.03
query28	3.51	0.91	0.56
query29	12.61	4.39	3.53
query30	0.27	0.14	0.16
query31	2.77	0.60	0.32
query32	3.22	0.59	0.48
query33	3.26	3.27	3.20
query34	15.54	4.22	3.52
query35	3.54	3.50	3.57
query36	0.54	0.43	0.42
query37	0.09	0.06	0.06
query38	0.06	0.04	0.04
query39	0.03	0.03	0.03
query40	0.17	0.15	0.14
query41	0.09	0.04	0.03
query42	0.05	0.03	0.03
query43	0.05	0.04	0.03
Total cold run time: 96.88 s
Total hot run time: 25.28 s

@0AyanamiRei

Contributor Author

run buildall

@hello-stephen

Contributor

BE Regression && UT Coverage Report

Increment line coverage 86.67% (13/15) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 74.11% (28361/38267)
Line Coverage 58.07% (309293/532630)
Region Coverage 54.85% (258890/471985)
Branch Coverage 56.23% (112392/199895)

@hello-stephen

Contributor
TPC-H: Total hot run time: 29036 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 89d5ff0572c79db50f8d4c38c046b87b9e3a00a4, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17620	4002	3939	3939
q2	1981	312	182	182
q3	10382	1383	808	808
q4	4675	468	344	344
q5	7516	863	570	570
q6	190	178	137	137
q7	762	819	639	639
q8	9431	1626	1635	1626
q9	6406	4544	4521	4521
q10	6828	1794	1506	1506
q11	428	272	241	241
q12	626	427	293	293
q13	18131	3376	2779	2779
q14	269	263	239	239
q15	q16	797	782	707	707
q17	1022	978	869	869
q18	6911	5936	5600	5600
q19	1313	1254	1024	1024
q20	495	402	265	265
q21	5925	2559	2446	2446
q22	448	363	301	301
Total cold run time: 102156 ms
Total hot run time: 29036 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4337	4286	4275	4275
q2	324	363	223	223
q3	4624	4995	4385	4385
q4	2073	2160	1376	1376
q5	4404	4323	4323	4323
q6	224	179	126	126
q7	1684	1692	1836	1692
q8	2507	2238	2127	2127
q9	8014	7959	7959	7959
q10	4802	4760	4246	4246
q11	588	407	381	381
q12	744	751	532	532
q13	3482	3614	3042	3042
q14	298	301	264	264
q15	q16	726	745	648	648
q17	1371	1335	1341	1335
q18	8081	7500	7029	7029
q19	1135	1105	1102	1102
q20	2225	2212	1950	1950
q21	5275	4659	4542	4542
q22	530	458	405	405
Total cold run time: 57448 ms
Total hot run time: 51962 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 175059 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 89d5ff0572c79db50f8d4c38c046b87b9e3a00a4, data reload: false

query5	4325	634	486	486
query6	425	190	177	177
query7	4833	525	298	298
query8	360	214	207	207
query9	8747	4089	4102	4089
query10	442	302	256	256
query11	5956	2334	2203	2203
query12	162	102	97	97
query13	1262	620	410	410
query14	6349	5388	5087	5087
query14_1	4412	4402	4402	4402
query15	210	202	179	179
query16	999	468	399	399
query17	1149	716	581	581
query18	2440	469	354	354
query19	207	186	149	149
query20	113	103	107	103
query21	213	135	116	116
query22	13612	13530	13317	13317
query23	17148	16633	16215	16215
query23_1	16299	16246	16287	16246
query24	7507	1744	1325	1325
query24_1	1338	1280	1318	1280
query25	527	435	370	370
query26	1311	332	165	165
query27	3083	562	350	350
query28	4497	2100	2022	2022
query29	1087	596	492	492
query30	313	235	196	196
query31	1093	1070	955	955
query32	105	59	56	56
query33	514	307	240	240
query34	1175	1153	646	646
query35	742	773	676	676
query36	1385	1458	1202	1202
query37	155	108	90	90
query38	3208	3135	3045	3045
query39	936	921	895	895
query39_1	878	875	861	861
query40	214	120	98	98
query41	66	62	61	61
query42	96	93	95	93
query43	315	325	276	276
query44	1414	759	776	759
query45	194	187	174	174
query46	1061	1216	752	752
query47	2426	2421	2248	2248
query48	412	409	306	306
query49	613	474	354	354
query50	1051	346	262	262
query51	4284	4381	4209	4209
query52	90	88	78	78
query53	253	273	193	193
query54	271	211	192	192
query55	79	77	70	70
query56	226	211	213	211
query57	1424	1433	1330	1330
query58	234	217	200	200
query59	1580	1681	1495	1495
query60	277	240	219	219
query61	183	146	143	143
query62	703	647	577	577
query63	235	190	199	190
query64	2516	771	609	609
query65	4869	4793	4742	4742
query66	1782	465	332	332
query67	29880	29885	29475	29475
query68	3178	1550	967	967
query69	410	302	258	258
query70	1061	965	951	951
query71	285	227	206	206
query72	2970	2681	2345	2345
query73	823	784	455	455
query74	5081	4933	4743	4743
query75	2626	2592	2230	2230
query76	2320	1176	802	802
query77	354	369	281	281
query78	12421	12435	11905	11905
query79	1202	1167	707	707
query80	531	482	383	383
query81	446	273	235	235
query82	236	157	118	118
query83	265	276	250	250
query84	259	143	115	115
query85	808	529	417	417
query86	325	317	298	298
query87	3358	3333	3116	3116
query88	3635	2783	2776	2776
query89	404	386	329	329
query90	2171	192	182	182
query91	167	156	127	127
query92	63	60	58	58
query93	1533	1425	935	935
query94	525	361	299	299
query95	674	391	342	342
query96	1124	800	372	372
query97	2728	2724	2582	2582
query98	217	202	196	196
query99	1150	1163	1032	1032
Total cold run time: 260098 ms
Total hot run time: 175059 ms

@hello-stephen

Contributor
ClickBench: Total hot run time: 25.35 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 89d5ff0572c79db50f8d4c38c046b87b9e3a00a4, data reload: false

query1	0.01	0.00	0.01
query2	0.09	0.04	0.05
query3	0.25	0.13	0.13
query4	1.61	0.15	0.14
query5	0.24	0.23	0.22
query6	1.26	1.13	1.08
query7	0.04	0.00	0.00
query8	0.05	0.04	0.04
query9	0.37	0.32	0.31
query10	0.55	0.54	0.54
query11	0.20	0.15	0.15
query12	0.19	0.15	0.15
query13	0.47	0.48	0.47
query14	1.00	1.01	0.99
query15	0.61	0.59	0.58
query16	0.31	0.31	0.31
query17	1.07	1.10	1.11
query18	0.23	0.21	0.21
query19	2.03	2.00	1.99
query20	0.02	0.01	0.02
query21	15.42	0.18	0.14
query22	4.96	0.06	0.06
query23	16.13	0.31	0.12
query24	3.04	0.42	0.33
query25	0.12	0.05	0.03
query26	0.73	0.20	0.16
query27	0.04	0.03	0.03
query28	3.58	0.92	0.54
query29	12.48	4.24	3.45
query30	0.27	0.16	0.15
query31	2.78	0.57	0.33
query32	3.22	0.60	0.50
query33	3.18	3.24	3.26
query34	15.47	4.24	3.51
query35	3.53	3.51	3.52
query36	0.55	0.44	0.42
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.04	0.03	0.03
query40	0.17	0.16	0.16
query41	0.09	0.04	0.03
query42	0.04	0.02	0.03
query43	0.04	0.04	0.03
Total cold run time: 96.62 s
Total hot run time: 25.35 s

@0AyanamiRei

Contributor Author

run buildall

@hello-stephen

Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/15) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 54.42% (21362/39251)
Line Coverage 38.06% (204306/536813)
Region Coverage 34.06% (160349/470723)
Branch Coverage 35.05% (70172/200193)

@0AyanamiRei

Contributor Author

run buildall

@hello-stephen

Contributor
TPC-H: Total hot run time: 29444 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1e5c1bb7db3eeefcd66806f861ef701c978a7eef, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17790	4035	4033	4033
q2	1997	325	186	186
q3	10315	1410	810	810
q4	4683	470	342	342
q5	7495	896	581	581
q6	178	166	134	134
q7	778	833	629	629
q8	9329	1548	1543	1543
q9	5766	4515	4527	4515
q10	6744	1806	1552	1552
q11	449	280	247	247
q12	628	411	288	288
q13	18147	3458	2864	2864
q14	263	263	233	233
q15	q16	782	776	709	709
q17	987	927	959	927
q18	7260	5755	5631	5631
q19	1308	1334	1101	1101
q20	486	403	271	271
q21	5961	2628	2546	2546
q22	448	360	302	302
Total cold run time: 101794 ms
Total hot run time: 29444 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4347	4283	4260	4260
q2	337	384	229	229
q3	4578	5023	4406	4406
q4	2065	2177	1392	1392
q5	4462	4275	4293	4275
q6	231	175	130	130
q7	1748	1842	1926	1842
q8	2624	2191	2411	2191
q9	8138	8392	8042	8042
q10	4778	4780	4323	4323
q11	596	445	401	401
q12	739	797	546	546
q13	3319	3602	2997	2997
q14	287	300	269	269
q15	q16	717	733	664	664
q17	1355	1333	1358	1333
q18	7846	7290	7399	7290
q19	1171	1151	1167	1151
q20	2190	2216	1953	1953
q21	5323	4640	4438	4438
q22	522	461	395	395
Total cold run time: 57373 ms
Total hot run time: 52527 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 176635 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1e5c1bb7db3eeefcd66806f861ef701c978a7eef, data reload: false

query5	4304	630	491	491
query6	437	188	174	174
query7	4808	546	314	314
query8	356	214	193	193
query9	8762	4149	4112	4112
query10	440	306	266	266
query11	5907	2345	2132	2132
query12	161	102	99	99
query13	1243	625	438	438
query14	6416	5417	5115	5115
query14_1	4428	4366	4402	4366
query15	206	207	178	178
query16	1036	515	456	456
query17	1134	720	570	570
query18	2745	489	364	364
query19	207	187	151	151
query20	113	107	104	104
query21	218	145	121	121
query22	13702	13517	13334	13334
query23	17431	16522	16219	16219
query23_1	16357	16297	16360	16297
query24	7468	1788	1318	1318
query24_1	1347	1307	1335	1307
query25	595	461	415	415
query26	1321	326	179	179
query27	2580	518	344	344
query28	4380	2051	2007	2007
query29	1104	632	495	495
query30	314	248	197	197
query31	1121	1083	960	960
query32	100	62	60	60
query33	525	315	289	289
query34	1169	1149	646	646
query35	743	804	677	677
query36	1428	1402	1252	1252
query37	151	102	88	88
query38	3176	3143	3058	3058
query39	932	938	897	897
query39_1	881	879	892	879
query40	216	119	98	98
query41	64	64	61	61
query42	94	94	96	94
query43	335	332	280	280
query44	1402	768	789	768
query45	196	185	175	175
query46	1096	1211	752	752
query47	2405	2347	2272	2272
query48	413	401	303	303
query49	611	458	345	345
query50	1051	356	257	257
query51	4301	4461	4240	4240
query52	89	91	78	78
query53	251	271	197	197
query54	266	222	196	196
query55	78	75	69	69
query56	225	226	225	225
query57	1435	1393	1293	1293
query58	232	214	210	210
query59	1568	1643	1482	1482
query60	301	235	238	235
query61	157	151	147	147
query62	706	649	586	586
query63	231	189	203	189
query64	2494	806	619	619
query65	4817	4771	4803	4771
query66	1721	449	333	333
query67	29700	29608	29464	29464
query68	3251	1549	916	916
query69	410	305	260	260
query70	1045	981	965	965
query71	300	234	214	214
query72	2995	2659	2431	2431
query73	858	777	460	460
query74	5088	4987	4746	4746
query75	2626	2607	2240	2240
query76	2323	1204	807	807
query77	362	381	287	287
query78	12423	12425	11849	11849
query79	1349	1233	780	780
query80	662	478	379	379
query81	464	280	239	239
query82	557	158	121	121
query83	338	279	256	256
query84	270	142	113	113
query85	865	517	407	407
query86	378	302	289	289
query87	3402	3345	3229	3229
query88	3665	2769	2756	2756
query89	416	380	328	328
query90	1953	181	180	180
query91	172	169	134	134
query92	64	64	55	55
query93	1505	1419	863	863
query94	563	336	348	336
query95	673	375	339	339
query96	1085	813	379	379
query97	2700	2713	2557	2557
query98	224	212	199	199
query99	1154	1196	1031	1031
Total cold run time: 260590 ms
Total hot run time: 176635 ms

@hello-stephen

Contributor
ClickBench: Total hot run time: 25.23 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1e5c1bb7db3eeefcd66806f861ef701c978a7eef, data reload: false

query1	0.01	0.00	0.00
query2	0.09	0.06	0.05
query3	0.25	0.14	0.14
query4	1.61	0.14	0.14
query5	0.24	0.23	0.21
query6	1.25	1.11	1.06
query7	0.04	0.01	0.01
query8	0.05	0.04	0.04
query9	0.38	0.32	0.31
query10	0.59	0.54	0.54
query11	0.21	0.14	0.14
query12	0.20	0.15	0.14
query13	0.47	0.46	0.48
query14	1.03	1.00	1.00
query15	0.61	0.60	0.59
query16	0.32	0.32	0.33
query17	1.10	1.05	1.06
query18	0.22	0.21	0.22
query19	2.07	1.88	1.94
query20	0.01	0.01	0.02
query21	15.45	0.22	0.13
query22	4.84	0.05	0.06
query23	16.13	0.30	0.12
query24	3.01	0.42	0.34
query25	0.12	0.05	0.05
query26	0.73	0.21	0.15
query27	0.04	0.04	0.04
query28	3.55	0.90	0.54
query29	12.52	4.32	3.47
query30	0.28	0.15	0.17
query31	2.77	0.59	0.32
query32	3.21	0.60	0.49
query33	3.25	3.20	3.19
query34	15.56	4.20	3.58
query35	3.56	3.51	3.53
query36	0.55	0.46	0.44
query37	0.08	0.07	0.06
query38	0.05	0.04	0.04
query39	0.04	0.03	0.03
query40	0.18	0.17	0.15
query41	0.09	0.04	0.03
query42	0.04	0.03	0.02
query43	0.04	0.04	0.03
Total cold run time: 96.84 s
Total hot run time: 25.23 s

@hello-stephen

Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/15) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 54.42% (21361/39252)
Line Coverage 38.02% (204129/536829)
Region Coverage 34.03% (160171/470729)
Branch Coverage 35.04% (70152/200199)

@hello-stephen

Contributor

BE Regression && UT Coverage Report

Increment line coverage 86.67% (13/15) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 74.19% (28435/38326)
Line Coverage 58.09% (309993/533677)
Region Coverage 54.75% (258878/472872)
Branch Coverage 56.14% (112542/200454)

@hello-stephen

Contributor
TPC-H: Total hot run time: 29298 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 4110b3cde6f29da6be25d8d2be197c079e1a84d9, data reload: false

------ Round 1 ----------------------------------
============================================
q1	17634	4071	4031	4031
q2	2001	311	195	195
q3	10339	1482	824	824
q4	4676	463	334	334
q5	7546	857	583	583
q6	183	172	136	136
q7	767	841	624	624
q8	9311	1614	1557	1557
q9	5778	4511	4536	4511
q10	6794	1818	1564	1564
q11	428	276	251	251
q12	632	423	287	287
q13	18123	3451	2737	2737
q14	268	266	239	239
q15	q16	780	781	707	707
q17	980	975	1056	975
q18	7196	5757	5703	5703
q19	1312	1366	1054	1054
q20	530	410	272	272
q21	5948	2611	2414	2414
q22	425	361	300	300
Total cold run time: 101651 ms
Total hot run time: 29298 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
q1	4384	4281	4295	4281
q2	335	346	236	236
q3	4608	4980	4418	4418
q4	2166	2166	1386	1386
q5	4468	4288	4345	4288
q6	241	178	128	128
q7	1729	1766	1958	1766
q8	2616	2185	2173	2173
q9	8186	8287	8064	8064
q10	4856	4774	4256	4256
q11	579	420	394	394
q12	738	767	554	554
q13	3306	3628	3043	3043
q14	303	306	281	281
q15	q16	723	733	651	651
q17	1331	1319	1303	1303
q18	7873	7355	7200	7200
q19	1180	1160	1105	1105
q20	2212	2217	1933	1933
q21	5284	4579	4462	4462
q22	525	472	388	388
Total cold run time: 57643 ms
Total hot run time: 52310 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 175810 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 4110b3cde6f29da6be25d8d2be197c079e1a84d9, data reload: false

query5	4327	628	470	470
query6	449	190	191	190
query7	4879	567	306	306
query8	360	214	204	204
query9	8735	4022	4033	4022
query10	443	319	257	257
query11	5900	2385	2143	2143
query12	155	105	104	104
query13	1288	624	426	426
query14	6409	5404	5078	5078
query14_1	4385	4387	4370	4370
query15	204	210	183	183
query16	1010	456	443	443
query17	947	710	589	589
query18	2508	487	350	350
query19	208	188	149	149
query20	112	115	113	113
query21	218	135	117	117
query22	13594	13563	13385	13385
query23	17452	16587	16112	16112
query23_1	16303	16422	16257	16257
query24	7501	1777	1312	1312
query24_1	1347	1317	1339	1317
query25	570	462	402	402
query26	1313	297	172	172
query27	2724	592	344	344
query28	4462	2041	2058	2041
query29	1113	634	503	503
query30	311	241	201	201
query31	1122	1081	962	962
query32	119	66	63	63
query33	534	321	292	292
query34	1172	1214	635	635
query35	741	779	679	679
query36	1404	1359	1259	1259
query37	153	103	87	87
query38	3195	3112	3060	3060
query39	931	915	914	914
query39_1	890	873	889	873
query40	219	120	95	95
query41	62	59	59	59
query42	94	93	95	93
query43	321	317	272	272
query44	1418	767	768	767
query45	189	195	178	178
query46	1141	1173	725	725
query47	2325	2329	2235	2235
query48	403	397	302	302
query49	622	457	351	351
query50	1055	355	257	257
query51	4274	4289	4257	4257
query52	86	86	76	76
query53	251	271	199	199
query54	283	223	199	199
query55	79	72	70	70
query56	242	224	214	214
query57	1418	1440	1317	1317
query58	239	207	215	207
query59	1553	1640	1437	1437
query60	278	254	215	215
query61	156	151	155	151
query62	685	655	561	561
query63	230	193	196	193
query64	2525	767	642	642
query65	4785	4800	4833	4800
query66	1822	468	338	338
query67	29737	29722	29571	29571
query68	3314	1614	933	933
query69	419	297	270	270
query70	1104	1014	918	918
query71	296	230	208	208
query72	2899	2632	2363	2363
query73	846	737	441	441
query74	5173	4985	4755	4755
query75	2618	2605	2229	2229
query76	2339	1193	797	797
query77	354	381	283	283
query78	12385	12560	11972	11972
query79	1367	1181	792	792
query80	1277	471	378	378
query81	523	281	235	235
query82	603	156	118	118
query83	357	273	250	250
query84	313	144	116	116
query85	910	520	412	412
query86	419	315	283	283
query87	3381	3336	3190	3190
query88	3673	2811	2740	2740
query89	428	383	337	337
query90	1913	185	178	178
query91	174	164	134	134
query92	60	62	56	56
query93	1574	1449	919	919
query94	729	322	269	269
query95	693	386	340	340
query96	1048	860	341	341
query97	2743	2719	2570	2570
query98	213	203	202	202
query99	1184	1182	1067	1067
Total cold run time: 261681 ms
Total hot run time: 175810 ms

@hello-stephen

Contributor
ClickBench: Total hot run time: 25.25 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 4110b3cde6f29da6be25d8d2be197c079e1a84d9, data reload: false

query1	0.00	0.00	0.01
query2	0.09	0.05	0.04
query3	0.25	0.14	0.14
query4	1.60	0.14	0.14
query5	0.25	0.23	0.21
query6	1.24	1.06	1.05
query7	0.03	0.01	0.00
query8	0.06	0.06	0.03
query9	0.38	0.31	0.32
query10	0.54	0.57	0.59
query11	0.19	0.14	0.14
query12	0.18	0.15	0.14
query13	0.47	0.48	0.47
query14	1.03	0.99	1.00
query15	0.62	0.58	0.61
query16	0.32	0.31	0.33
query17	1.10	1.09	1.13
query18	0.22	0.22	0.20
query19	2.01	1.96	1.90
query20	0.02	0.01	0.02
query21	15.44	0.20	0.13
query22	4.96	0.06	0.06
query23	16.22	0.31	0.12
query24	2.94	0.43	0.31
query25	0.12	0.05	0.04
query26	0.72	0.19	0.16
query27	0.05	0.03	0.04
query28	3.56	0.93	0.57
query29	12.48	4.29	3.47
query30	0.27	0.16	0.15
query31	2.77	0.60	0.31
query32	3.22	0.60	0.50
query33	3.26	3.21	3.26
query34	15.67	4.22	3.53
query35	3.56	3.57	3.53
query36	0.54	0.44	0.43
query37	0.09	0.06	0.07
query38	0.05	0.04	0.04
query39	0.04	0.03	0.03
query40	0.18	0.16	0.16
query41	0.08	0.03	0.03
query42	0.04	0.03	0.03
query43	0.04	0.03	0.03
Total cold run time: 96.9 s
Total hot run time: 25.25 s

@sollhui

sollhui commented Jun 22, 2026

Contributor

/review

@github-actions github-actions Bot left a comment

Contributor

I found one issue: this concurrency fix does not add or cite targeted coverage for the add_batch/incremental_open race it fixes.

Critical checkpoint conclusions:

  • Goal/code: the implementation matches the stated goal by protecting _tablet_writers lookups and avoiding use of unordered_map iterators after releasing _tablet_writers_lock.
  • Scope: the change is small and focused in be/src/load/channel/tablets_channel.cpp.
  • Concurrency: the changed map accesses are now locked, writer operations remain outside the map lock, and I did not find a new lock-order or writer-lifetime regression. The raw writer pointer remains stable across unordered_map rehash because the map stores unique_ptr values and entries are not erased during channel lifetime.
  • Lifecycle/config/compatibility: no new config, protocol, persistence, or static lifecycle behavior is introduced.
  • Parallel paths: the shared _write_block_data path is used by local and cloud tablets channels; no optimizer/rewrite path is touched.
  • Tests: there is no targeted test for the fixed race, and the cited LoadStreamMapPoolTest.* does not exercise this code path. This is the inline comment below.
  • Style/CI: git diff --check is clean locally; GitHub reports formatter, compile, and BE UT success. Two regression contexts are failing, but I did not find evidence tying those failures to this one-file BE concurrency change.
  • Observability/performance: no new observability appears necessary; the added lock scope is limited to map lookup.
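
The pointer-stability argument in the Concurrency checkpoint can be checked in isolation. The following is a small sketch with hypothetical names (`Writer`, `pointer_survives_rehash`), not code from the PR. It relies on the standard guarantee that rehashing a `std::unordered_map` invalidates iterators but not pointers or references to elements, and on the mapped `unique_ptr` owning a separate heap object.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_map>

// Hypothetical stand-in for a tablet writer.
struct Writer {
    int id;
};

// Returns true when the raw pointer taken before the map grew still points
// at the same writer after the map has rehashed.
bool pointer_survives_rehash() {
    std::unordered_map<int, std::unique_ptr<Writer>> writers;
    writers.emplace(0, std::make_unique<Writer>(Writer{0}));

    Writer* before = writers.at(0).get();
    const std::size_t buckets_before = writers.bucket_count();

    // Grow the map far beyond its initial bucket count to force rehashes.
    for (int i = 1; i < 10000; ++i) {
        writers.emplace(i, std::make_unique<Writer>(Writer{i}));
    }

    const bool rehashed = writers.bucket_count() != buckets_before;
    Writer* after = writers.at(0).get();
    return rehashed && before == after && before->id == 0;
}
```

An iterator obtained before the loop would not be safe to use afterwards, which is why the sketch keeps only the raw pointer. The guarantee does not cover erased entries.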

User focus: no additional user-provided review focus was supplied.

Subagent conclusions: optimizer-rewrite found no applicable optimizer/rewrite issue and converged with NO_NEW_VALUABLE_FINDINGS. tests-session-config proposed the missing targeted coverage candidate, which I verified and accepted as the inline comment. Convergence round 1 ended with both live subagents reporting NO_NEW_VALUABLE_FINDINGS for the same ledger/comment set.
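
As a rough idea of what targeted coverage for this race could look like, here is a standalone stress sketch. It is an assumption-laden illustration and does not use the Doris test framework or classes: `Writer`, `WriterMap`, `open`, `find`, and `run_race_stress` are hypothetical names. One thread keeps inserting new entries, which forces rehashes, while another keeps looking up and updating an existing entry.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical stand-in for a tablet writer with a thread-safe counter.
struct Writer {
    std::atomic<int64_t> rows{0};
};

// Hypothetical stand-in for the locked writer map.
struct WriterMap {
    std::mutex lock;
    std::unordered_map<int64_t, std::unique_ptr<Writer>> writers;

    void open(int64_t id) {
        std::lock_guard<std::mutex> guard(lock);
        writers.emplace(id, std::make_unique<Writer>());
    }

    Writer* find(int64_t id) {
        std::lock_guard<std::mutex> guard(lock);
        auto it = writers.find(id);
        return it == writers.end() ? nullptr : it->second.get();
    }
};

// Runs an inserting thread and a lookup thread concurrently and returns the
// number of rows recorded for tablet 0.
int64_t run_race_stress(int inserts, int lookups) {
    WriterMap channel;
    channel.open(0);

    std::thread opener([&] {
        for (int i = 1; i <= inserts; ++i) {
            channel.open(i);  // grows the map, forcing rehashes
        }
    });
    std::thread writer([&] {
        for (int i = 0; i < lookups; ++i) {
            Writer* w = channel.find(0);  // locked lookup
            if (w != nullptr) {
                w->rows.fetch_add(1);  // runs outside the map lock
            }
        }
    });

    opener.join();
    writer.join();

    Writer* w0 = channel.find(0);
    assert(w0 != nullptr);
    return w0->rows.load();
}
```

With the lock in `find`, every lookup succeeds and `run_race_stress(n, m)` returns `m`. If the lock were removed from `find`, the program would have a data race and undefined behavior; a race detector such as ThreadSanitizer would be the natural way to make such a regression visible.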

Comment thread be/src/load/channel/tablets_channel.cpp
@0AyanamiRei

Contributor Author

run buildall

@sollhui sollhui left a comment

Contributor

LGTM

@github-actions github-actions Bot added the "approved" label (Indicates a PR has been approved by one committer.) Jun 23, 2026
@github-actions

Contributor

PR approved by at least one committer and no changes requested.

@hello-stephen

Contributor
TPC-H: Total hot run time: 29519 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 538a382306a0cade616aa22f0a145ac26ef840e9, data reload: false

------ Round 1 ----------------------------------
============================================
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17652	3977	3958	3958
q2	2004	337	189	189
q3	10278	1471	841	841
q4	4697	478	345	345
q5	7543	866	558	558
q6	182	170	140	140
q7	808	878	650	650
q8	9326	1619	1614	1614
q9	5911	4517	4539	4517
q10	6727	1812	1539	1539
q11	440	275	243	243
q12	642	420	291	291
q13	18102	3375	2807	2807
q14	267	261	243	243
q15
q16	784	780	711	711
q17	944	975	927	927
q18	7162	5777	5771	5771
q19	1317	1287	1121	1121
q20	478	404	261	261
q21	5895	2677	2491	2491
q22	437	355	302	302
Total cold run time: 101596 ms
Total hot run time: 29519 ms

----- Round 2, with runtime_filter_mode=off -----
============================================
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	4318	4301	4307	4301
q2	342	362	228	228
q3	4612	4992	4393	4393
q4	2081	2185	1371	1371
q5	4444	4359	4343	4343
q6	237	172	127	127
q7	1742	1704	1955	1704
q8	2634	2218	2243	2218
q9	8152	8522	8100	8100
q10	5072	4771	4354	4354
q11	596	437	396	396
q12	775	779	541	541
q13	3289	3615	2989	2989
q14	293	307	270	270
q15
q16	698	731	661	661
q17	1363	1342	1322	1322
q18	8049	7216	7174	7174
q19	1201	1128	1103	1103
q20	2226	2221	1944	1944
q21	5315	4611	4465	4465
q22	525	453	387	387
Total cold run time: 57964 ms
Total hot run time: 52391 ms

@hello-stephen

Contributor
TPC-DS: Total hot run time: 173500 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 538a382306a0cade616aa22f0a145ac26ef840e9, data reload: false

query5	4326	635	490	490
query6	447	190	173	173
query7	4917	542	314	314
query8	363	214	200	200
query9	8780	4121	4121	4121
query10	454	314	257	257
query11	5753	2351	2129	2129
query12	148	97	95	95
query13	1289	620	431	431
query14	6373	5398	5106	5106
query14_1	4392	4376	4387	4376
query15	202	195	175	175
query16	993	452	426	426
query17	901	690	546	546
query18	2496	465	331	331
query19	196	181	145	145
query20	112	109	109	109
query21	217	143	119	119
query22	13605	13703	13435	13435
query23	17450	16618	16296	16296
query23_1	16423	16379	16232	16232
query24	7517	1794	1328	1328
query24_1	1344	1334	1335	1334
query25	567	476	392	392
query26	1295	320	176	176
query27	2741	544	364	364
query28	4546	2045	2047	2045
query29	1075	635	494	494
query30	314	238	205	205
query31	1156	1068	945	945
query32	112	63	60	60
query33	548	335	257	257
query34	1230	1136	641	641
query35	746	779	695	695
query36	1354	1407	1193	1193
query37	173	104	90	90
query38	1895	1717	1678	1678
query39	922	919	877	877
query39_1	889	882	867	867
query40	215	125	98	98
query41	64	63	62	62
query42	87	85	87	85
query43	337	330	284	284
query44	1448	786	768	768
query45	190	182	173	173
query46	1103	1233	751	751
query47	2377	2324	2230	2230
query48	409	401	295	295
query49	620	477	349	349
query50	976	359	261	261
query51	4414	4310	4249	4249
query52	78	81	69	69
query53	255	265	185	185
query54	279	213	189	189
query55	74	68	70	68
query56	234	218	211	211
query57	1421	1423	1317	1317
query58	246	216	208	208
query59	1633	1661	1534	1534
query60	286	262	216	216
query61	152	149	154	149
query62	702	644	590	590
query63	231	203	193	193
query64	2543	765	624	624
query65	4826	4792	4805	4792
query66	1800	473	336	336
query67	29909	29790	29717	29717
query68	3163	1582	992	992
query69	416	318	264	264
query70	1056	983	972	972
query71	288	263	200	200
query72	2934	2709	2334	2334
query73	870	783	459	459
query74	5116	5000	4749	4749
query75	2637	2624	2242	2242
query76	2323	1189	805	805
query77	355	383	288	288
query78	12287	12546	11860	11860
query79	1237	1184	797	797
query80	541	478	402	402
query81	453	294	244	244
query82	303	159	133	133
query83	366	275	250	250
query84	314	159	127	127
query85	869	512	441	441
query86	384	301	293	293
query87	1835	1831	1778	1778
query88	3741	2792	2796	2792
query89	422	369	327	327
query90	1875	192	189	189
query91	178	166	142	142
query92	61	63	55	55
query93	1572	1440	831	831
query94	542	352	310	310
query95	688	379	435	379
query96	1123	896	373	373
query97	2701	2726	2589	2589
query98	209	207	200	200
query99	1194	1167	1007	1007
Total cold run time: 257591 ms
Total hot run time: 173500 ms

@hello-stephen

Contributor
ClickBench: Total hot run time: 25.33 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 538a382306a0cade616aa22f0a145ac26ef840e9, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.00	0.00	0.00
query2	0.10	0.05	0.05
query3	0.26	0.14	0.14
query4	1.61	0.15	0.14
query5	0.24	0.22	0.25
query6	1.22	1.07	1.12
query7	0.04	0.01	0.01
query8	0.10	0.04	0.04
query9	0.41	0.33	0.31
query10	0.56	0.56	0.54
query11	0.20	0.14	0.15
query12	0.18	0.15	0.14
query13	0.48	0.47	0.47
query14	1.02	1.01	1.02
query15	0.61	0.60	0.60
query16	0.30	0.32	0.33
query17	1.19	1.16	1.17
query18	0.23	0.23	0.22
query19	2.02	1.93	1.97
query20	0.02	0.01	0.01
query21	15.43	0.22	0.13
query22	4.90	0.06	0.05
query23	16.14	0.31	0.12
query24	2.94	0.44	0.34
query25	0.11	0.06	0.03
query26	0.72	0.21	0.15
query27	0.04	0.03	0.04
query28	3.50	0.88	0.52
query29	12.49	4.41	3.48
query30	0.27	0.15	0.17
query31	2.76	0.63	0.32
query32	3.22	0.60	0.49
query33	3.18	3.13	3.21
query34	15.71	4.28	3.51
query35	3.56	3.56	3.59
query36	0.56	0.44	0.43
query37	0.08	0.06	0.06
query38	0.06	0.04	0.04
query39	0.04	0.03	0.04
query40	0.17	0.16	0.15
query41	0.08	0.03	0.03
query42	0.04	0.02	0.02
query43	0.04	0.04	0.04
Total cold run time: 96.83 s
Total hot run time: 25.33 s

@hello-stephen

Contributor

BE UT Coverage Report

Increment line coverage 0.00% (0/15) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 54.57% (21418/39252)
Line Coverage 38.05% (204263/536779)
Region Coverage 34.06% (160324/470737)
Branch Coverage 35.05% (70171/200216)

@hello-stephen

Contributor

BE Regression && UT Coverage Report

Increment line coverage 86.67% (13/15) 🎉

Increment coverage report
Complete coverage report

Category Coverage
Function Coverage 74.20% (28433/38319)
Line Coverage 58.03% (309677/533616)
Region Coverage 54.83% (259263/472865)
Branch Coverage 56.16% (112574/200467)

@liaoxin01

Contributor

/review

@github-actions github-actions Bot left a comment

Contributor

Reviewed the BaseTabletsChannel::_write_block_data lock fix and the surrounding load-channel paths.

I did not find additional correctness issues beyond the existing inline thread about missing targeted coverage for the concurrent incremental_open path. The failed macOS BE UT check appears to be a runner setup issue (JAVA version is 25, it must be JDK-17) rather than a PR code failure; the main compile and BE UT status contexts are green.

Critical checkpoint conclusions:

  • Goal and behavior: the goal is clear, and the code addresses the std::unordered_map rehash/iterator race by copying the writer pointer while holding _tablet_writers_lock.
  • Scope: the modification is small and focused on the unsafe _tablet_writers lookups.
  • Concurrency and lifecycle: this path is concurrent with incremental_open; map access is now protected while writer work remains outside the map lock. Writer lifetime remains stable through _tablet_writers ownership and LoadChannel shared ownership during add_batch.
  • Parallel paths: the cloud add-batch path and shared _write_block_data path were checked; no separate uncovered lock issue was found.
  • Config, compatibility, persistence, FE-BE variables, and documentation: no new surface is introduced.
  • Observability and performance: unchanged and reasonable for this narrow fix; no heavy work was added under _tablet_writers_lock.
  • Tests: the race-specific coverage gap remains the already submitted inline concern at line 632; I did not duplicate it.
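
As a rough idea of what targeted coverage for the concurrent open/write path could look like, here is a self-contained stress sketch. It is not a Doris unit test: `FakeWriter`, `FakeChannel`, and `run_race_stress` are hypothetical names, and `incremental_open` here is only a stand-in that inserts into a locked map. One thread keeps inserting new tablets (forcing rehashes) while the caller repeatedly looks up and writes to an existing tablet.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <memory>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical writer; an atomic counter stands in for real write work.
struct FakeWriter {
    std::atomic<int64_t> rows{0};
};

// Hypothetical channel using the locked-lookup pattern from this PR.
class FakeChannel {
public:
    // Stand-in for the insert path; may trigger a rehash.
    void incremental_open(int64_t id) {
        std::lock_guard<std::mutex> g(_lock);
        _writers.emplace(id, std::make_unique<FakeWriter>());
    }

    // Lookup under the lock, writer work outside it.
    bool write(int64_t id) {
        FakeWriter* w = nullptr;
        {
            std::lock_guard<std::mutex> g(_lock);
            auto it = _writers.find(id);
            if (it != _writers.end()) w = it->second.get();
        }
        if (w == nullptr) return false;
        w->rows.fetch_add(1, std::memory_order_relaxed);
        return true;
    }

    int64_t rows(int64_t id) {
        std::lock_guard<std::mutex> g(_lock);
        auto it = _writers.find(id);
        return it == _writers.end() ? -1 : it->second->rows.load();
    }

private:
    std::mutex _lock;
    std::unordered_map<int64_t, std::unique_ptr<FakeWriter>> _writers;
};

// Writes to tablet 0 while another thread opens `new_tablets` more tablets.
// Returns how many writes succeeded.
int64_t run_race_stress(int64_t new_tablets, int64_t writes) {
    FakeChannel ch;
    ch.incremental_open(0);  // tablet 0 exists before the race starts
    std::thread opener([&ch, new_tablets] {
        for (int64_t id = 1; id <= new_tablets; ++id) ch.incremental_open(id);
    });
    int64_t ok = 0;
    for (int64_t i = 0; i < writes; ++i) ok += ch.write(0) ? 1 : 0;
    opener.join();
    assert(ch.rows(0) == ok);  // every successful write was recorded
    return ok;
}
```

A test along these lines is most useful under ThreadSanitizer: with the lock removed from the lookup in `write`, the concurrent `find` and `emplace` would be expected to show up as a data race, whereas a plain run may pass by luck.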

User focus: no additional user-provided focus points were present.

Subagent conclusions:

  • optimizer-rewrite: no candidates; optimizer/rewrite focus is not directly applicable, and the analogous semantic/parallel-path review found no new issue.
  • tests-session-config: no candidates; the only coverage concern found is already covered by the existing inline thread.
  • Convergence round 1 ended with both subagents replying NO_NEW_VALUABLE_FINDINGS for the same ledger/comment set.

@sollhui sollhui merged commit 30d27ca into apache:master Jun 23, 2026
34 of 35 checks passed
github-actions Bot pushed a commit that referenced this pull request Jun 23, 2026
### What problem does this PR solve?

Issue Number: None

Related PR: #57133

Problem Summary:

`BaseTabletsChannel::_write_block_data` can run concurrently with
`incremental_open` for the same tablets channel. `_tablet_writers` is an
`std::unordered_map` protected by `_tablet_writers_lock` when writers
are inserted, but the tablet load rowset info lookup read the map
without holding the lock.

A concurrent `emplace` may rehash `_tablet_writers`, so the unlocked
lookup can race with bucket reallocation. This patch protects the lookup
with `_tablet_writers_lock` and avoids using unordered_map iterators
after the lock is released. The actual writer operations still run
outside `_tablet_writers_lock`, so the lock remains scoped to the map
access.
github-actions Bot pushed a commit that referenced this pull request Jun 23, 2026
yiguolei pushed a commit that referenced this pull request Jun 24, 2026
…#64604 (#64752)

Cherry-picked from #64604

Co-authored-by: Refrain <refrain520cc@gmail.com>

Labels

approved (Indicates a PR has been approved by one committer.)
dev/4.0.x
dev/4.1.3-merged

5 participants