[fix](be) Protect tablet writer map lookup in load channel#64604
Conversation
### What problem does this PR solve? Issue Number: None Related PR: apache#57133 Problem Summary: In BaseTabletsChannel::_write_block_data, the add_batch path may run concurrently with incremental_open for the same tablets channel. The writer map is an std::unordered_map protected by _tablet_writers_lock when writers are inserted, but one lookup added for tablet load rowset info read the map without the lock. Concurrent emplace may rehash the map and invalidate the buckets being read. This patch protects that lookup with _tablet_writers_lock and avoids using unordered_map iterators after the lock is released. ### Release note None ### Check List (For Author) - Test: - Unit Test: ./run-be-ut.sh --run --filter=LoadStreamMapPoolTest.* - Manual test: ./build-support/check-format.sh - Manual test: git diff --check - Manual test: ./build-support/run-clang-tidy.sh --build-dir be/build_Debug (failed to analyze because the local compile database/PCH reported malformed or corrupted AST file) - Behavior changed: No - Does this need documentation: No
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
|
run buildall |
TPC-H: Total hot run time: 29572 ms |
TPC-DS: Total hot run time: 174975 ms |
ClickBench: Total hot run time: 25.28 s |
|
run buildall |
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 29036 ms |
TPC-DS: Total hot run time: 175059 ms |
ClickBench: Total hot run time: 25.35 s |
|
run buildall |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
|
run buildall |
TPC-H: Total hot run time: 29444 ms |
TPC-DS: Total hot run time: 176635 ms |
ClickBench: Total hot run time: 25.23 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
TPC-H: Total hot run time: 29298 ms |
TPC-DS: Total hot run time: 175810 ms |
ClickBench: Total hot run time: 25.25 s |
|
/review |
There was a problem hiding this comment.
I found one issue: this concurrency fix does not add or cite targeted coverage for the add_batch/incremental_open race it fixes.
Critical checkpoint conclusions:
- Goal/code: the implementation matches the stated goal by protecting
_tablet_writerslookups and avoiding use of unordered_map iterators after releasing_tablet_writers_lock. - Scope: the change is small and focused in
be/src/load/channel/tablets_channel.cpp. - Concurrency: the changed map accesses are now locked, writer operations remain outside the map lock, and I did not find a new lock-order or writer-lifetime regression. The raw writer pointer remains stable across unordered_map rehash because the map stores
unique_ptrvalues and entries are not erased during channel lifetime. - Lifecycle/config/compatibility: no new config, protocol, persistence, or static lifecycle behavior is introduced.
- Parallel paths: the shared
_write_block_datapath is used by local and cloud tablets channels; no optimizer/rewrite path is touched. - Tests: there is no targeted test for the fixed race, and the cited
LoadStreamMapPoolTest.*does not exercise this code path. This is the inline comment below. - Style/CI:
git diff --checkis clean locally; GitHub reports formatter, compile, and BE UT success. Two regression contexts are failing, but I did not find evidence tying those failures to this one-file BE concurrency change. - Observability/performance: no new observability appears necessary; the added lock scope is limited to map lookup.
User focus: no additional user-provided review focus was supplied.
Subagent conclusions: optimizer-rewrite found no applicable optimizer/rewrite issue and converged with NO_NEW_VALUABLE_FINDINGS. tests-session-config proposed the missing targeted coverage candidate, which I verified and accepted as the inline comment. Convergence round 1 ended with both live subagents reporting NO_NEW_VALUABLE_FINDINGS for the same ledger/comment set.
|
run buildall |
|
PR approved by at least one committer and no changes requested. |
TPC-H: Total hot run time: 29519 ms |
TPC-DS: Total hot run time: 173500 ms |
ClickBench: Total hot run time: 25.33 s |
BE UT Coverage ReportIncrement line coverage Increment coverage report
|
BE Regression && UT Coverage ReportIncrement line coverage Increment coverage report
|
|
/review |
There was a problem hiding this comment.
Reviewed the BaseTabletsChannel::_write_block_data lock fix and the surrounding load-channel paths.
I did not find additional correctness issues beyond the existing inline thread about missing targeted coverage for the concurrent incremental_open path. The failed macOS BE UT check appears to be a runner setup issue (JAVA version is 25, it must be JDK-17) rather than a PR code failure; the main compile and BE UT status contexts are green.
Critical checkpoint conclusions:
- Goal and behavior: the goal is clear, and the code addresses the
std::unordered_maprehash/iterator race by copying the writer pointer while holding_tablet_writers_lock. - Scope: the modification is small and focused on the unsafe
_tablet_writerslookups. - Concurrency and lifecycle: this path is concurrent with
incremental_open; map access is now protected while writer work remains outside the map lock. Writer lifetime remains stable through_tablet_writersownership andLoadChannelshared ownership duringadd_batch. - Parallel paths: the cloud add-batch path and shared
_write_block_datapath were checked; no separate uncovered lock issue was found. - Config, compatibility, persistence, FE-BE variables, and documentation: no new surface is introduced.
- Observability and performance: unchanged and reasonable for this narrow fix; no heavy work was added under
_tablet_writers_lock. - Tests: the race-specific coverage gap remains the already submitted inline concern at line 632; I did not duplicate it.
User focus: no additional user-provided focus points were present.
Subagent conclusions:
optimizer-rewrite: no candidates; optimizer/rewrite focus is not directly applicable, and the analogous semantic/parallel-path review found no new issue.tests-session-config: no candidates; the only coverage concern found is already covered by the existing inline thread.- Convergence round 1 ended with both subagents replying
NO_NEW_VALUABLE_FINDINGSfor the same ledger/comment set.
### What problem does this PR solve? Issue Number: None Related PR: #57133 Problem Summary: `BaseTabletsChannel::_write_block_data` can run concurrently with `incremental_open` for the same tablets channel. `_tablet_writers` is an `std::unordered_map` protected by `_tablet_writers_lock` when writers are inserted, but the tablet load rowset info lookup read the map without holding the lock. A concurrent `emplace` may rehash `_tablet_writers`, so the unlocked lookup can race with bucket reallocation. This patch protects the lookup with `_tablet_writers_lock` and avoids using unordered_map iterators after the lock is released. The actual writer operations still run outside `_tablet_writers_lock`, so the lock remains scoped to the map access.
### What problem does this PR solve? Issue Number: None Related PR: #57133 Problem Summary: `BaseTabletsChannel::_write_block_data` can run concurrently with `incremental_open` for the same tablets channel. `_tablet_writers` is an `std::unordered_map` protected by `_tablet_writers_lock` when writers are inserted, but the tablet load rowset info lookup read the map without holding the lock. A concurrent `emplace` may rehash `_tablet_writers`, so the unlocked lookup can race with bucket reallocation. This patch protects the lookup with `_tablet_writers_lock` and avoids using unordered_map iterators after the lock is released. The actual writer operations still run outside `_tablet_writers_lock`, so the lock remains scoped to the map access.
What problem does this PR solve?
Issue Number: None
Related PR: #57133
Problem Summary:
BaseTabletsChannel::_write_block_datacan run concurrently withincremental_openfor the same tablets channel._tablet_writersis anstd::unordered_mapprotected by_tablet_writers_lockwhen writers are inserted, but the tablet load rowset info lookup read the map without holding the lock.A concurrent
emplacemay rehash_tablet_writers, so the unlocked lookup can race with bucket reallocation. This patch protects the lookup with_tablet_writers_lockand avoids using unordered_map iterators after the lock is released. The actual writer operations still run outside_tablet_writers_lock, so the lock remains scoped to the map access.Release note
None
Check List (For Author)
Test
./run-be-ut.sh --run --filter=LoadStreamMapPoolTest.*./build-support/check-format.shgit diff --check./build-support/run-clang-tidy.sh --build-dir be/build_Debug; local analysis failed before checking the code because clang-tidy reportedmalformed or corrupted AST file: 'malformed block record in AST file'from the local compile database/PCH.Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)