[fix](fe) Bound length in MysqlProto.readLenEncodedString#63604
Conversation
### What problem does this PR solve? Problem Summary: MysqlProto.readLenEncodedString reads a length-encoded integer from the MySQL packet and passes it directly to new byte[(int) length] with no upper bound. The length is attacker-controlled (a 0xFE lead byte carries a full 8-byte value), and it is read before authentication from MysqlAuthPacket.readFrom (the auth-response field and the connection-attributes loop). A small handshake response can therefore request a ~2 GiB allocation, and a length with the high bit set casts to a negative size. Reject a length that is negative or larger than the bytes remaining in the buffer before allocating. A well-formed length-encoded string's payload always fits in the remaining buffer, so valid input is unaffected. ### Release note None ### Check List (For Author) - Test: Unit Test (MysqlProtoLenEncStringTest) - Behavior changed: No - Does this need documentation: No
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
/review |
|
run buildall |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
There was a problem hiding this comment.
No blocking issues found. The change is small and focused: readLenEncodedString now rejects negative or packet-overrunning length-encoded string sizes before allocation, and the added unit test covers oversized, negative-cast, and valid payload cases.
Critical checkpoint conclusions:
- Goal and proof: The PR addresses attacker-controlled length-encoded string allocation before authentication; the implementation meets the goal, with unit coverage for the key cases.
- Scope: The modification is minimal and localized to the shared parser used by both auth response and connection attributes.
- Concurrency and lifecycle: No new shared state, locking, threads, or lifecycle-sensitive objects are introduced.
- Configuration and compatibility: No new config, storage format, function symbol, or FE/BE protocol compatibility issue found. Invalid malformed packets now fail earlier, which is consistent with parser validation.
- Parallel paths: The common helper is used by the known length-encoded string readers, so both call paths are covered.
- Error handling: The new guard fails explicitly with
IllegalArgumentException; callers that already catch runtime parsing failures continue to do so, and handshake parsing already treats malformed packets as failed negotiation. - Tests: Added FE unit coverage is appropriate. I attempted
mvn -pl fe-core -Dtest=MysqlProtoLenEncStringTest test -Dskip.doc=true, but the local runner could not resolveorg.apache.doris:fe-foundation:1.2-SNAPSHOT, so the test did not execute here. - Observability, transaction/persistence, data writes, memory tracking: Not applicable to this parser-only FE change.
- Performance: The guard is constant-time and avoids pathological allocation; no additional performance issue found.
User focus: No additional user-provided review focus was present.
|
run buildall |
FE UT Coverage ReportIncrement line coverage |
1 similar comment
FE UT Coverage ReportIncrement line coverage |
FE Regression Coverage ReportIncrement line coverage |
TPC-H: Total hot run time: 29035 ms |
TPC-DS: Total hot run time: 171138 ms |
TPC-H: Total hot run time: 28919 ms |
TPC-DS: Total hot run time: 171439 ms |
|
/review |
There was a problem hiding this comment.
Review opinion: no blocking issues found.
Critical checkpoint conclusions:
- Goal and proof: The PR prevents attacker-controlled MySQL length-encoded string lengths from being cast directly to array sizes when the declared payload length exceeds the packet bytes remaining. The implementation accomplishes this for the shared
MysqlProto.readLenEncodedStringhelper, and adds focused unit tests for oversized, negative-cast, and valid payload cases. - Scope/focus: The modification is small and localized to the parser helper plus tests.
- Concurrency/lifecycle: No new shared state, locks, threads, or lifecycle-sensitive objects are introduced.
- Configuration/compatibility: No configuration, storage format, thrift/protocol symbol, or rolling-upgrade compatibility concern found. Valid MySQL length-encoded string payloads remain accepted; malformed packets now fail earlier.
- Parallel paths: The shared helper is used by the auth response, connect attributes, and OIDC token extraction paths, so the bound is applied consistently to current FE callers.
- Error handling: The new
IllegalArgumentExceptionis an unchecked parse failure. In the handshake path it is contained byAcceptListener.handleConnectioncleanup; OIDC extraction already catchesRuntimeExceptionand falls back to the raw auth response. - Tests: Added unit coverage targets the allocation-risk cases and a valid case. I attempted
./run-fe-ut.sh --run org.apache.doris.mysql.MysqlProtoLenEncStringTest, but the run failed during generated-code setup becausegensrc/proto/Makefilereportedthirdparty/installed/bin/protoc: No such file or directory, so the test did not execute in this runner. - Observability/performance: No new observability need. The added bound is constant-time and avoids pathological allocations.
- Transaction/persistence/data writes: Not applicable.
- User focus: No additional user-provided review focus was specified.
Residual risk: The broader handshake parser still relies on outer connection cleanup for malformed packet runtime exceptions, and this PR does not add end-to-end negotiation tests. I did not find that to be a blocker for this narrowly scoped fix.
FE Regression Coverage ReportIncrement line coverage |
|
/review |
There was a problem hiding this comment.
Review result: no blocking issues found.
Critical checkpoint conclusions:
- Goal and proof: The PR adds a remaining-bytes bound for MySQL length-encoded string payloads, preventing attacker-controlled length values from being cast into oversized or negative byte-array allocations. The new unit test covers oversized 0xFE/8-byte lengths, negative int-cast lengths, and a valid payload.
- Scope: The change is small and focused in
MysqlProto.readLenEncodedString, with a targeted test class. - Concurrency and lifecycle: No shared mutable state, locking, thread lifecycle, or static initialization behavior is introduced.
- Configuration and compatibility: No config items, storage formats, function symbols, or FE/BE protocol fields are added. Valid MySQL length-encoded strings remain accepted; malformed strings now fail before allocation.
- Parallel paths: Existing callers of
readLenEncodedStringuse the same helper, so the bound applies consistently to auth response, connection attributes, and OIDC token extraction parsing. - Error handling: The new malformed-packet failure is explicit. Existing callers already operate on packet parsing helpers that may throw runtime exceptions on malformed buffers; the OIDC extractor catches runtime exceptions and falls back as before.
- Tests: Targeted tests are present. I attempted
mvn -pl fe-core -Dtest=MysqlProtoLenEncStringTest test -Dskip.doc=true, which failed becausefe-foundationSNAPSHOT was not resolved when runningfe-corealone. I retried withmvn -pl fe-core -am -Dtest=MysqlProtoLenEncStringTest test -Dskip.doc=true -DfailIfNoTests=false, but the runner lacksthirdparty/installed/bin/thrift, so the reactor stopped infe-thriftbefore reachingfe-core. - Observability: No additional logs or metrics are needed for this local packet validation helper.
- Transaction/persistence/data writes: Not applicable.
- Performance: The added check is O(1), before allocation, and avoids pathological allocation behavior.
User focus points: No additional user-provided review focus was present.
### What problem does this PR solve? Issue Number: close #63603 Problem Summary: `MysqlProto.readLenEncodedString` reads a length-encoded integer and passes it straight to `new byte[(int) length]` with no bound. The length is fully attacker-controlled (a `0xFE` lead byte carries an 8-byte value), and it is read before authentication from `MysqlAuthPacket.readFrom` (the auth-response field at `MysqlAuthPacket.java:93` and the connection-attributes loop at `MysqlAuthPacket.java:110-118`). A small handshake response can therefore request a ~2 GiB allocation, and a length with the high bit set casts to a negative size (`NegativeArraySizeException`). This PR rejects a length that is negative or larger than the bytes remaining in the buffer before allocating. A well-formed length-encoded string's payload always fits in the remaining buffer, so valid input is unaffected. One guard covers both reach paths.
### What problem does this PR solve? Issue Number: close #63603 Problem Summary: `MysqlProto.readLenEncodedString` reads a length-encoded integer and passes it straight to `new byte[(int) length]` with no bound. The length is fully attacker-controlled (a `0xFE` lead byte carries an 8-byte value), and it is read before authentication from `MysqlAuthPacket.readFrom` (the auth-response field at `MysqlAuthPacket.java:93` and the connection-attributes loop at `MysqlAuthPacket.java:110-118`). A small handshake response can therefore request a ~2 GiB allocation, and a length with the high bit set casts to a negative size (`NegativeArraySizeException`). This PR rejects a length that is negative or larger than the bytes remaining in the buffer before allocating. A well-formed length-encoded string's payload always fits in the remaining buffer, so valid input is unaffected. One guard covers both reach paths.
What problem does this PR solve?
Issue Number: close #63603
Problem Summary:
MysqlProto.readLenEncodedStringreads a length-encoded integer and passes it straight tonew byte[(int) length]with no bound. The length is fully attacker-controlled (a0xFElead byte carries an 8-byte value), and it is read before authentication fromMysqlAuthPacket.readFrom(the auth-response field atMysqlAuthPacket.java:93and the connection-attributes loop atMysqlAuthPacket.java:110-118). A small handshake response can therefore requesta ~2 GiB allocation, and a length with the high bit set casts to a negative size (
NegativeArraySizeException).This PR rejects a length that is negative or larger than the bytes remaining in the buffer before allocating. A well-formed length-encoded string's payload always fits in the remaining buffer, so valid input is unaffected. One guard covers both reach paths.
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
The Unit Test is
MysqlProtoLenEncStringTest(added in this PR): oversized and negative-cast lengths are rejected withIllegalArgumentException; a normal short length-encoded string still parses.Check List (For Reviewer who merge this PR)