[test](jdbc) Add SQL Server bit/boolean predicate pushdown regression test#64756
Conversation
… test SQL Server `bit` maps to Doris BOOLEAN, so a predicate like `bit_value = '1'` is folded to a boolean literal during analysis. When pushed down it must be rendered as an integer (`1`/`0`) for SQL Server, never the `TRUE`/`FALSE` keyword, otherwise SQL Server reports "Invalid column name 'TRUE'". On master the connector pushdown path (PluginDrivenScanNode -> ExprToConnectorExpressionConverter -> JdbcQueryBuilder) already renders this correctly via JdbcQueryBuilder.formatBooleanLiteral (SQLSERVER/ORACLE/OCEANBASE_ORACLE/DB2 -> 1/0). This adds the missing end-to-end regression test in the SQL Server docker JDBC suite, as requested in the issue triage, to guard the full chain. Related apache#64464 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
|
run buildall |
|
PR approved by at least one committer and no changes requested. |
|
PR approved by anyone and no changes requested. |
|
/review |
There was a problem hiding this comment.
I found one regression-test coverage gap in the new SQL Server JDBC bit predicate test.
Critical checkpoint conclusions:
- Goal/test proof: the equality cases are covered by
explainassertions and execution, but the newINcase does not prove the dedicatedConnectorIn/JdbcQueryBuilder.inToSql()path. - Scope: the PR is otherwise small and focused, with one changed regression suite file.
- Concurrency, lifecycle, configuration, compatibility, transactions, persistence, observability: no production code or config changes in this PR, so no new risk found in those areas.
- Parallel paths: equality and
INpredicates use different rendering paths; the separateINpath needs direct end-to-end coverage. - Test result handling: no
.outupdate is needed for the addedexplain/plainsqlchecks, but theINexecution should be selective if it is meant to prove filtering behavior.
Subagent conclusions: optimizer-rewrite found no optimizer/rewrite, join, aggregate, or parallel-plan issue. tests-session-config proposed TSC-1, accepted as MAIN-1 and submitted inline. No duplicates were present because the PR had no existing inline review threads. Convergence round 2 ended with both live subagents replying NO_NEW_VALUABLE_FINDINGS for the same current ledger/comment set.
User focus: no additional user-provided review focus was supplied.
I did not run the external SQL Server regression suite in this runner.
| // "Invalid column name 'TRUE'" against SQL Server. | ||
| sql """ select * from test_binary where bit_value = '1' order by id; """ | ||
| sql """ select * from test_binary where bit_value = '0' order by id; """ | ||
| sql """ select * from test_binary where bit_value in ('1', '0') order by id; """ |
There was a problem hiding this comment.
This IN case does not currently prove that the SQL Server IN predicate is pushed down or rendered with SQL Server boolean literals. The two equality cases above assert the remote SQL, but this line only executes the query and bit_value in ('1', '0') matches all existing non-null rows, so it would still pass if the IN predicate were dropped instead of being rendered through JdbcQueryBuilder.inToSql(). Add an explain assertion for an actual IN form, for example checking the remote SQL contains an IN predicate with 1/0 literals, and make the execution check selective with order_qt so the test also fails if the filter is not applied.
…1/0 instead of TRUE/FALSE (#64757) ### What problem does this PR solve? Issue Number: #64464 Related PR: #64756 (master regression test) Problem Summary: When querying a SQL Server JDBC catalog, a predicate on a `bit` column such as `WHERE bit_value = '1'` is folded to a boolean literal during analysis. On this branch the JDBC pushdown path (`JdbcScanNode.conjunctExprToString()` -> `Expr.toExternalSql()`) renders `BoolLiteral` as the dialect-agnostic `TRUE`/`FALSE` keyword, so the pushed-down SQL becomes: ```sql ... WHERE (([bit_value] = TRUE)) ``` SQL Server has no boolean literal and treats `TRUE` as an identifier: ``` SQLServerException: Invalid column name 'TRUE'. ``` (2.1.10 did not have this issue; it pushed `= 1`.) This PR renders boolean literals as `1`/`0` for databases without a native boolean type (`SQLSERVER`, `ORACLE`, `OCEANBASE_ORACLE`), mirroring the existing dialect-specific date-literal handling in the same method. `MySQL` / `PostgreSQL` / `Trino` keep `TRUE`/`FALSE` (PostgreSQL requires the keyword for its strict boolean type). On master this is already handled by the refactored connector SPI (`JdbcQueryBuilder.formatBooleanLiteral()`); this PR is the equivalent fix for the `branch-4.1` `JdbcScanNode` path. ### Release note Fix SQL Server / Oracle JDBC catalog query failure ("Invalid column name 'TRUE'") when filtering a boolean/`bit` column. ### Check List (For Author) - Test - [x] Regression test - [x] Unit Test - [ ] Manual test (add detailed scripts or steps below) - [ ] No need to test or manual test. - Behavior changed: - [x] No. - [ ] Yes. - Does this need documentation? - [x] No. - [ ] Yes. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hecks selective Address review on apache#64464: the previous `bit_value in ('1', '0')` check matched every non-null row and asserted nothing, so it would pass even if the IN predicate were dropped. Add an explain assertion that the pushed remote SQL renders the IN with integer literals ([bit_value] IN (1, 0)), and replace the non-asserting executions with selective order_qt cases (`= '0'` -> id=1, `= '1'` -> id=2,3) that fail if the filter is not applied. Related apache#64464 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
run buildall |
|
/review |
There was a problem hiding this comment.
Reviewed the current head for PR #64756.
I did not find any new non-duplicative issue to raise. The current diff adds SQL Server bit predicate coverage with remote SQL explain assertions for = 1, = 0, and IN (1, 0), and the expected output includes deterministic order_qt checks for the equality and IN cases. The earlier concern about the IN execution check not proving pushdown is already covered by the existing inline thread and appears addressed by the new explain assertion in this head.
Critical checkpoint conclusions:
- Goal/test proof: The PR is test-only and now covers the SQL Server boolean literal rendering regression path it targets.
- Scope: The change is small and limited to the SQL Server JDBC regression suite and its expected output.
- Concurrency/lifecycle/config/compatibility/persistence/write paths: Not applicable; no production code or configuration is changed.
- Parallel paths: The relevant JDBC explain and scan paths use
JdbcQueryBuilder; no separate join/aggregate rendering path was changed by this PR. - Test coverage/results: Added
order_qtoutput sections match the suite queries and the SQL Server fixture state. - Observability/performance: Not applicable for this test-only change.
User focus: No additional user-provided review focus.
Subagent conclusions: optimizer-rewrite and tests-session-config both reported no new candidates in the initial review pass. No candidate became an inline comment; the existing IN coverage concern was treated as a duplicate of the existing thread and verified against the current diff. Convergence round 2 ended with both live subagents returning NO_NEW_VALUABLE_FINDINGS for the same latest ledger/comment set.
There was a problem hiding this comment.
Reviewed the current head for PR #64756.
I did not find any new non-duplicative issue to raise. The current diff adds SQL Server bit predicate coverage with remote SQL explain assertions for = 1, = 0, and IN (1, 0), and the expected output includes deterministic order_qt checks for the equality and IN cases. The earlier concern about the IN execution check not proving pushdown is already covered by the existing inline thread and appears addressed by the new explain assertion in this head.
Critical checkpoint conclusions:
- Goal/test proof: The PR is test-only and now covers the SQL Server boolean literal rendering regression path it targets.
- Scope: The change is small and limited to the SQL Server JDBC regression suite and its expected output.
- Concurrency/lifecycle/config/compatibility/persistence/write paths: Not applicable; no production code or configuration is changed.
- Parallel paths: The relevant JDBC explain and scan paths use
JdbcQueryBuilder; no separate join/aggregate rendering path was changed by this PR. - Test coverage/results: Added
order_qtoutput sections match the suite queries and the SQL Server fixture state. - Observability/performance: Not applicable for this test-only change.
User focus: No additional user-provided review focus.
Subagent conclusions: optimizer-rewrite and tests-session-config both reported no new candidates in the initial review pass. No candidate became an inline comment; the existing IN coverage concern was treated as a duplicate of the existing thread and verified against the current diff. The final convergence round recorded in the shared ledger ended with both live subagents returning NO_NEW_VALUABLE_FINDINGS for the same latest ledger/comment set.
…of TRUE/FALSE (#64760) ### What problem does this PR solve? Issue Number: #64464 Related PR: #64757 (branch-4.1), #64756 (master regression test) Problem Summary: When querying a SQL Server JDBC catalog, a predicate on a `bit` column such as `WHERE bit_value = '1'` is folded to a boolean literal during analysis. On this branch the JDBC pushdown path (`JdbcScanNode.conjunctExprToString()` -> `Expr.toExternalSql()`) renders `BoolLiteral` as the dialect-agnostic `TRUE`/`FALSE` keyword, so the pushed-down SQL becomes: ```sql ... WHERE (([bit_value] = TRUE)) ``` SQL Server has no boolean literal and treats `TRUE` as an identifier: ``` SQLServerException: Invalid column name 'TRUE'. ``` (2.1.10 did not have this issue; it pushed `= 1`.) This PR renders boolean literals as `1`/`0` for databases without a native boolean type (`SQLSERVER`, `ORACLE`, `OCEANBASE_ORACLE`), mirroring the existing dialect-specific date-literal handling in the same method. `MySQL` / `PostgreSQL` / `Trino` keep `TRUE`/`FALSE` (PostgreSQL requires the keyword for its strict boolean type). This is the `branch-4.0` backport of the same fix (`branch-4.1`: #64757). On master the JDBC pushdown was refactored to the connector SPI and already renders booleans per dialect via `JdbcQueryBuilder.formatBooleanLiteral()`. ### Release note Fix SQL Server / Oracle JDBC catalog query failure ("Invalid column name 'TRUE'") when filtering a boolean/`bit` column.
What problem does this PR solve?
Issue Number: #64464
Related PR: N/A
Problem Summary:
When querying a SQL Server JDBC catalog, a predicate on a
bitcolumn such asWHERE bit_value = '1'is folded to a boolean literal during analysis. The pushed-down predicate must be rendered as an integer (= 1/= 0) for SQL Server, never theTRUE/FALSEkeyword: SQL Server has no boolean literal and reportsSQLServerException: Invalid column name 'TRUE'(see #64464).On current master this is already handled correctly. The JDBC pushdown path was refactored to the connector SPI (
PluginDrivenScanNode->ExprToConnectorExpressionConverter->JdbcQueryBuilder), andJdbcQueryBuilder.formatBooleanLiteral()renders booleans per dialect (SQLSERVER/ORACLE/OCEANBASE_ORACLE/DB2->1/0, others ->TRUE/FALSE).JdbcQueryBuilderTestalready unit-tests this.What was missing is the end-to-end regression test that the issue triage explicitly asked for. This PR adds it to the SQL Server docker JDBC suite (
test_sqlserver_jdbc_catalog.groovy), coveringbit_value = '1',bit_value = '0'andbit_value in ('1', '0'): it asserts viaexplainthat the pushed remote SQL renders[bit_value] = 1/[bit_value] = 0, and executes the queries end-to-end (which throwInvalid column name 'TRUE'on the buggy path).Note:
branch-4.0still uses the oldJdbcScanNode/ExprToSqlVisitorpath, which renders the dialect-agnosticTRUE/FALSEand is what triggers the bug reported in #64464. That branch needs a separate code fix; this regression test alone would fail there and is not sufficient on its own.Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
🤖 Generated with Claude Code