Skip to content

[test](jdbc) Add SQL Server bit/boolean predicate pushdown regression test#64756

Merged
morningman merged 2 commits into
apache:masterfrom
morningman:regression-sqlserver-bit-bool-pushdown-64464
Jun 24, 2026
Merged

[test](jdbc) Add SQL Server bit/boolean predicate pushdown regression test#64756
morningman merged 2 commits into
apache:masterfrom
morningman:regression-sqlserver-bit-bool-pushdown-64464

Conversation

@morningman

Copy link
Copy Markdown
Contributor

What problem does this PR solve?

Issue Number: #64464

Related PR: N/A

Problem Summary:

When querying a SQL Server JDBC catalog, a predicate on a bit column such as WHERE bit_value = '1' is folded to a boolean literal during analysis. The pushed-down predicate must be rendered as an integer (= 1 / = 0) for SQL Server, never the TRUE / FALSE keyword: SQL Server has no boolean literal and reports SQLServerException: Invalid column name 'TRUE' (see #64464).

On current master this is already handled correctly. The JDBC pushdown path was refactored to the connector SPI (PluginDrivenScanNode -> ExprToConnectorExpressionConverter -> JdbcQueryBuilder), and JdbcQueryBuilder.formatBooleanLiteral() renders booleans per dialect (SQLSERVER / ORACLE / OCEANBASE_ORACLE / DB2 -> 1/0, others -> TRUE/FALSE). JdbcQueryBuilderTest already unit-tests this.

What was missing is the end-to-end regression test that the issue triage explicitly asked for. This PR adds it to the SQL Server docker JDBC suite (test_sqlserver_jdbc_catalog.groovy), covering bit_value = '1', bit_value = '0' and bit_value in ('1', '0'): it asserts via explain that the pushed remote SQL renders [bit_value] = 1 / [bit_value] = 0, and executes the queries end-to-end (which throw Invalid column name 'TRUE' on the buggy path).

Note: branch-4.0 still uses the old JdbcScanNode / ExprToSqlVisitor path, which renders the dialect-agnostic TRUE/FALSE and is what triggers the bug reported in #64464. That branch needs a separate code fix; this regression test alone would fail there and is not sufficient on its own.

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test.
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

🤖 Generated with Claude Code

… test

SQL Server `bit` maps to Doris BOOLEAN, so a predicate like
`bit_value = '1'` is folded to a boolean literal during analysis. When
pushed down it must be rendered as an integer (`1`/`0`) for SQL Server,
never the `TRUE`/`FALSE` keyword, otherwise SQL Server reports
"Invalid column name 'TRUE'".

On master the connector pushdown path (PluginDrivenScanNode ->
ExprToConnectorExpressionConverter -> JdbcQueryBuilder) already renders
this correctly via JdbcQueryBuilder.formatBooleanLiteral
(SQLSERVER/ORACLE/OCEANBASE_ORACLE/DB2 -> 1/0). This adds the missing
end-to-end regression test in the SQL Server docker JDBC suite, as
requested in the issue triage, to guard the full chain.

Related apache#64464

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@hello-stephen

Copy link
Copy Markdown
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@morningman

Copy link
Copy Markdown
Contributor Author

run buildall

Gabriel39
Gabriel39 previously approved these changes Jun 24, 2026
@github-actions github-actions Bot added the approved Indicates a PR has been approved by one committer. label Jun 24, 2026
@github-actions

Copy link
Copy Markdown
Contributor

PR approved by at least one committer and no changes requested.

@github-actions

Copy link
Copy Markdown
Contributor

PR approved by anyone and no changes requested.

@morningman

Copy link
Copy Markdown
Contributor Author

/review

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found one regression-test coverage gap in the new SQL Server JDBC bit predicate test.

Critical checkpoint conclusions:

  • Goal/test proof: the equality cases are covered by explain assertions and execution, but the new IN case does not prove the dedicated ConnectorIn/JdbcQueryBuilder.inToSql() path.
  • Scope: the PR is otherwise small and focused, with one changed regression suite file.
  • Concurrency, lifecycle, configuration, compatibility, transactions, persistence, observability: no production code or config changes in this PR, so no new risk found in those areas.
  • Parallel paths: equality and IN predicates use different rendering paths; the separate IN path needs direct end-to-end coverage.
  • Test result handling: no .out update is needed for the added explain/plain sql checks, but the IN execution should be selective if it is meant to prove filtering behavior.

Subagent conclusions: optimizer-rewrite found no optimizer/rewrite, join, aggregate, or parallel-plan issue. tests-session-config proposed TSC-1, accepted as MAIN-1 and submitted inline. No duplicates were present because the PR had no existing inline review threads. Convergence round 2 ended with both live subagents replying NO_NEW_VALUABLE_FINDINGS for the same current ledger/comment set.

User focus: no additional user-provided review focus was supplied.

I did not run the external SQL Server regression suite in this runner.

// "Invalid column name 'TRUE'" against SQL Server.
sql """ select * from test_binary where bit_value = '1' order by id; """
sql """ select * from test_binary where bit_value = '0' order by id; """
sql """ select * from test_binary where bit_value in ('1', '0') order by id; """

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This IN case does not currently prove that the SQL Server IN predicate is pushed down or rendered with SQL Server boolean literals. The two equality cases above assert the remote SQL, but this line only executes the query and bit_value in ('1', '0') matches all existing non-null rows, so it would still pass if the IN predicate were dropped instead of being rendered through JdbcQueryBuilder.inToSql(). Add an explain assertion for an actual IN form, for example checking the remote SQL contains an IN predicate with 1/0 literals, and make the execution check selective with order_qt so the test also fails if the filter is not applied.

yiguolei pushed a commit that referenced this pull request Jun 24, 2026
…1/0 instead of TRUE/FALSE (#64757)

### What problem does this PR solve?

Issue Number: #64464

Related PR: #64756 (master regression test)

Problem Summary:

When querying a SQL Server JDBC catalog, a predicate on a `bit` column
such as `WHERE bit_value = '1'` is folded to a boolean literal during
analysis. On this branch the JDBC pushdown path
(`JdbcScanNode.conjunctExprToString()` -> `Expr.toExternalSql()`)
renders `BoolLiteral` as the dialect-agnostic `TRUE`/`FALSE` keyword, so
the pushed-down SQL becomes:

```sql
... WHERE (([bit_value] = TRUE))
```

SQL Server has no boolean literal and treats `TRUE` as an identifier:

```
SQLServerException: Invalid column name 'TRUE'.
```

(2.1.10 did not have this issue; it pushed `= 1`.)

This PR renders boolean literals as `1`/`0` for databases without a
native boolean type (`SQLSERVER`, `ORACLE`, `OCEANBASE_ORACLE`),
mirroring the existing dialect-specific date-literal handling in the
same method. `MySQL` / `PostgreSQL` / `Trino` keep `TRUE`/`FALSE`
(PostgreSQL requires the keyword for its strict boolean type).

On master this is already handled by the refactored connector SPI
(`JdbcQueryBuilder.formatBooleanLiteral()`); this PR is the equivalent
fix for the `branch-4.1` `JdbcScanNode` path.

### Release note

Fix SQL Server / Oracle JDBC catalog query failure ("Invalid column name
'TRUE'") when filtering a boolean/`bit` column.

### Check List (For Author)

- Test
    - [x] Regression test
    - [x] Unit Test
    - [ ] Manual test (add detailed scripts or steps below)
    - [ ] No need to test or manual test.

- Behavior changed:
    - [x] No.
    - [ ] Yes.

- Does this need documentation?
    - [x] No.
    - [ ] Yes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…hecks selective

Address review on apache#64464: the previous `bit_value in ('1', '0')` check
matched every non-null row and asserted nothing, so it would pass even if
the IN predicate were dropped. Add an explain assertion that the pushed
remote SQL renders the IN with integer literals ([bit_value] IN (1, 0)),
and replace the non-asserting executions with selective order_qt cases
(`= '0'` -> id=1, `= '1'` -> id=2,3) that fail if the filter is not applied.

Related apache#64464

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot removed the approved Indicates a PR has been approved by one committer. label Jun 24, 2026
@morningman

Copy link
Copy Markdown
Contributor Author

run buildall

@morningman

Copy link
Copy Markdown
Contributor Author

/review

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the current head for PR #64756.

I did not find any new non-duplicative issue to raise. The current diff adds SQL Server bit predicate coverage with remote SQL explain assertions for = 1, = 0, and IN (1, 0), and the expected output includes deterministic order_qt checks for the equality and IN cases. The earlier concern about the IN execution check not proving pushdown is already covered by the existing inline thread and appears addressed by the new explain assertion in this head.

Critical checkpoint conclusions:

  • Goal/test proof: The PR is test-only and now covers the SQL Server boolean literal rendering regression path it targets.
  • Scope: The change is small and limited to the SQL Server JDBC regression suite and its expected output.
  • Concurrency/lifecycle/config/compatibility/persistence/write paths: Not applicable; no production code or configuration is changed.
  • Parallel paths: The relevant JDBC explain and scan paths use JdbcQueryBuilder; no separate join/aggregate rendering path was changed by this PR.
  • Test coverage/results: Added order_qt output sections match the suite queries and the SQL Server fixture state.
  • Observability/performance: Not applicable for this test-only change.

User focus: No additional user-provided review focus.

Subagent conclusions: optimizer-rewrite and tests-session-config both reported no new candidates in the initial review pass. No candidate became an inline comment; the existing IN coverage concern was treated as a duplicate of the existing thread and verified against the current diff. Convergence round 2 ended with both live subagents returning NO_NEW_VALUABLE_FINDINGS for the same latest ledger/comment set.

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed the current head for PR #64756.

I did not find any new non-duplicative issue to raise. The current diff adds SQL Server bit predicate coverage with remote SQL explain assertions for = 1, = 0, and IN (1, 0), and the expected output includes deterministic order_qt checks for the equality and IN cases. The earlier concern about the IN execution check not proving pushdown is already covered by the existing inline thread and appears addressed by the new explain assertion in this head.

Critical checkpoint conclusions:

  • Goal/test proof: The PR is test-only and now covers the SQL Server boolean literal rendering regression path it targets.
  • Scope: The change is small and limited to the SQL Server JDBC regression suite and its expected output.
  • Concurrency/lifecycle/config/compatibility/persistence/write paths: Not applicable; no production code or configuration is changed.
  • Parallel paths: The relevant JDBC explain and scan paths use JdbcQueryBuilder; no separate join/aggregate rendering path was changed by this PR.
  • Test coverage/results: Added order_qt output sections match the suite queries and the SQL Server fixture state.
  • Observability/performance: Not applicable for this test-only change.

User focus: No additional user-provided review focus.

Subagent conclusions: optimizer-rewrite and tests-session-config both reported no new candidates in the initial review pass. No candidate became an inline comment; the existing IN coverage concern was treated as a duplicate of the existing thread and verified against the current diff. The final convergence round recorded in the shared ledger ended with both live subagents returning NO_NEW_VALUABLE_FINDINGS for the same latest ledger/comment set.

morningman added a commit that referenced this pull request Jun 24, 2026
…of TRUE/FALSE (#64760)

### What problem does this PR solve?

Issue Number: #64464

Related PR: #64757 (branch-4.1), #64756 (master
regression test)

Problem Summary:

When querying a SQL Server JDBC catalog, a predicate on a `bit` column
such as `WHERE bit_value = '1'` is folded to a boolean literal during
analysis. On this branch the JDBC pushdown path
(`JdbcScanNode.conjunctExprToString()` -> `Expr.toExternalSql()`)
renders `BoolLiteral` as the dialect-agnostic `TRUE`/`FALSE` keyword, so
the pushed-down SQL becomes:

```sql
... WHERE (([bit_value] = TRUE))
```

SQL Server has no boolean literal and treats `TRUE` as an identifier:

```
SQLServerException: Invalid column name 'TRUE'.
```

(2.1.10 did not have this issue; it pushed `= 1`.)

This PR renders boolean literals as `1`/`0` for databases without a
native boolean type (`SQLSERVER`, `ORACLE`, `OCEANBASE_ORACLE`),
mirroring the existing dialect-specific date-literal handling in the
same method. `MySQL` / `PostgreSQL` / `Trino` keep `TRUE`/`FALSE`
(PostgreSQL requires the keyword for its strict boolean type).

This is the `branch-4.0` backport of the same fix (`branch-4.1`:
#64757). On master the JDBC pushdown was refactored to the
connector SPI and already renders booleans per dialect via
`JdbcQueryBuilder.formatBooleanLiteral()`.

### Release note

Fix SQL Server / Oracle JDBC catalog query failure ("Invalid column name
'TRUE'") when filtering a boolean/`bit` column.
@morningman morningman merged commit 5355fbe into apache:master Jun 24, 2026
35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants