DRIVER-153: negotiate and implement SCYLLA_USE_METADATA_ID extension by nikagra · Pull Request #770 · scylladb/python-driver

nikagra · 2026-03-26T19:06:42Z

Summary

Implements the SCYLLA_USE_METADATA_ID Scylla CQL protocol extension (DRIVER-153), which backports the prepared-statement metadata-ID mechanism from CQL v5 to earlier protocol versions.

When the extension is negotiated:

The server includes a result metadata hash in the PREPARE response
The driver sends that hash back with every EXECUTE request, allowing the server to skip sending full result metadata on every response (skip_meta=True)
If the result schema has changed, the server sets the METADATA_CHANGED flag and includes the new metadata ID + new column metadata in the response — the driver picks this up and updates its cached metadata automatically
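
The three steps above can be sketched as a toy round trip. This is only an illustration of the idea, not the driver's or Scylla's code: `ToyServer`, `ToyPrepared`, and the SHA-256-based `metadata_id` helper are hypothetical names, and the real hash the server computes is not described in this PR.

```python
import hashlib
from dataclasses import dataclass


def metadata_id(columns):
    """Toy stand-in for the server's result-metadata hash (assumption)."""
    return hashlib.sha256(repr(columns).encode()).digest()[:16]


class ToyServer:
    def __init__(self, columns):
        self.columns = list(columns)

    def prepare(self):
        # PREPARE response: column metadata plus its hash.
        return list(self.columns), metadata_id(self.columns)

    def execute(self, client_metadata_id):
        current = metadata_id(self.columns)
        if client_metadata_id == current:
            # Ids match: the response can omit result metadata.
            return {"metadata_changed": False, "columns": None, "metadata_id": None}
        # Schema changed: report the new id together with the new columns.
        return {"metadata_changed": True, "columns": list(self.columns),
                "metadata_id": current}


@dataclass
class ToyPrepared:
    columns: list
    result_metadata_id: bytes

    def apply(self, response):
        # Refresh the cache only when the server signals a change.
        if response["metadata_changed"]:
            self.columns = response["columns"]
            self.result_metadata_id = response["metadata_id"]


server = ToyServer([("id", "int")])
prepared = ToyPrepared(*server.prepare())

first = server.execute(prepared.result_metadata_id)   # ids match
server.columns.append(("name", "text"))               # schema change
second = server.execute(prepared.result_metadata_id)  # stale id detected
prepared.apply(second)
third = server.execute(prepared.result_metadata_id)   # ids match again
```

After the schema change, one EXECUTE carries the new metadata, and later executions go back to skipping it without re-preparing the statement.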

Changes

cassandra/protocol_features.py

Add USE_METADATA_ID = "SCYLLA_USE_METADATA_ID" constant and use_metadata_id field to ProtocolFeatures
Parse the extension from the SUPPORTED frame; include it in STARTUP when present
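
A minimal sketch of the negotiation step, assuming the extension is advertised as a key in the SUPPORTED options and echoed back in STARTUP with an empty value. `ToyProtocolFeatures` is a hypothetical stand-in for illustration and is not the driver's `ProtocolFeatures` class.

```python
from dataclasses import dataclass

USE_METADATA_ID = "SCYLLA_USE_METADATA_ID"


@dataclass
class ToyProtocolFeatures:
    use_metadata_id: bool = False

    @classmethod
    def parse_from_supported(cls, supported):
        # `supported` models the SUPPORTED multimap: option name -> list of values.
        return cls(use_metadata_id=USE_METADATA_ID in supported)

    def add_startup_options(self, options):
        # Echo the extension back only if the server advertised it.
        if self.use_metadata_id:
            options[USE_METADATA_ID] = ""  # assumption: the value is empty


features = ToyProtocolFeatures.parse_from_supported(
    {"CQL_VERSION": ["3.0.0"], USE_METADATA_ID: []}
)
startup = {"CQL_VERSION": "3.0.0"}
features.add_startup_options(startup)

# A server that does not advertise the extension leaves STARTUP untouched.
legacy = ToyProtocolFeatures.parse_from_supported({"CQL_VERSION": ["3.0.0"]})
legacy_startup = {}
legacy.add_startup_options(legacy_startup)
```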

cassandra/protocol.py

Bug fix: _write_query_params now actually writes _SKIP_METADATA_FLAG on the wire — it was stored on _QueryMessage but never sent (effectively dead code)
recv_results_prepared: read result_metadata_id for Scylla extension (pre-v5) in addition to standard CQL v5+
ExecuteMessage.send_body: send result_metadata_id for Scylla extension (pre-v5) when set
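
To illustrate the _SKIP_METADATA_FLAG fix, here is a small sketch of how the flag ends up in the query-parameters prefix on protocol v4, where the layout is a two-byte consistency followed by a one-byte flags field. The helper names are hypothetical; the bit values follow the CQL native protocol specification.

```python
import struct

_VALUES_FLAG = 0x01
_SKIP_METADATA_FLAG = 0x02  # "Skip_metadata" bit in the CQL query flags


def encode_query_flags(has_values, skip_meta):
    flags = 0
    if has_values:
        flags |= _VALUES_FLAG
    if skip_meta:
        # The bug described above: this bit was never OR-ed into the flags.
        flags |= _SKIP_METADATA_FLAG
    return flags


def encode_query_params_prefix(consistency, flags):
    # Protocol v4 layout: [short consistency][byte flags]; v5 widens flags to an int.
    return struct.pack(">HB", consistency, flags)


flags = encode_query_flags(has_values=False, skip_meta=True)
prefix = encode_query_params_prefix(consistency=1, flags=flags)
```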

cassandra/cluster.py

skip_meta is now True only when safe: CQL v5 is used or SCYLLA_USE_METADATA_ID was negotiated (proxied by a non-None result_metadata_id on the prepared statement). Otherwise False — always fetch full metadata (safest option).
_set_result: when the EXECUTE response contains a new result_metadata_id (METADATA_CHANGED), update prepared_statement.result_metadata and result_metadata_id to keep the cached metadata in sync
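
The _set_result behaviour can be sketched roughly as follows. `refresh_cached_metadata` is a hypothetical helper, the objects are plain namespaces rather than driver classes, and the order of the two assignments is a guess, since this page does not show which order the PR chose.

```python
import logging
from types import SimpleNamespace

log = logging.getLogger(__name__)


def refresh_cached_metadata(prepared_statement, response):
    """Update the cached metadata if the response carries a new metadata id."""
    # getattr tolerates response objects that lack the attribute entirely.
    new_id = getattr(response, "result_metadata_id", None)
    if new_id is None:
        return False  # nothing changed
    column_metadata = getattr(response, "column_metadata", None)
    if not column_metadata:
        # A new id without columns would leave the cache unusable; keep the old one.
        log.warning("new result_metadata_id received without column metadata")
        return False
    # Assumed order: metadata first, then the id that describes it.
    prepared_statement.result_metadata = column_metadata
    prepared_statement.result_metadata_id = new_id
    return True


stmt = SimpleNamespace(result_metadata=[("id", "int")], result_metadata_id=b"old")
changed = SimpleNamespace(
    result_metadata_id=b"new",
    column_metadata=[("id", "int"), ("name", "text")],
)
updated = refresh_cached_metadata(stmt, changed)
unchanged = refresh_cached_metadata(stmt, SimpleNamespace())
```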

Test plan

Unit tests written first (TDD) — 7 new tests covering feature negotiation, STARTUP options, skip_meta flag encoding, Scylla metadata_id in ExecuteMessage (v4), and PREPARE response decoding with/without extension
Full unit test suite passes (627 passed, 97 skipped)
Integration tests against a Scylla node with the extension: verify that schema changes after PREPARE are detected and metadata is updated without re-preparation

Copilot

Pull request overview

Implements negotiation and support for Scylla’s SCYLLA_USE_METADATA_ID protocol extension to enable metadata-id based skip_meta behavior (backporting CQL v5 prepared-statement metadata-id semantics to earlier protocol versions).

Changes:

Adds SCYLLA_USE_METADATA_ID parsing from SUPPORTED and includes it in STARTUP when negotiated.
Extends protocol encode/decode to read/write result_metadata_id for PREPARE/EXECUTE on pre-v5 when the extension is used, and fixes on-wire encoding of _SKIP_METADATA_FLAG.
Updates execution/result handling to conditionally use skip_meta and to refresh cached prepared metadata when the server reports metadata changes.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.


| File | Description |
| --- | --- |
| `cassandra/protocol_features.py` | Adds the `SCYLLA_USE_METADATA_ID` feature flag and includes it in negotiated STARTUP options. |
| `cassandra/protocol.py` | Writes `_SKIP_METADATA_FLAG` in query params; adds pre-v5 extension handling for `result_metadata_id` in PREPARE/EXECUTE. |
| `cassandra/cluster.py` | Adjusts when `skip_meta` is enabled and updates cached prepared metadata/id on `METADATA_CHANGED` responses. |
| `tests/unit/test_protocol_features.py` | Adds unit tests for feature parsing and STARTUP option inclusion. |
| `tests/unit/test_protocol.py` | Adds unit tests for skip-meta flag encoding and metadata-id handling in pre-v5 PREPARE/EXECUTE paths. |


Copilot · 2026-03-26T19:12:24Z

+        elif self.result_metadata_id is not None:
+            write_string(f, self.result_metadata_id)


ExecuteMessage.send_body() now writes result_metadata_id for protocol versions that don’t use standard prepared-metadata (pre-v5) whenever self.result_metadata_id is non-None. For pre-v5 this extra field is only valid when SCYLLA_USE_METADATA_ID was negotiated on that connection; otherwise it changes the wire layout (the server will interpret the metadata id bytes as the query parameters) and the request will fail.

To make this safe, ensure the decision to include result_metadata_id is gated by the negotiated feature (e.g., add an explicit use_metadata_id/send_result_metadata_id flag on the message that the caller sets based on connection.features.use_metadata_id, or guarantee centrally that result_metadata_id is cleared unless the extension is active for that connection).

Suggested change

elif self.result_metadata_id is not None:

write_string(f, self.result_metadata_id)

nikagra · 2026-03-27

Agreed. The elif self.result_metadata_id is not None path in send_body is now reached only when the caller explicitly sets the field, which happens only in _query() after confirming connection.features.use_metadata_id (or CQL v5). For any connection that didn't negotiate the extension, result_metadata_id remains None and the branch is never taken, so the wire layout is unaffected.

Lorak-mmk · 2026-04-16

There is still a problem here. If the result metadata id feature is negotiated on the connection, then you need to ALWAYS send some result metadata id in EXECUTE. Skipping the write_string will result in a protocol error.
Your use_metadata_id may be False even if the extension was negotiated, if the server decided to skip the metadata in the PREPARED response. In that case you'll skip writing the id here and hit a protocol error.

Even if you fix this specific case, there is still the possibility of a mixed cluster in which only some nodes support the extension. In that case result_metadata_id will be None, and if you send to a node that has the extension negotiated, you'll again not send the id and hit a protocol error.

To sum up: the serialization here should check whether the feature is negotiated, and base sending this field only on that.
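
A rough sketch of serialization gated on the negotiated feature rather than on the value of the field. The function and parameter names are hypothetical, and writing an empty id when none is cached is an assumption made for illustration; the thread does not say what placeholder the server expects.

```python
import io
import struct


def write_short_bytes(f, value):
    # [short length][bytes], the layout used for prepared statement ids.
    f.write(struct.pack(">H", len(value)))
    f.write(value)


def encode_execute_ids(query_id, result_metadata_id,
                       uses_prepared_metadata, extension_negotiated):
    f = io.BytesIO()
    write_short_bytes(f, query_id)
    if uses_prepared_metadata or extension_negotiated:
        # The field is part of the wire layout on this connection,
        # so something is always written (empty placeholder is an assumption).
        write_short_bytes(f, result_metadata_id or b"")
    return f.getvalue()


# Extension negotiated, no cached id: the field is still present.
with_ext = encode_execute_ids(b"\x07", None, False, True)
# Extension not negotiated: the field is omitted even if an id is cached.
without_ext = encode_execute_ids(b"\x07", b"\xaa\xbb", False, False)
```

With this shape the decision depends only on the connection, so a statement prepared on a node without the extension can still be executed on a node that negotiated it.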

Copilot

Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.


mykaul · 2026-03-29T16:42:19Z

I'm not sure where, but we should document this - with reference mainly to the scylladb docs about this feature.

nikagra · 2026-03-30T17:00:35Z

@mykaul The documentation I'm aware of is the MetadataId extension in CQLv4 Requirement Document.

dkropachev

One blocking correctness issue below: skip_meta is being enabled for prepared statements that can still have empty/absent cached result metadata.

dkropachev · 2026-04-14T14:15:05Z

+                has_result_metadata_id = self.prepared_statement.result_metadata_id is not None
+                use_metadata_id = has_result_metadata_id and (
+                    ProtocolVersion.uses_prepared_metadata(connection.protocol_version)
+                    or connection.features.use_metadata_id
+                )
+                message.skip_meta = use_metadata_id
+                message.result_metadata_id = self.prepared_statement.result_metadata_id if use_metadata_id else None


High: this gate enables skip_meta for any prepared statement with a non-None result_metadata_id, but some prepared statements still legitimately have no cached result metadata. The repo already has that case in tests/integration/standard/test_prepared_statements.py (_test_updated_conditional asserts prepared_statement.result_metadata is None while result_metadata_id stays set for prepared conditional/LWT statements).

With SCYLLA_USE_METADATA_ID negotiated, this branch will set skip_meta=True and send that metadata id anyway. On Scylla, if the request/response metadata ids match, the server keeps NO_METADATA on the EXECUTE response instead of forcing metadata back, so the driver reaches recv_results_rows() with neither response metadata nor cached metadata to decode against. That turns into a real decode failure, not just a missed optimization.

I think this needs one more safety condition: only enable skip_meta when the prepared statement has usable cached result metadata, and keep it disabled for statements prepared with NO_METADATA / empty result metadata.

nikagra · 2026-04-15

Fixed. Added has_result_metadata = bool(self.prepared_statement.result_metadata) as an additional condition in the use_metadata_id gate, so skip_meta is now enabled only when the prepared statement has both a result_metadata_id and usable cached result metadata. LWT/conditional statements (INSERT ... IF NOT EXISTS etc.) have result_metadata_id set but result_metadata = None (the PREPARE response carries NO_METADATA for the result columns), so they correctly fall through to skip_meta=False and the server always sends full metadata.

On the test side: added test_query_no_skip_meta_when_result_metadata_is_none to directly cover this case, and corrected two existing _query tests (test_query_sets_skip_meta_with_scylla_extension, test_query_sets_skip_meta_for_protocol_v5) that were using an empty list [] for result_metadata. Because an empty list is falsy, those tests would now fail the new gate, and leaving them as they were would have hidden this regression going forward.
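
The resulting gate can be sketched as a standalone predicate. `should_skip_meta` and the namespace objects are hypothetical; only the three conditions come from the discussion above.

```python
from types import SimpleNamespace


def should_skip_meta(prepared_statement, uses_prepared_metadata, extension_negotiated):
    has_id = prepared_statement.result_metadata_id is not None
    # bool() treats both None and an empty list as "no usable cached metadata".
    has_result_metadata = bool(prepared_statement.result_metadata)
    return has_id and has_result_metadata and (
        uses_prepared_metadata or extension_negotiated
    )


select_stmt = SimpleNamespace(result_metadata_id=b"id", result_metadata=[("id", "int")])
lwt_stmt = SimpleNamespace(result_metadata_id=b"id", result_metadata=None)
empty_stmt = SimpleNamespace(result_metadata_id=b"id", result_metadata=[])

skip_select = should_skip_meta(select_stmt, False, True)
skip_lwt = should_skip_meta(lwt_stmt, False, True)
skip_empty = should_skip_meta(empty_stmt, False, True)
skip_unnegotiated = should_skip_meta(select_stmt, False, False)
```

Only the first case enables skip_meta; the LWT-style statement and the empty-metadata statement both keep full metadata in the response, which is why a test fixture using [] could not exercise the positive path.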

Scylla's SCYLLA_USE_METADATA_ID protocol extension (a backport of CQL v5 prepared-statement metadata IDs to earlier protocol versions) allows the server to skip sending full result metadata in every EXECUTE response. The server notifies the driver via the METADATA_CHANGED flag whenever the result schema changes, at which point the driver updates its cached metadata before deserialising the response.

Changes:

- protocol_features.py: parse SCYLLA_USE_METADATA_ID from SUPPORTED and include it in the STARTUP frame when negotiated
- protocol.py:
  * fix _write_query_params to actually write _SKIP_METADATA_FLAG on the wire (it was stored on the message but never sent, so it was dead code before)
  * recv_results_prepared: read result_metadata_id for Scylla extension (pre-v5) in addition to standard protocol v5+
  * ExecuteMessage.send_body: send result_metadata_id for Scylla extension (pre-v5) when it is set
- cluster.py:
  * ExecuteMessage is built with safe defaults (skip_meta=False, result_metadata_id=None); both are set in _query() after borrowing the connection, gated on connection.features.use_metadata_id and on the prepared statement actually having a result_metadata_id (so a statement prepared before the extension was available, or on a node that doesn't support it, never gets skip_meta=True with no id)
  * _set_result: update prepared_statement.result_metadata and result_metadata_id when the server signals METADATA_CHANGED in an EXECUTE response, keeping the driver's cached metadata in sync; uses getattr to safely handle FastResultMessage (Cython decoder)

…DATA_ID

- Add unit tests for the _METADATA_ID_FLAG path in recv_results_metadata (ROWS result with METADATA_CHANGED signal)
- Add unit tests for _set_result metadata cache update on METADATA_CHANGED: update both result_metadata and result_metadata_id, no-op when id absent, warning when id present but column_metadata empty
- Add unit tests for _query per-connection feature gating: skip_meta and result_metadata_id are set only when the connection negotiated SCYLLA_USE_METADATA_ID (or protocol v5) and the prepared statement carries a result_metadata_id
- Add defensive log.warning in _set_result when server sends a new result_metadata_id without column_metadata (protocol violation)
- Add write-order comment explaining thread-safety rationale for the two assignments to prepared_statement.result_metadata / result_metadata_id
- Add SCYLLA_USE_METADATA_ID section to docs/scylla-specific.rst

nikagra requested review from Copilot and dkropachev March 26, 2026 19:08

Copilot started reviewing on behalf of nikagra March 26, 2026 19:08

Copilot AI reviewed Mar 26, 2026

nikagra requested a review from Copilot March 27, 2026 12:03

Copilot started reviewing on behalf of nikagra March 27, 2026 12:03

Copilot AI reviewed Mar 27, 2026

Comment thread cassandra/cluster.py Outdated

nikagra force-pushed the driver-153-scylla-use-metadata-id branch from ade35d8 to f42e225 Compare March 27, 2026 12:32

andrzej-jackowski-scylladb mentioned this pull request Apr 2, 2026

Add support for SCYLLA_USE_METADATA_ID and skip metadata #457

Closed

8 tasks

nikagra requested a review from sylwiaszunejko April 9, 2026 11:12

nikagra marked this pull request as ready for review April 9, 2026 21:30

dkropachev requested changes Apr 14, 2026


nikagra force-pushed the driver-153-scylla-use-metadata-id branch from 6eea397 to a86fd53 Compare April 15, 2026 09:09

nikagra requested a review from dkropachev April 15, 2026 09:12

nikagra force-pushed the driver-153-scylla-use-metadata-id branch from 7ba5835 to a86fd53 Compare April 15, 2026 11:12

nikagra mentioned this pull request Apr 15, 2026

CI: fix id-token permission for "Test wheels building" #820

Merged

nikagra added 2 commits April 22, 2026 14:34

nikagra force-pushed the driver-153-scylla-use-metadata-id branch from a86fd53 to 8880f03 Compare April 22, 2026 12:34

nikagra wants to merge 2 commits into scylladb:master from nikagra:driver-153-scylla-use-metadata-id

nikagra commented Mar 26, 2026
Copilot AI left a comment
Copilot AI Mar 26, 2026
nikagra Mar 27, 2026
Lorak-mmk Apr 16, 2026
Copilot AI left a comment
mykaul commented Mar 29, 2026
nikagra commented Mar 30, 2026
dkropachev left a comment
dkropachev Apr 14, 2026
nikagra Apr 15, 2026

5 participants
