Remove num dim usages in xnnpack flatbuffer#19013

Merged
lucylq merged 1 commit into main from lfq.remove-num-dim-usages
Apr 22, 2026

Conversation

@lucylq
Contributor

@lucylq lucylq commented Apr 20, 2026

There's some redundancy here: num_dims is serialized into the flatbuffer even though the dims array already carries its own length. Use the size of the dims array directly; if we use num_dims, we have to check that it matches the dims array size.

Hopefully prevent some security issues down the line.

@pytorch-bot

pytorch-bot Bot commented Apr 20, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19013

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 9ee60b4 with merge base 2d53535:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 20, 2026
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@lucylq lucylq force-pushed the lfq.remove-num-dim-usages branch from 5e6601b to 9ee60b4 Compare April 21, 2026 20:58
tensor_value->num_dims(),
tensor_value->dims()->size());
}

Contributor Author


ah, this was recently added, but if we use tensor_value->dims()->size() directly then we also don't need the check on tensor_value->num_dims().

@lucylq lucylq marked this pull request as ready for review April 21, 2026 23:11
@lucylq lucylq requested a review from digantdesai as a code owner April 21, 2026 23:11
Copilot AI review requested due to automatic review settings April 21, 2026 23:11
Contributor

Copilot AI left a comment


Pull request overview

This PR updates the XNNPACK flatbuffer runtime deserialization path to stop relying on the serialized num_dims field and instead derive tensor rank directly from the dims vector length, reducing redundancy and avoiding mismatches between num_dims and dims.

Changes:

  • Removes num_dims/dims consistency checks in defineTensor and builds rank from dims()->size().
  • Passes dims_data.size() to XNNPACK xnn_define_* APIs instead of tensor_value->num_dims().
  • Updates debug logging format to print size_t correctly.
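On the logging point in the last bullet: size_t may be wider than int, so printing it with %i is undefined behavior, and %zu is the portable conversion specifier. A small self-contained sketch of the general idea (generic C++, not the ExecuTorch logging macros):

```cpp
#include <cstdio>
#include <string>
#include <vector>

// Format a tensor rank for a log message. size_t requires the %zu
// conversion specifier; %i would be undefined behavior on platforms
// where size_t is wider than int (e.g. LP64).
std::string format_rank(const std::vector<size_t>& dims_data) {
  char buf[32];
  std::snprintf(buf, sizeof(buf), "rank = %zu", dims_data.size());
  return buf;
}
```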


Comment on lines 584 to 588:

      /*datatype=*/datatype,
      /*zero_point=*/zero_point,
      /*scale=*/scale_data,
-     /*num_dims=*/tensor_value->num_dims(),
+     /*num_dims=*/dims_data.size(),
      /*channel_dim=*/qparams->channel_dim(),
Copilot AI Apr 21, 2026

PerChannelGroupQuant path still dereferences tensor_value->dims()->Get(0/1) earlier in this case block without any null / length validation. Since the earlier num_dims/dims consistency checks were removed, a malformed (or malicious) flatbuffer with dims == nullptr (or a shorter-than-expected dims vector) can now crash here instead of returning InvalidProgram. Please add an ET_CHECK_OR_RETURN_ERROR(dims_data.size() >= 2, InvalidProgram, ...) (and use dims_data[0]/dims_data[1]) before computing output_channels/input_channels so this remains robust against corrupted inputs.

@lucylq
Contributor Author

lucylq commented Apr 22, 2026

@claude review

@claude

claude Bot commented Apr 22, 2026

Claude finished @lucylq's task in 1m 4s.


Review of PR #19013: Remove num dim usages in xnnpack flatbuffer

  • Read the full diff and changed file
  • Analyze the removed checks and new logic
  • Check for potential issues (null safety, bounds access)
  • Post review feedback

The change is clean and the motivation is sound — deriving rank from dims()->size() eliminates the redundant num_dims field and the mismatch class of bugs entirely.

Issue: Unguarded dims()->Get() in PerChannelGroupQuant

At XNNCompiler.cpp:527-528, the PerChannelGroupQuant case accesses tensor_value->dims() directly instead of using dims_data:

size_t output_channels = tensor_value->dims()->Get(0);
size_t input_channels = tensor_value->dims()->Get(1);

This has two problems:

  1. Null dereference: If dims() is nullptr (e.g. a scalar tensor or malformed flatbuffer), this crashes. Before this PR, the removed check at the top guaranteed dims != nullptr when num_dims > 0, providing some protection. Now that check is gone.

  2. Inconsistency: Every other code path in this function uses dims_data (the converted std::vector<size_t>). This path bypasses that and reads raw flatbuffer data. It should use dims_data[0] and dims_data[1] for consistency, with bounds validation.

Suggested fix — add a bounds check and use dims_data:

ET_CHECK_OR_RETURN_ERROR(
    dims_data.size() >= 2,
    InvalidProgram,
    "PerChannelGroupQuant requires at least 2 dims, got %zu",
    dims_data.size());
size_t output_channels = dims_data[0];
size_t input_channels = dims_data[1];


Minor nit: PerChannelQuant scale null check

At line 487, qparams->scale()->data() is also called without a null check on scale(). This is pre-existing and not introduced by this PR, so not a blocker.
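For context on that nit: flatbuffers accessors for optional fields return a null pointer when the field was not serialized, so dereferencing ->data() without a check can crash on malformed input. A rough sketch of the defensive pattern, using stand-in types rather than the real generated flatbuffers API:

```cpp
#include <cstddef>
#include <vector>

// Stand-ins for generated flatbuffers types: vector-field accessors
// return nullptr when the field was absent from the serialized data.
struct FakeScaleVec {
  std::vector<float> v;
  const float* data() const { return v.data(); }
  size_t size() const { return v.size(); }
};
struct FakeQParams {
  const FakeScaleVec* scale_field = nullptr;
  const FakeScaleVec* scale() const { return scale_field; }
};

// Defensive read: validate the accessor chain before touching data(),
// so a malformed buffer yields an error path instead of a crash.
const float* safe_scale_data(const FakeQParams* q, size_t min_len) {
  if (q == nullptr || q->scale() == nullptr ||
      q->scale()->size() < min_len) {
    return nullptr;  // caller maps this to an InvalidProgram-style error
  }
  return q->scale()->data();
}
```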

Summary

The core mechanical change (replacing tensor_value->num_dims() with dims_data.size()) is correct across all call sites. The format specifier fix (%i → %zu) for size_t is also correct. The one actionable item is the PerChannelGroupQuant path that still reads tensor_value->dims() directly; this should be updated to use dims_data with a bounds check to maintain robustness against malformed inputs.


Branch: lfq.remove-num-dim-usages

@lucylq lucylq merged commit 32702ac into main Apr 22, 2026
181 of 189 checks passed
@lucylq lucylq deleted the lfq.remove-num-dim-usages branch April 22, 2026 18:15

Labels

CLA Signed (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed)
security-fix

3 participants