
[enhancement] Implement MaxAbsScaler Estimator **AI implemented** #3020

Open
icfaust wants to merge 24 commits into uxlfoundation:main from icfaust:maxabs_test

Conversation

@icfaust
Contributor

@icfaust icfaust commented Mar 12, 2026

Description

Note: here is my implementation plan from Antigravity (this PR took at most an hour):

Goal Description

The objective is to create a new sklearnex estimator, MaxAbsScaler, that duplicates the API functionality of sklearn.preprocessing.MaxAbsScaler. This estimator will be accelerated using Intel oneDAL's IncrementalBasicStatistics from the onedal backend directly. Following the design patterns of DummyRegressor, we will implement an API-compatible layer in sklearnex that uses the onedal backend natively. The new estimator will be integrated into the preview submodule.

Proposed Changes

sklearnex Layer (sklearnex/preview/preprocessing/)

We will implement the frontend estimator that conforms to scikit-learn's API and utilizes oneDAL's IncrementalBasicStatistics. We will place this under _data.py to match scikit-learn's internal file structures.

[NEW] sklearnex/preview/preprocessing/__init__.py

Exports MaxAbsScaler.

[NEW] sklearnex/preview/preprocessing/_data.py

  • Inherits from oneDALEstimator and sklearn.preprocessing.MaxAbsScaler.
  • fit / partial_fit: Overridden methods with the @control_n_jobs decorator. The transform method will NOT be overridden; we will rely purely on the standard sklearn implementation of transform.
  • _onedal_cpu_supported / _onedal_gpu_supported: Define the condition chains for falling back to sklearn (e.g., for sparse input: sklearn's MaxAbsScaler supports csr_matrix natively, but IncrementalBasicStatistics may only support dense arrays; the chains also verify that inputs use the supported numerical types, float32 and float64).
  • _onedal_fit / _onedal_partial_fit: Internal routines executing the dispatch logic. Here we will use onedal.basic_statistics.IncrementalBasicStatistics(result_options=["min", "max"]) to compute the min_ and max_ for the batch/data.
  • _onedal_finalize_fit: We will compute max_abs_ = np.maximum(np.abs(min_), np.abs(max_)) and scale_ accordingly using numpy/xp functionality.
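The finalize step described above amounts to a few lines of numpy. Here is a minimal sketch under the plan's assumptions: min_ and max_ are the per-feature results from IncrementalBasicStatistics(result_options=["min", "max"]), and the zero handling mirrors sklearn's behavior of mapping a zero scale to 1.0 so transform never divides by zero (the function name is illustrative, not the actual method):

```python
import numpy as np

def finalize_max_abs(min_, max_):
    """Sketch of the _onedal_finalize_fit math described above.

    min_ / max_ are per-feature minima and maxima, as produced by
    IncrementalBasicStatistics(result_options=["min", "max"]).
    """
    max_abs_ = np.maximum(np.abs(min_), np.abs(max_))
    # Features that are identically zero get scale 1.0 so that
    # transform does not divide by zero (matching sklearn's behavior).
    scale_ = np.where(max_abs_ == 0, 1.0, max_abs_)
    return max_abs_, scale_
```

For example, finalize_max_abs(np.array([-3., 0., 2.]), np.array([1., 0., 5.])) yields max_abs_ of [3, 0, 5] and scale_ of [3, 1, 5].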

SPMD Interface (sklearnex/spmd/preprocessing/)

We will also provide a distributed implementation via SPMD functionality relying on onedal.spmd.basic_statistics.

[NEW] sklearnex/spmd/preprocessing/__init__.py

Exports MaxAbsScaler for the SPMD interface.

[NEW] sklearnex/spmd/preprocessing/_data.py

  • Inherits from sklearnex.preview.preprocessing.MaxAbsScaler (our base preview class).
  • The _onedal_incremental_basic_statistics static method will be overridden to point to onedal.spmd.basic_statistics.IncrementalBasicStatistics.
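The override pattern above can be sketched with stand-in classes; everything here is illustrative (the real classes live in onedal.basic_statistics, onedal.spmd.basic_statistics, and the sklearnex preview module):

```python
# Stand-ins for the batch and SPMD backend classes.
class IncrementalBasicStatistics:
    backend = "batch"

class SPMDIncrementalBasicStatistics(IncrementalBasicStatistics):
    backend = "spmd"

class MaxAbsScaler:
    # The preview estimator resolves its backend through this hook...
    _onedal_incremental_basic_statistics = staticmethod(IncrementalBasicStatistics)

class SPMDMaxAbsScaler(MaxAbsScaler):
    # ...so the SPMD subclass only has to repoint it; all fit logic
    # is inherited unchanged from the preview base class.
    _onedal_incremental_basic_statistics = staticmethod(SPMDIncrementalBasicStatistics)
```

Keeping the backend behind a single static-method hook is what makes the SPMD subclass a two-line change.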

Dispatcher Updates (sklearnex/)

[MODIFY] sklearnex/dispatcher.py

  • Add sklearn.preprocessing.MaxAbsScaler to the preview_mapping so patch_sklearn() correctly diverts execution to sklearnex.preview.preprocessing.MaxAbsScaler when preview mode is on. Ensure the proper preprocessing_module (import sklearn.preprocessing as preprocessing_module) is passed in the patch map tuple.
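Conceptually, the patch map entry pairs the target module and attribute with the replacement class. The sketch below is a deliberately simplified, hypothetical shape (the real structure in sklearnex/dispatcher.py is more involved, and SimpleNamespace stands in for sklearn.preprocessing):

```python
from types import SimpleNamespace

# Stand-ins for sklearn.preprocessing and the preview estimator.
preprocessing_module = SimpleNamespace(MaxAbsScaler=type("SkMaxAbsScaler", (), {}))

class PreviewMaxAbsScaler:
    """Stand-in for sklearnex.preview.preprocessing.MaxAbsScaler."""

preview_mapping = {
    # name -> (module to patch, attribute name, replacement class)
    "maxabsscaler": (preprocessing_module, "MaxAbsScaler", PreviewMaxAbsScaler),
}

def patch_preview(mapping):
    # patch_sklearn() with preview mode on conceptually does this
    # for each entry: rebind the sklearn attribute to the preview class.
    for module, attr, replacement in mapping.values():
        setattr(module, attr, replacement)
```

After patch_preview(preview_mapping), importing MaxAbsScaler from the patched module resolves to the preview class.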

Verification Plan

Automated Tests

  • We will add sklearnex/preview/preprocessing/tests/test_data.py containing a comprehensive test suite.

  • Tests will include:

    1. Dense data validation (fit, partial_fit, transform, inverse_transform) comparing max_abs_, scale_, and transformed output against standard sklearn.preprocessing.MaxAbsScaler.
    2. Fallback checking to ensure Sparse arrays trigger a fallback to sklearn successfully.
    3. Batch processing tests to ensure partial_fit handles continuous batches accurately.
    4. Follow the precedent set out in the repository for other estimators.
    5. Include tests for Array API dispatch execution and device-specific behavior (e.g. GPU, via SYCL queues and DPCTL/DPNP dataframes) to align with standard sklearnex estimator validation.
  • We will also add SPMD testing in sklearnex/spmd/preprocessing/tests/test_data_spmd.py:

    1. Add tests marked with @pytest.mark.mpi checking that sklearnex.spmd implementation is entirely equivalent to local batch execution via the non-SPMD sklearnex preview module.
    2. Utilize helper functions like _get_local_tensor and _convert_to_dataframe to simulate SPMD environments where data is split among ranks.
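The batch-equivalence invariant that the partial_fit tests rely on can be sketched without any backend: running minima and maxima accumulated over batches must match the full-data result (plain numpy, no sklearnex or oneDAL involved):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))

# Full-data reference, as a single fit() would compute it.
full_min, full_max = X.min(axis=0), X.max(axis=0)

# Incremental pass, as successive partial_fit() calls would compute it.
run_min = np.full(X.shape[1], np.inf)
run_max = np.full(X.shape[1], -np.inf)
for batch in np.array_split(X, 7):
    run_min = np.minimum(run_min, batch.min(axis=0))
    run_max = np.maximum(run_max, batch.max(axis=0))

assert np.allclose(run_min, full_min) and np.allclose(run_max, full_max)
```

The same property should hold however the rows are partitioned, which is also what makes the min/max statistics safe to distribute across SPMD ranks.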

Linting

  • The repo has black, isort, clang-format, numpydoc-validation, and codespell format/lint hooks in its .pre-commit-config.yaml.
  • We must make sure to run those checks on the new files once the code is implemented.

Checklist:

Completeness and readability

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with updates and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended testing suite if new functionality was introduced in this PR.

Performance

  • I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with measured data, if performance change is expected.
  • I have provided justification why performance and/or quality metrics have changed or why changes are not expected.
  • I have extended the benchmarking suite and provided a corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

icfaust (Contributor Author) commented on this snippet:

    # limitations under the License.
    # ==============================================================================

    import numpy as np

unused import here.

icfaust (Contributor Author) commented on this snippet:

    (X,) = data
    patching_status.and_conditions(
        [
            (not is_sparse(X), "Sparse input is not supported"),

I assume benchmarking will be necessary here to find where the standard sklearn implementation is faster at computing the min and max, and then to add a condition so that our implementation is used only where it actually accelerates.

@codecov

codecov bot commented Mar 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Flag Coverage Δ
azure 77.06% <100.00%> (-2.46%) ⬇️
github ?

Flags with carried forward coverage won't be shown.

Files with missing lines Coverage Δ
...l/basic_statistics/incremental_basic_statistics.py 100.00% <100.00%> (ø)
sklearnex/dispatcher.py 86.04% <100.00%> (-5.08%) ⬇️

... and 32 files with indirect coverage changes


@icfaust icfaust marked this pull request as ready for review March 15, 2026 23:26
@icfaust icfaust requested a review from Vika-F as a code owner March 15, 2026 23:26
@yuejiaointel
Contributor

/intelci: run

@yuejiaointel
Contributor

/intelci: run

@icfaust icfaust changed the title [enhancement] Implement MaxAbsScalar Estimator **AI implemented** [enhancement] Implement MaxAbsScaler Estimator **AI implemented** Mar 18, 2026
@icfaust
Contributor Author

icfaust commented Mar 19, 2026

/intelci: run
