removals: engines from daal4py by Alexandr-Solovev · Pull Request #3026 · uxlfoundation/scikit-learn-intelex

Alexandr-Solovev · 2026-03-16T12:32:38Z

Description

Checklist:

Completeness and readability

I have commented my code, particularly in hard-to-understand areas.
I have updated the documentation to reflect the changes or created a separate PR with updates and provided its number in the description, if necessary.
Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
I have resolved any merge conflicts that might occur with the base branch.

Testing

I have run it locally and tested the changes extensively.
All CI jobs are green or I have provided justification why they aren't.
I have extended testing suite if new functionality was introduced in this PR.

Performance

I have measured performance for affected algorithms using scikit-learn_bench and provided at least a summary table with measured data, if performance change is expected.
I have provided justification why performance and/or quality metrics have changed or why changes are not expected.
I have extended the benchmarking suite and provided a corresponding scikit-learn_bench PR if new measurable functionality was introduced in this PR.

david-cortes-intel · 2026-03-16T14:53:03Z

Please add them to the list of exceptions to parse from the generator like in previous PRs.

david-cortes-intel · 2026-03-16T15:49:57Z

You might need to keep the enums:

Error compiling Cython file:
------------------------------------------------------------
...
        std_string& fptype,
                                                                                         std_string& method,
                                                                                         size_t nFactors,
                                                                                         size_t fullNUsers,
                                                                                         size_t seed,
                                                                                         engines_EnginePtr* engine,
                                                                                         ^
------------------------------------------------------------
build/daal4py_cy.pyx:14647:89: 'engines_EnginePtr' is not a type identifier

Error compiling Cython file:
------------------------------------------------------------
...
                  fptype = "double",
                  method = "defaultDense",
                  size_t nFactors = -1,
                  size_t fullNUsers = -1,
                  size_t seed = -1,
                  engines_engine engine = None,
                  ^
------------------------------------------------------------
build/daal4py_cy.pyx:14671:18: 'engines_engine' is not a type identifier

codecov · 2026-03-24T23:26:50Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

Flag	Coverage Δ
azure	`79.63% <ø> (ø)`
github	`?`

Flags with carried forward coverage won't be shown. Click here to find out more.
see 2 files with indirect coverage changes

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Alexandr-Solovev · 2026-04-07T13:26:28Z

@david-cortes-intel can you please take a look one more time. Looks like I fixed the building issue, but not sure about testing issues

Copilot

Pull request overview

This PR updates the daal4py wrapper generator, scikit-learn integration code, examples, and docs to accommodate oneDAL >= 2026, where RNG “engines” are no longer part of the public API.

Changes:

Update wrapper generation to avoid wrapping engine namespaces/types and to drop engine / EnginePtr parameters for oneDAL >= 2026.
Update sklearn-facing algorithms (GBTs, RandomForest, KMeans init) and daal4py examples to only pass engine=... for oneDAL < 2026.
Remove RNG engine API documentation from doc/sources/daal4py.rst.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
generator/wrappers.py	Skips wrapping specific engine namespaces for oneDAL >= 2026.
generator/gen_daal4py.py	Drops `engine` params and removes engine-related interfaces; skips `EnginePtr` types for oneDAL >= 2026.
examples/daal4py/decision_forest_regression_hist.py	Conditionally passes `engine` only for oneDAL < 2026.
examples/daal4py/decision_forest_regression_default_dense.py	Conditionally passes `engine` only for oneDAL < 2026.
examples/daal4py/decision_forest_classification_hist.py	Conditionally passes `engine` only for oneDAL < 2026.
examples/daal4py/decision_forest_classification_default_dense.py	Conditionally passes `engine` only for oneDAL < 2026.
doc/sources/daal4py.rst	Removes RNG engine API section from docs.
daal4py/sklearn/ensemble/GBTDAAL.py	Conditionally passes engine only for oneDAL < 2026 in GBT training.
daal4py/sklearn/ensemble/_forest.py	Conditionally passes engine only for oneDAL < 2026 in RF training.
daal4py/sklearn/cluster/k_means.py	Conditionally passes engine only for oneDAL < 2026 in kmeans init paths.

Copilot · 2026-04-07T13:33:27Z

daal4py/sklearn/cluster/k_means.py

+        if not _engines_removed:
+            _seed = random_state.randint(np.iinfo("i").max)
+            daal_engine = daal4py.engines_mt19937(
+                fptype=X_fptype, method="defaultDense", seed=_seed
+            )


When oneDAL >= 2026, _engines_removed is true and this branch skips creating/setting a seeded RNG engine. That means random_state no longer controls the k-means++ (and similarly the random) initialization, so results may become non-reproducible even with a fixed random_state, which can violate scikit-learn estimator expectations. Consider implementing a random_state-driven initialization in Python (e.g., sklearn-style k-means++/random init) or falling back to scikit-learn for initialization when engines are unavailable; at minimum emit a warning that random_state cannot be honored for oneDAL >= 2026.

Copilot · 2026-04-07T13:33:27Z

daal4py/sklearn/ensemble/_forest.py

        }
+        if not daal_check_version((2026, "P", 0)):
+            daal_engine = daal4py.engines_mt19937(seed=seed_, fptype=getFPType(X))
+            parameters["engine"] = daal_engine


For oneDAL >= 2026 this code stops passing a seeded engine into decision forest training, so random_state/seed_ no longer influences the underlying RNG. That can make RandomForestClassifier(random_state=...) non-deterministic and may cause scikit-learn compatibility checks to fail. If oneDAL provides an alternative seed parameter in 2026+, use it here; otherwise consider falling back to the scikit-learn implementation or warning that determinism is not guaranteed.

Suggested change

parameters["engine"] = daal_engine

parameters["engine"] = daal_engine

elif seed_ is not None:

warnings.warn(

"oneDAL 2026+ does not accept the seeded engine parameter used by "

"this code path, so `random_state` may not fully determine the "

"underlying RNG and results may be non-deterministic.",

RuntimeWarning,

)

Copilot · 2026-04-07T13:33:28Z

daal4py/sklearn/ensemble/_forest.py

+        if not daal_check_version((2026, "P", 0)):
+            daal_engine = daal4py.engines_mt19937(seed=seed_, fptype=getFPType(X))
+            parameters["engine"] = daal_engine


For oneDAL >= 2026 this code stops passing a seeded engine into decision forest training, so random_state/seed_ no longer influences the underlying RNG. That can make RandomForestRegressor(random_state=...) non-deterministic and may cause scikit-learn compatibility checks to fail. If oneDAL provides an alternative seed parameter in 2026+, use it here; otherwise consider falling back to the scikit-learn implementation or warning that determinism is not guaranteed.

Copilot · 2026-04-07T13:33:28Z

daal4py/sklearn/ensemble/GBTDAAL.py

+        if not daal_check_version((2026, "P", 0)):
+            train_params["engine"] = d4p.engines_mcg59(seed=seed_)
+        train_algo = d4p.gbt_classification_training(**train_params)


For oneDAL >= 2026, this skips setting a seeded engine, so random_state/seed_ no longer controls training randomness. That can make GBTDAALClassifier(random_state=...) non-deterministic and may break scikit-learn estimator checks. If oneDAL 2026+ has a replacement seed parameter, pass seed_ via that; otherwise consider falling back to scikit-learn (or warning) when determinism is required.

Copilot · 2026-04-07T13:33:28Z

daal4py/sklearn/ensemble/GBTDAAL.py

+        if not daal_check_version((2026, "P", 0)):
+            train_params["engine"] = d4p.engines_mcg59(seed=seed_)
+        train_algo = d4p.gbt_regression_training(**train_params)


For oneDAL >= 2026, this skips setting a seeded engine, so random_state/seed_ no longer controls training randomness. That can make GBTDAALRegressor(random_state=...) non-deterministic and may break scikit-learn estimator checks. If oneDAL 2026+ has a replacement seed parameter, pass seed_ via that; otherwise consider falling back to scikit-learn (or warning) when determinism is required.

david-cortes-intel · 2026-04-07T14:30:51Z

@Alexandr-Solovev The CI error:

n_estimators = 8000

    @pytest.mark.parametrize("n_estimators", N_ESTIMATORS_IRIS)
    def test_classifier_big_estimators_iris(n_estimators):
>       _compare_with_sklearn_classifier_iris(
            n_estimators=n_estimators, description=f"Classifier n_estimators={n_estimators}: "
        )

../venv/lib/python3.13/site-packages/daal4py/sklearn/ensemble/tests/test_decision_forest.py:156: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../venv/lib/python3.13/site-packages/daal4py/sklearn/ensemble/tests/test_decision_forest.py:58: in _compare_with_sklearn_classifier_iris
    daal4py_model.fit(x_train, y_train, sample_weight=sample_weight)
../venv/lib/python3.13/site-packages/daal4py/sklearn/_n_jobs_support.py:113: in n_jobs_wrapper
    return method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../venv/lib/python3.13/site-packages/daal4py/sklearn/ensemble/_forest.py:468: in fit
    self._daal_fit_classifier(X, y, sample_weight=sample_weight)
../venv/lib/python3.13/site-packages/daal4py/sklearn/ensemble/_forest.py:777: in _daal_fit_classifier
    dfc_trainingResult = dfc_algorithm.compute(X, y, sample_weight)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
build/daal4py_cy.pyx:8898: in daal4py._daal4py.decision_forest_classification_training.compute
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   RuntimeError: Engine not supported

build/daal4py_cy.pyx:8877: RuntimeError

It is calling this function:
https://uxlfoundation.github.io/scikit-learn-intelex/2025.11/daal4py.html#daal4py.decision_forest_classification_training.compute

The object is being created with default argument for engine (i.e. engine=None), and that is erroring out.

I believe the arguments it takes and their defaults are generated from the oneDAL headers.

Were the relevant changes for this already merged to oneDAL?

Alexandr-Solovev · 2026-04-07T14:32:55Z

@Alexandr-Solovev The CI error:

n_estimators = 8000

    @pytest.mark.parametrize("n_estimators", N_ESTIMATORS_IRIS)
    def test_classifier_big_estimators_iris(n_estimators):
>       _compare_with_sklearn_classifier_iris(
            n_estimators=n_estimators, description=f"Classifier n_estimators={n_estimators}: "
        )

../venv/lib/python3.13/site-packages/daal4py/sklearn/ensemble/tests/test_decision_forest.py:156: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
../venv/lib/python3.13/site-packages/daal4py/sklearn/ensemble/tests/test_decision_forest.py:58: in _compare_with_sklearn_classifier_iris
    daal4py_model.fit(x_train, y_train, sample_weight=sample_weight)
../venv/lib/python3.13/site-packages/daal4py/sklearn/_n_jobs_support.py:113: in n_jobs_wrapper
    return method(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
../venv/lib/python3.13/site-packages/daal4py/sklearn/ensemble/_forest.py:468: in fit
    self._daal_fit_classifier(X, y, sample_weight=sample_weight)
../venv/lib/python3.13/site-packages/daal4py/sklearn/ensemble/_forest.py:777: in _daal_fit_classifier
    dfc_trainingResult = dfc_algorithm.compute(X, y, sample_weight)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
build/daal4py_cy.pyx:8898: in daal4py._daal4py.decision_forest_classification_training.compute
    ???
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

>   ???
E   RuntimeError: Engine not supported

build/daal4py_cy.pyx:8877: RuntimeError

It is calling this function: https://uxlfoundation.github.io/scikit-learn-intelex/2025.11/daal4py.html#daal4py.decision_forest_classification_training.compute

The object is being created with default argument for engine (i.e. engine=None), and that is erroring out.

I believe the arguments it takes and their defaults are generated from the oneDAL headers.

Were the relevant changes for this already merged to oneDAL?

Not yet, the changes are here uxlfoundation/oneDAL#3586

david-cortes-intel · 2026-04-07T14:50:35Z

@Alexandr-Solovev I get the same error when running it locally if I try the branch from your fork of oneDAL.

david-cortes-intel · 2026-04-07T14:55:53Z

Note that the init signature doesn't explicitly include engine:

Init signature: daal4py.decision_forest_classification_training(self, /, *args, **kwargs)

And the parameters do not pass it either:

{'bootstrap': True, 'featuresPerNode': 2, 'fptype': 'double', 'impurityThreshold': 0.0, 'maxBins': 256, 'maxLeafNodes': 0, 'maxTreeDepth': 0, 'memorySavingMode': False, 'method': 'hist', 'minBinSize': 1, 'minImpurityDecreaseInSplitNode': 0.0, 'minWeightFractionInLeafNode': 0.0, 'nClasses': 3, 'nTrees': 8000, 'observationsPerTreeFraction': 1.0, 'resultsToCompute': '', 'varImportance': 'MDI', 'minObservationsInSplitNode': 2, 'minObservationsInLeafNode': 1, 'binningStrategy': 'quantiles'}

init commit for removals

33c4054

fixes

b444171

ethanglaser added 3 commits March 24, 2026 14:43

Add version checks

2ffe0f1

formatting and gen_daal4py ignores

3b74616

pull build from onedal branch for github actions

5da11d6

This comment was marked as spam.

Sign in to view

Alexandr-Solovev added 5 commits April 7, 2026 09:28

Merge branch 'uxlfoundation:main' into dev/asolovev_removals_engines

4ff2635

fixes

170265f

fixes

48987ea

fixes

b86a948

fixes

161852d

Alexandr-Solovev marked this pull request as ready for review April 7, 2026 13:26

Alexandr-Solovev requested review from david-cortes-intel, icfaust, maria-Petrova, napetrov and syakov-intel as code owners April 7, 2026 13:26

Copilot AI review requested due to automatic review settings April 7, 2026 13:26

Alexandr-Solovev requested review from Vika-F and yuejiaointel as code owners April 7, 2026 13:26

Copilot started reviewing on behalf of Alexandr-Solovev April 7, 2026 13:26 View session

Copilot AI reviewed Apr 7, 2026

View reviewed changes

fixes

585b4f9

Alexandr-Solovev requested review from ahuber21 and homksei as code owners April 7, 2026 14:30

Alexandr-Solovev requested a review from ethanglaser as a code owner April 7, 2026 14:30

-            parameters["engine"] = daal_engine
+            parameters["engine"] = daal_engine
+        elif seed_ is not None:
+            warnings.warn(
+                "oneDAL 2026+ does not accept the seeded engine parameter used by "
+                "this code path, so `random_state` may not fully determine the "
+                "underlying RNG and results may be non-deterministic.",
+                RuntimeWarning,
+            )

Conversation

Alexandr-Solovev commented Mar 16, 2026

Description

Uh oh!

david-cortes-intel commented Mar 16, 2026

Uh oh!

david-cortes-intel commented Mar 16, 2026

Uh oh!

codecov bot commented Mar 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

This comment was marked as spam.

Uh oh!

This comment was marked as spam.

Uh oh!

Alexandr-Solovev commented Apr 7, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

david-cortes-intel commented Apr 7, 2026

Uh oh!

Alexandr-Solovev commented Apr 7, 2026

Uh oh!

david-cortes-intel commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

david-cortes-intel commented Apr 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

codecov bot commented Mar 24, 2026 •

edited

Loading

david-cortes-intel commented Apr 7, 2026 •

edited

Loading