removals: engines from daal4py#3026
removals: engines from daal4py#3026Alexandr-Solovev wants to merge 11 commits intouxlfoundation:mainfrom
Conversation
|
Please add them to the list of exceptions to parse from the generator like in previous PRs. |
|
You might need to keep the enums: |
Codecov Report✅ All modified and coverable lines are covered by tests.
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
|
@david-cortes-intel can you please take a look one more time. Looks like I fixed the building issue, but not sure about testing issues |
There was a problem hiding this comment.
Pull request overview
This PR updates the daal4py wrapper generator, scikit-learn integration code, examples, and docs to accommodate oneDAL >= 2026, where RNG “engines” are no longer part of the public API.
Changes:
- Update wrapper generation to avoid wrapping engine namespaces/types and to drop
engine/EnginePtrparameters for oneDAL >= 2026. - Update sklearn-facing algorithms (GBTs, RandomForest, KMeans init) and daal4py examples to only pass
engine=...for oneDAL < 2026. - Remove RNG engine API documentation from
doc/sources/daal4py.rst.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| generator/wrappers.py | Skips wrapping specific engine namespaces for oneDAL >= 2026. |
| generator/gen_daal4py.py | Drops engine params and removes engine-related interfaces; skips EnginePtr types for oneDAL >= 2026. |
| examples/daal4py/decision_forest_regression_hist.py | Conditionally passes engine only for oneDAL < 2026. |
| examples/daal4py/decision_forest_regression_default_dense.py | Conditionally passes engine only for oneDAL < 2026. |
| examples/daal4py/decision_forest_classification_hist.py | Conditionally passes engine only for oneDAL < 2026. |
| examples/daal4py/decision_forest_classification_default_dense.py | Conditionally passes engine only for oneDAL < 2026. |
| doc/sources/daal4py.rst | Removes RNG engine API section from docs. |
| daal4py/sklearn/ensemble/GBTDAAL.py | Conditionally passes engine only for oneDAL < 2026 in GBT training. |
| daal4py/sklearn/ensemble/_forest.py | Conditionally passes engine only for oneDAL < 2026 in RF training. |
| daal4py/sklearn/cluster/k_means.py | Conditionally passes engine only for oneDAL < 2026 in kmeans init paths. |
| if not _engines_removed: | ||
| _seed = random_state.randint(np.iinfo("i").max) | ||
| daal_engine = daal4py.engines_mt19937( | ||
| fptype=X_fptype, method="defaultDense", seed=_seed | ||
| ) |
There was a problem hiding this comment.
When oneDAL >= 2026, _engines_removed is true and this branch skips creating/setting a seeded RNG engine. That means random_state no longer controls the k-means++ (and similarly the random) initialization, so results may become non-reproducible even with a fixed random_state, which can violate scikit-learn estimator expectations. Consider implementing a random_state-driven initialization in Python (e.g., sklearn-style k-means++/random init) or falling back to scikit-learn for initialization when engines are unavailable; at minimum emit a warning that random_state cannot be honored for oneDAL >= 2026.
| } | ||
| if not daal_check_version((2026, "P", 0)): | ||
| daal_engine = daal4py.engines_mt19937(seed=seed_, fptype=getFPType(X)) | ||
| parameters["engine"] = daal_engine |
There was a problem hiding this comment.
For oneDAL >= 2026 this code stops passing a seeded engine into decision forest training, so random_state/seed_ no longer influences the underlying RNG. That can make RandomForestClassifier(random_state=...) non-deterministic and may cause scikit-learn compatibility checks to fail. If oneDAL provides an alternative seed parameter in 2026+, use it here; otherwise consider falling back to the scikit-learn implementation or warning that determinism is not guaranteed.
| parameters["engine"] = daal_engine | |
| parameters["engine"] = daal_engine | |
| elif seed_ is not None: | |
| warnings.warn( | |
| "oneDAL 2026+ does not accept the seeded engine parameter used by " | |
| "this code path, so `random_state` may not fully determine the " | |
| "underlying RNG and results may be non-deterministic.", | |
| RuntimeWarning, | |
| ) |
| if not daal_check_version((2026, "P", 0)): | ||
| daal_engine = daal4py.engines_mt19937(seed=seed_, fptype=getFPType(X)) | ||
| parameters["engine"] = daal_engine |
There was a problem hiding this comment.
For oneDAL >= 2026 this code stops passing a seeded engine into decision forest training, so random_state/seed_ no longer influences the underlying RNG. That can make RandomForestRegressor(random_state=...) non-deterministic and may cause scikit-learn compatibility checks to fail. If oneDAL provides an alternative seed parameter in 2026+, use it here; otherwise consider falling back to the scikit-learn implementation or warning that determinism is not guaranteed.
| if not daal_check_version((2026, "P", 0)): | ||
| train_params["engine"] = d4p.engines_mcg59(seed=seed_) | ||
| train_algo = d4p.gbt_classification_training(**train_params) |
There was a problem hiding this comment.
For oneDAL >= 2026, this skips setting a seeded engine, so random_state/seed_ no longer controls training randomness. That can make GBTDAALClassifier(random_state=...) non-deterministic and may break scikit-learn estimator checks. If oneDAL 2026+ has a replacement seed parameter, pass seed_ via that; otherwise consider falling back to scikit-learn (or warning) when determinism is required.
| if not daal_check_version((2026, "P", 0)): | ||
| train_params["engine"] = d4p.engines_mcg59(seed=seed_) | ||
| train_algo = d4p.gbt_regression_training(**train_params) |
There was a problem hiding this comment.
For oneDAL >= 2026, this skips setting a seeded engine, so random_state/seed_ no longer controls training randomness. That can make GBTDAALRegressor(random_state=...) non-deterministic and may break scikit-learn estimator checks. If oneDAL 2026+ has a replacement seed parameter, pass seed_ via that; otherwise consider falling back to scikit-learn (or warning) when determinism is required.
|
@Alexandr-Solovev The CI error: It is calling this function: The object is being created with default argument for I believe the arguments it takes and their defaults are generated from the oneDAL headers. Were the relevant changes for this already merged to oneDAL? |
Not yet, the changes are here uxlfoundation/oneDAL#3586 |
|
@Alexandr-Solovev I get the same error when running it locally if I try the branch from your fork of oneDAL. |
|
Note that the init signature doesn't explicitly include engine: And the parameters do not pass it either: |
Description
Checklist:
Completeness and readability
Testing
Performance