
Conversation

@olegkkruglov (Contributor)

Description

Adds a documentation update regarding the `n_jobs` parameter. Contains content from #2453 and addresses the review comments from there.


Checklist:

Completeness and readability

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes or created a separate PR with updates and provided its number in the description, if necessary.
  • Git commit message contains an appropriate signed-off-by string (see CONTRIBUTING.md for details).
  • I have resolved any merge conflicts that might occur with the base branch.

Testing

  • I have run it locally and tested the changes extensively.
  • All CI jobs are green or I have provided justification why they aren't.
  • I have extended testing suite if new functionality was introduced in this PR.

codecov bot commented Nov 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.

Flag Coverage Δ
azure ?
github 82.10% <ø> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown.
See 31 files with indirect coverage changes.


`the calculation of the 'n_jobs' parameter value <https://scikit-learn.org/stable/glossary.html#term-n_jobs>`__.

When Scikit-learn's utilities with built-in parallelism are used (for example, `GridSearchCV` or `VotingClassifier`),
|sklearnex| tries to determine the optimal number of threads per job using hints provided by `joblib`.
Contributor:

Could you point to the code where this happens? How does it detect that it is running under joblib? (they have multiple threading backends).

Contributor (Author):

def get_suggested_n_threads(n_cpus):

If several instances of sklearnex are run via joblib, `n_threads` is equal to the number of CPUs divided by the number of instances.
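The behaviour described here (threads per instance = CPUs divided by the number of concurrent instances) can be sketched as follows. `get_suggested_n_threads` is the name from the linked code, but the signature and body below are an illustrative assumption, not the actual implementation:

```python
import os

def get_suggested_n_threads(n_cpus, n_joblib_workers=1):
    # Illustrative sketch only: split the available CPUs evenly among the
    # sklearnex instances running concurrently under joblib, never going
    # below one thread per instance.
    return max(1, n_cpus // max(1, n_joblib_workers))

# e.g. 8 CPUs shared by 4 concurrent joblib workers -> 2 threads each
suggested = get_suggested_n_threads(os.cpu_count() or 1, n_joblib_workers=4)
```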

Contributor:

Thanks. I played a bit with it, and from what I can tell, it works under both joblib jobs and threadpool contexts. Including contexts that only limit BLAS like this:

threadpoolctl.threadpool_limits(limits=2, user_api='blas')

.. which actually contradicts some of the other points here:

|sklearnex| threading doesn't automatically avoid nested parallelism when used in conjunction with OpenMP and/or with joblib or python threads.

@david-cortes-intel (Contributor)

Thanks for looking into it. A couple points from the earlier PR:

  • It's not clear to me what happens if one would try to control BLAS/MKL threads through threadpoolctl independently of sklearn settings. I would guess it'd have no effect but this would be important to document, since it also differs from sklearn.
  • In the same vein, I guess using mkl_service also wouldn't have any effect on the static-linked MKL used by oneDAL.
  • It's missing some settings that are global, like the daal4py threads. I think currently T-SNE is the only algorithm whose code takes number of threads from different sources in oneDAL, but not sure if there's any effect there.
    • In this regard, it could mention also what would happen if using sklearnex estimators in python threads. I think passing n_jobs ends up modifying global settings regardless, which means there'd be issues if passing different n_jobs from different python threads (perhaps @Vika-F might have some insights on what would happen).
  • Since TBB works differently from joblib, it could mention here what happens if executing sequential calls to estimators with different numbers of threads. I think currently there is some logic when first passing a large number of threads and then a smaller one that the initial process-wide thread pool is not re-created and can have an impact on performance, but perhaps @avolkov-intel could comment.
    • And it could also mention that the first call to something multi-threaded will need to set up the process-wide thread pool, which adds some overhead to the first call of whatever runs multi-threaded (since this is also different from how joblib parallelization works).
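The concern about python threads above can be illustrated with a small stdlib-only sketch. The global variable below is a stand-in for whatever process-wide setting `n_jobs` ends up writing; it is not sklearnex code:

```python
import threading
import time

GLOBAL_NUM_THREADS = 0  # stand-in for a process-wide thread-count setting

def fit_with_n_jobs(n_jobs, observed):
    # Hypothetical "fit": writes n_jobs to the global setting, computes,
    # then records which value was in effect when the computation ran.
    global GLOBAL_NUM_THREADS
    GLOBAL_NUM_THREADS = n_jobs
    time.sleep(0.01)  # the actual computation would happen here
    observed.append(GLOBAL_NUM_THREADS)  # may be another thread's value

observed = []
threads = [threading.Thread(target=fit_with_n_jobs, args=(n, observed))
           for n in (1, 2, 4, 8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# `observed` frequently ends up as four copies of the last value written,
# not [1, 2, 4, 8]: each thread clobbers the shared setting for the others.
```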


* `n_jobs` parameter is supported for all estimators patched by |sklearnex|,
while |sklearn| enables it for selected estimators only.
* `n_jobs` estimator parameter sets the number of threads used by the underlying |oneDAL|.
Contributor:

Suggested change
* `n_jobs` estimator parameter sets the number of threads used by the underlying |oneDAL|.
* `n_jobs` estimator parameter sets the number of threads used by the underlying |onedal|.

Macros are case-sensitive.

* If `n_jobs` is not specified |sklearnex| uses all available threads whereas |sklearn| is single-threaded by default.

|sklearnex| follows the same rules as |sklearn| for
`the calculation of the :term:`n_jobs` parameter value.
Contributor:

Suggested change
`the calculation of the :term:`n_jobs` parameter value.
the calculation of the :term:`n_jobs` parameter value.

|sklearnex| follows the same rules as |sklearn| for
`the calculation of the :term:`n_jobs` parameter value.

When Scikit-learn's utilities with built-in parallelism are used
Contributor:

Suggested change
When Scikit-learn's utilities with built-in parallelism are used
When |sklearn|'s utilities with built-in parallelism are used

|sklearnex| threading doesn't automatically avoid nested parallelism when used in conjunction with OpenMP and/or with joblib or python threads.

To track the actual number of threads used by estimators from the |sklearnex|,
set the `DEBUG` :ref:`verbosity setting <verbose>`.
Contributor:

I do not see any log with the number of threads when doing this.

Example:

import os
os.environ["SKLEARNEX_VERBOSE"] = "DEBUG"
import numpy as np
from sklearnex.linear_model import Ridge
from sklearn.model_selection import GridSearchCV
rng = np.random.default_rng(seed=123)
X = rng.standard_normal(size=(100,10))
y = rng.standard_normal(X.shape[0])
Ridge().fit(X, y)

Output:

DEBUG:sklearnex: Assigned method '<host_backend>.linear_model.regression.train' to 'BaseLinearRegression.train'
DEBUG:sklearnex: Assigned method '<host_backend>.linear_model.regression.infer' to 'BaseLinearRegression.infer'
DEBUG:sklearnex: Assigned method '<host_backend>.linear_model.regression.model' to 'BaseLinearRegression.model'
DEBUG:sklearnex: Assigned method '<host_backend>.linear_model.regression.partial_train_result' to 'BaseIncrementalLinear.partial_train_result'
DEBUG:sklearnex: Assigned method '<host_backend>.linear_model.regression.partial_train' to 'BaseIncrementalLinear.partial_train'
DEBUG:sklearnex: Assigned method '<host_backend>.linear_model.regression.finalize_train' to 'BaseIncrementalLinear.finalize_train'
DEBUG:sklearnex: Assigned method '<host_backend>.logistic_regression.classification.train' to 'LogisticRegression.train'
DEBUG:sklearnex: Assigned method '<host_backend>.logistic_regression.classification.infer' to 'LogisticRegression.infer'
DEBUG:sklearnex: Assigned method '<host_backend>.logistic_regression.classification.model' to 'LogisticRegression.model'
INFO:sklearnex: sklearn.linear_model.Ridge.fit: running accelerated version on CPU
DEBUG:sklearnex: Dispatching function 'linear_model.regression.train' with policy <onedal._onedal_py_host.host_policy object at 0x7fb689e02770> to Backend(<module 'onedal._onedal_py_host' from '/home/dcortes/repos/scikit-learn-intelex/onedal/_onedal_py_host.cpython-311-x86_64-linux-gnu.so'>, is_dpc=False, is_spmd=False)


When Scikit-learn's utilities with built-in parallelism are used
(for example, :obj:`sklearn.model_selection.GridSearchCV` or :obj:`sklearn.model_selection.VotingClassifier`),
|sklearnex| tries to determine the optimal number of threads per job using hints provided by `joblib`.
Contributor:

Suggested change
|sklearnex| tries to determine the optimal number of threads per job using hints provided by `joblib`.
|sklearnex| tries to determine the optimal number of threads per job using hints provided by :mod:`joblib` / ``threadpoolctl``.

Seems to work with both.
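For intuition, here is a stdlib-only sketch of honouring an externally imposed thread limit. It models only the environment-variable channel (joblib sets variables like `OMP_NUM_THREADS` for its workers); `threadpoolctl` adjusts limits through runtime APIs instead, which a real implementation would have to query directly. The function name and logic are illustrative assumptions, not sklearnex code:

```python
import os

def effective_thread_limit(default):
    # Illustrative: take the strictest limit among the default and any
    # thread-count environment variables set by an outer parallel layer.
    limits = [default]
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        value = os.environ.get(var, "")
        if value.isdigit():
            limits.append(int(value))
    return max(1, min(limits))
```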
