Commit d23a4b11 authored by Manuel Guenther

Simplified interface to have only two parameters

parent ddcb1ddb
Pipeline #56706 failed
@@ -67,31 +67,27 @@ def dask_get_partition_size(cluster, n_objects, lower_bound=200):
         else n_objects
     )
 
-def get_local_parallel_client(n_workers=None, processes=True, threads_per_worker=1):
+def get_local_parallel_client(parallel=None, processes=True):
     """Returns a local Dask client with the given parameters, see the dask documentation for details: https://docs.dask.org/en/latest/how-to/deploy-dask/single-distributed.html?highlight=localcluster#localcluster
 
     Parameters
     ----------
-    n_workers: int or None
-        The number of workers (processes) to use; if `None`, take as many processors as we have on the system
+    parallel: int or None
+        The number of workers (processes or threads) to use; if `None`, take as many processors as we have on the system
     processes: boolean
-        Shall the dask client start processes (True, recommended) or threads (False). Note that threads in Python do not run in parallel
-    threads_per_worker: int
-        How many threads (not processes) per worker do you allow?
+        Shall the dask client start processes (True, recommended) or threads (False). Note that threads in pure Python do not run in parallel, see: https://www.quantstart.com/articles/Parallelising-Python-with-Threading-and-Multiprocessing/
     """
     from multiprocessing import cpu_count
 
     from dask.distributed import Client, LocalCluster
 
-    n_workers = n_workers or cpu_count()
+    parallel = parallel or cpu_count()
     cluster = LocalCluster(
         processes=processes,
-        n_workers=n_workers,
-        threads_per_worker=threads_per_worker,
+        n_workers=parallel if processes else 1,
+        threads_per_worker=1 if processes else parallel,
     )
     return Client(cluster)
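
For reference, a self-contained sketch of how the simplified two-parameter interface can be used. The function body reproduces the new side of the diff above; the import path and the usage at the bottom are assumed examples and not part of the commit.

from multiprocessing import cpu_count

from dask.distributed import Client, LocalCluster


def get_local_parallel_client(parallel=None, processes=True):
    """Returns a local Dask client with ``parallel`` processes (or threads if ``processes=False``)."""
    # Default to one worker per available CPU core.
    parallel = parallel or cpu_count()

    cluster = LocalCluster(
        processes=processes,
        # With processes: ``parallel`` single-threaded worker processes.
        # Without processes: a single worker running ``parallel`` threads.
        n_workers=parallel if processes else 1,
        threads_per_worker=1 if processes else parallel,
    )
    return Client(cluster)


if __name__ == "__main__":
    # Four worker processes, one thread each (recommended for CPU-bound work).
    client = get_local_parallel_client(parallel=4, processes=True)
    print(client)
    client.close()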