Allocate 3 nodes, each with 1 task of 24 CPUs:

```shell
salloc -N 3 -n 3 -c 24
```

Within the allocation, make sure Dask and its dependencies are available (e.g. by activating a virtual or conda environment). Start a scheduler on the first node and three workers (one per node):

```shell
$ srun -n1 -N1 -r0 dask-scheduler --scheduler-file scheduler.json &>> scheduler.log &
$ srun -n3 -N3 --cpus-per-task=24 dask-worker --scheduler-file scheduler.json --nthreads=24 &>> worker.log &
```

Start a Python prompt and connect a client to the scheduler:

```python
In [1]: from dask.distributed import Client, progress

In [2]: client = Client(scheduler_file='scheduler.json')

In [3]: from dask import array as darr

In [4]: xy = darr.random.uniform(0, 1, size=(int(1000e9 / 2 / 8), 2), chunks=(int(1e9 / 8 / 2), 2))

In [5]: pi = 4 * ((xy ** 2).sum(axis=-1) < 1.0).mean()

In [6]: pi = pi.persist()

In [7]: progress(pi)
[########################################] | 100% Completed | 1min 2.6s

In [8]: print(pi.compute())
3.141586099776
```
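
The `dask.array` pipeline above is a chunked Monte Carlo estimate of π: sample points uniformly in the unit square and multiply the fraction that lands inside the quarter circle by 4. A minimal single-machine sketch of the same estimate, using only the standard library (no Dask) and a hypothetical sample count `n`:

```python
import random

def estimate_pi(n, seed=0):
    """Monte Carlo estimate of pi: 4 times the fraction of random
    points in the unit square that fall inside the quarter circle."""
    rng = random.Random(seed)  # seeded for reproducibility
    inside = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if x * x + y * y < 1.0:  # same test as (xy ** 2).sum(axis=-1) < 1.0
            inside += 1
    return 4 * inside / n

print(estimate_pi(1_000_000))
```

The distributed version does exactly this, but draws its samples as a chunked Dask array so the per-chunk counts are computed in parallel across the workers.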