
ruilinchu/matlab-slurm

 
 


Copyright 2010-2013 The MathWorks, Inc.

This folder contains a number of files to allow Parallel Computing Toolbox
to be used with SLURM via the generic cluster interface.

Download these scripts and put them in ~/matlab/.

It is highly recommended that you put

export TZ=America/Los_Angeles

in your ~/.bashrc file; otherwise Matlab will complain.


* To start a Matlab pool on a single (compute) node, first run

cluster=get_LOCAL_cluster

Then use parpool or batch to start the pool, and parfor or spmd to run code on it. For a more detailed example, see the example_local.m file.
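
As a rough illustration of the single-node workflow, the sketch below opens a pool from the cluster object and runs a small parfor loop. It is not the contents of example_local.m; the worker count of 4 and the loop body are arbitrary assumptions chosen for the example.

```matlab
% Minimal sketch, assuming get_LOCAL_cluster.m is on the Matlab path
% (e.g. in ~/matlab/) and the node has at least 4 cores available.
cluster = get_LOCAL_cluster;       % helper provided in this folder
pool = parpool(cluster, 4);        % 4 workers is an arbitrary example value

n = 100;
squares = zeros(1, n);
parfor i = 1:n
    squares(i) = i^2;              % iterations are distributed over the workers
end
fprintf('sum of squares: %d\n', sum(squares));

delete(pool);                      % shut the pool down when finished
```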


* To start a Matlab pool spanning multiple (compute) nodes through SLURM, first run

cluster=get_SLURM_cluster('-t 10:00:00 --mem-per-cpu=2G')

Then use parpool or batch to start the pool, and parfor or spmd to run code on it. Here "-t 10:00:00" is an example of a SLURM sbatch option; you can add more options to the string, e.g. an account request or a memory request.

Do not put "-n" (number of tasks) in this string: the number of tasks is requested through the parpool or batch command. Read Matlab's documentation on parpool and batch for details.
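
The multi-node case can be sketched in the same way. The sbatch options and the pool size of 32 below are example values only, and the right ones depend on your site. Note that the worker count is given to parpool, not passed as "-n" in the option string.

```matlab
% Minimal sketch, assuming get_SLURM_cluster.m is on the Matlab path.
% The option string is passed on to sbatch; the values are examples only.
cluster = get_SLURM_cluster('-t 01:00:00 --mem-per-cpu=2G');

% The number of workers is requested here, not with "-n" above.
pool = parpool(cluster, 32);       % 32 is an arbitrary example value

spmd
    % labindex and numlabs identify the worker inside an spmd block
    % (newer Matlab releases also offer spmdIndex and spmdSize).
    fprintf('this is worker %d of %d\n', labindex, numlabs);
end

delete(pool);                      % release the SLURM allocation
```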

The job's output and logs from Matlab are stored in ~/MATLAB_JOB_STORAGE_local and ~/MATLAB_JOB_STORAGE. To change these locations, edit get_LOCAL_cluster.m and get_SLURM_cluster.m.
Matlab's terminal output can be captured with the diary command; see Matlab's documentation for details.
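
For an interactive session, capturing terminal output with diary might look like the sketch below. The log file name is a hypothetical choice for this example, not a location used by these scripts.

```matlab
% Minimal sketch: record Command Window output to a file with diary.
% The file name is hypothetical; pick any writable path.
logFile = fullfile(tempdir, 'matlab_session.log');

diary(logFile);                    % start recording output to logFile
disp('starting run');
x = magic(4)                       % no semicolon, so the value is echoed and recorded
diary off                          % stop recording

type(logFile)                      % print what was captured
```

For batch jobs, the 'CaptureDiary' option of batch serves the same purpose, and the captured text is read back with diary(j) or j.diary.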

>> c=get_SLURM_cluster('-t 4:00:00 --mem-per-cpu=2G -p isi')

c = 

 Generic Cluster

    Properties: 

                          Profile: 
                         Modified: true
                             Host: hpc1458.hpcc.usc.edu
                       NumWorkers: Inf
                       NumThreads: 1

               JobStorageLocation: /staging/hpcc/jimichu/MATLAB_JOB_STORAGE
                ClusterMatlabRoot: /usr/usc/matlab/default/
                  OperatingSystem: unix

             IndependentSubmitFcn: [1x2 cell]
           CommunicatingSubmitFcn: [1x2 cell]
                   GetJobStateFcn: @getJobStateFcn
                     CancelJobFcn: []
                    CancelTaskFcn: []
                     DeleteJobFcn: @deleteJobFcn
                    DeleteTaskFcn: []
 RequiresMathWorksHostedLicensing: false

    Associated Jobs: 

                   Number Pending: 0
                    Number Queued: 1
                   Number Running: 4
                  Number Finished: 0

>> j=c.batch('demo_spmd','Pool',90,'CaptureDiary',true)    

j = 

 Job

    Properties: 

                   ID: 6
                 Type: pool
             Username: jimichu
                State: queued
       SubmitDateTime: 30-Mar-2018 11:07:37
        StartDateTime: 
     Running Duration: 0 days 0h 0m 0s
      NumWorkersRange: [91 91]
           NumThreads: 1

      AutoAttachFiles: true
  Auto Attached Files: /home/rcf-00/jimichu/matlab/demo_spmd.m
        AttachedFiles: {}
      AdditionalPaths: {}

    Associated Tasks: 

       Number Pending: 91
       Number Running: 0
      Number Finished: 0
    Task ID of Errors: []
  Task ID of Warnings: []

>> j.State

ans =

finished
>> j.diary
--- Start Diary ---

ans =

     1     2     3     4     5


ans =

     1     2     3     4     5


ans =

    11    12    13    14    15
...
Lab  1: 
  this is display from worker: 1
Lab  3: 
  this is display from worker: 3
Lab  4: 
  this is display from worker: 4
...
Lab 76: 
  this is display from worker: 76
Lab 80: 
  this is display from worker: 80
Lab 81: 
  this is display from worker: 81
Lab 90: 
  this is display from worker: 90

>> jj=c.findJob

jj = 

 9x1 Job array:
 
         ID           Type        State              FinishDateTime  Username  Tasks
       -----------------------------------------------------------------------------
    1     1           pool     finished                               jimichu     81
    2     2           pool     finished                               jimichu    129
    3     3           pool     finished                               jimichu     91
    4     4           pool     finished                               jimichu     91
    5     5           pool     finished        30-Mar-2018 11:08:12   jimichu     91
    6     6           pool       queued                               jimichu     91
    7     7           pool     finished                               jimichu    161
    8     8           pool       queued                               jimichu    129
    9     9           pool       queued                               jimichu    129

>> jj(9).State

ans =

running
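
The session above polls j.State by hand. A scripted version of the same workflow could look like the following sketch, which blocks until the job ends and then reads the diary. The option string and pool size repeat the values from the session; the cleanup step at the end is an optional assumption and permanently removes the job's stored data.

```matlab
% Minimal sketch, assuming get_SLURM_cluster.m and demo_spmd.m are on the path.
c = get_SLURM_cluster('-t 4:00:00 --mem-per-cpu=2G');
j = batch(c, 'demo_spmd', 'Pool', 90, 'CaptureDiary', true);

wait(j);                           % block until the job has finished or failed

if strcmp(j.State, 'finished')
    diary(j);                      % print the captured Command Window output
else
    fprintf('job %d ended in state: %s\n', j.ID, j.State);
end

% List the finished jobs that are still kept in the job storage location.
done = findJob(c, 'State', 'finished')

% Optional cleanup: delete(j) removes this job's files from storage for good.
delete(j);
```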
