-
Notifications
You must be signed in to change notification settings - Fork 1
haught/matlab-slurm
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
Copyright 2010-2013 The MathWorks, Inc. This folder contains a number of files to allow Parallel Computing Toolbox to be used with SLURM via the generic cluster interface. The files in this folder assume that the client and cluster share a file system and that the client is able to submit directly to the cluster using the sbatch command. Note that all the files in this directory will work only for clusters that are running on UNIX. Instructions for Use ===================== On the Client Host ------------------ 1. The files in $MATLABROOT/toolbox/distcomp/examples/integration/slurm/shared must be present on the MATLAB path. Copy them to $MATLABROOT/toolbox/local or modify the MATLAB path from within MATLAB. 2. Read the documentation for using the generic cluster interface with the Parallel Computing Toolbox and familiarize yourself with the different properties that can be set for a generic cluster. In the MATLAB Client -------------------- 1. Create a generic cluster object for your cluster. For independent jobs, you must use independentSubmitFcn as your submit function. For communicating jobs, you must use communicatingSubmitFcn as your submit function. Example: % Use a folder that both the client and cluster can access % as the JobStorageLocation. If your cluster and client use % different operating systems, you should specify JobStorageLocation % to be a structure. Refer to the documentation on % generic cluster for more information. cluster = parallel.cluster.Generic('JobStorageLocation', '/home/JOB_STORAGE_LOCATION'); set(cluster, 'HasSharedFilesystem', true); set(cluster, 'ClusterMatlabRoot', '/apps/matlab'); set(cluster, 'OperatingSystem', 'unix'); set(cluster, 'IndependentSubmitFcn', @independentSubmitFcn); % If you want to run communicating jobs (including parallel pools), you must specify a CommunicatingSubmitFcn set(cluster, 'CommunicatingSubmitFcn', @communicatingSubmitFcn); set(cluster, 'GetJobStateFcn', @getJobStateFcn); set(cluster, 'DeleteJobFcn', @deleteJobFcn); 2. Create a job and some tasks, submit the job, and wait for it to finish before getting the results. Do the same for communicating jobs if so desired. As an alternative to these steps, create a profile that defines the appropriate properties and run profile validation to verify that the profile works correctly. Description of Files ==================== For more detail about these files, please refer to the help and comments contained in the files themselves. MATLAB Functions Required for generic cluster ---------------------------------------------- independentSubmitFcn.m Submit function for independent jobs. Use this as the IndependentSubmitFcn for your generic cluster object. communicatingSubmitFcn.m Submit function for communicating jobs. Use this as the CommunicatingSubmitFcn for your generic cluster object. deleteJobFcn.m Delete a job on the cluster. Use this as the DeleteJobFcn for your generic cluster object. getJobStateFcn.m Get the job state from the cluster. Use this as the GetJobStateFcn for your generic cluster object. Other MATLAB Functions ----------------------- extractJobId.m Get the cluster's job ID from the submission output. getSubmitString.m Get the submission string for the cluster. Executable Scripts ------------------- independentJobWrapper.sh Script used by the cluster to launch the MATLAB worker processes for independent jobs. communicatingJobWrapper.sh Script used by the cluster to launch the MATLAB worker processes for communicating jobs. Optional Customizations ======================== The code customizations listed in this section are clearly marked in the relevant files. independentSubmitFcn.m ---------------------- independentSubmitFcn provides the ability to supply additional submit arguments to the sbatch command. You may wish to modify the additionalSubmitArgs variable to include additional submit arguments that are appropriate to your cluster. For more information, refer to the sbatch documentation provided with your cluster. communicatingSubmitFcn.m ------------------------ communicatingSubmitFcn calculates the number of nodes to request from the cluster from the NumWorkersRange property of the communicating job. You may wish to customize the number of nodes requested to suit your cluster's requirements. communicatingSubmitFcn provides the ability to supply additional submit arguments to the sbatch command. You may wish to modify the additionalSubmitArgs variable to include additional submit arguments that are appropriate to your cluster. For more information, refer to the sbatch documentation provided with your cluster. communicatingJobWrapper.sh -------------------------- communicatingJobWrapper.sh uses the StrictHostKeyChecking=no and UserKnownHostsFile=/dev/null options for ssh. You may wish to customize the ssh options to suit your cluster's requirements. For more information, refer to your operating system the ssh documentation.
About
Modify Matlab R2010+ LSF distcomp shared scripts for SLURM scheduler
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published