Ptolemy Documentation
Introduction
The Ptolemy Cluster consists of the following nodes:
Nodes and Specifications
| Node Name | Node Type | Cores/Node, CPU Type | Memory/Node, Configuration | GPUs/Node |
|---|---|---|---|---|
| ptolemy-login-[1-2] | Login | 128 cores (2x 64-core 2.00GHz AMD EPYC Milan 7713) | 512 GB (16x 32GB 2Rx4 PC4-3200AA-R) | N/A |
| ptolemy-dtn-1 | Data Transfer | 128 cores (2x 64-core 2.00GHz AMD EPYC Milan 7713) | 512 GB (16x 32GB 2Rx4 PC4-3200AA-R) | N/A |
| ptolemy-devel-[1-2] | Development | 128 cores (2x 64-core 2.00GHz AMD EPYC Milan 7713) | 512 GB (16x 32GB 2Rx4 PC4-3200AA-R) | 1x NVIDIA A100 (80GB) mig=7 |
| ptolemy-gpu-[01-04] | Compute (A100 GPU) | 128 cores (2x 64-core 2.00GHz AMD EPYC Milan 7713) | 1 TB (16x 64GB DDR4 Dual Rank 3200MHz) | 8x NVIDIA A100 (80GB) mig=1 |
| ptolemy-gpu-[05-06] | Compute (A100 GPU) | 128 cores (2x 64-core 2.00GHz AMD EPYC Milan 7713) | 1 TB (16x 64GB DDR4 Dual Rank 3200MHz) | 8x NVIDIA A100 (80GB) mig=2 |
| ptolemy-gpu-[07-08] | Compute (A100 GPU) | 128 cores (2x 64-core 2.00GHz AMD EPYC Milan 7713) | 1 TB (16x 64GB DDR4 Dual Rank 3200MHz) | 8x NVIDIA A100 (80GB) mig=7 |
Partitions and Limits
| Partition | TotalNodes | Nodes | MaxNodes (Per Job) | MaxTime | DefMemPerCPU (MB) | Allowed QoS | DeviceName |
|---|---|---|---|---|---|---|---|
| gpu-a100 | 4 | ptolemy-gpu-[01-04] | QoS limited | QoS limited | 7686 | ALL | a100 |
| gpu-a100-mig2 | 2 | ptolemy-gpu-[05-06] | QoS limited | QoS limited | 7686 | ALL | a100_3g.40gb |
| gpu-a100-mig7 * | 2 | ptolemy-gpu-[07-08] | QoS limited | QoS limited | 7686 | ALL | a100_1g.10gb |
| development | 2 | ptolemy-devel-[1-2] | QoS limited | QoS limited | 3827 | ALL | NA |
| service | 1 | ptolemy-dtn-1 | QoS limited | QoS limited | 3827 | ALL | NA |
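For example, a job that needs one full A100 could request the gpu-a100 partition and the matching device name from the table above as a gres. This is only a sketch: the account name is a placeholder, and the exact gres syntax should be confirmed for your site.
#SBATCH --partition=gpu-a100
#SBATCH --account=AccountName
#SBATCH --gres=gpu:a100:1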
QoS's and Limits
| QoS | Priority | MaxNodes | MaxTime | Notes |
|---|---|---|---|---|
| normal | 20 | 1 | 48 Hours | Default QoS, Limits cpu=128 gres/gpu:a100=4 gres/gpu:a100_1g.10gb=14 gres/gpu:a100_3g.40gb=8 mem=1T |
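To inspect the QoS limits directly from Slurm and to request a particular QoS at submission time, the standard commands below can be used (the field selection and runscript name are illustrative):
$ sacctmgr show qos format=name,priority,maxwall,maxtres%60
$ sbatch --qos=normal runscript.sh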
Storage Space
homes - /home/$USER : home directory, quota, backed up, no scrub
work - /work/$CLUSTER : working storage, quota, not backed up, no scrub
scratch - /scratch/$CLUSTER : scratch storage, no quota, not backed up, scrubbed, see /scratch/$CLUSTER/README
reference - /reference : for reference data sets, no quota, backed up, no scrub, by request only, see the README
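Assuming $CLUSTER is set in your login environment (as the path notation above suggests), the overall capacity and usage of the work and scratch filesystems can be checked with df; per-user quota reporting tools may differ by filesystem.
$ df -h /work/$CLUSTER
$ df -h /scratch/$CLUSTER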
Accessing Ptolemy
ssh <UserID>@Ptolemy-login.arc.msstate.edu
ssh <UserID>@Ptolemy-dtn.arc.msstate.edu
OpenSSH or PuTTY implementations of ssh are recommended. When you log in, you will be on a login node. The login node is a shared resource among all users currently logged in to the system. Please do NOT run computationally or memory-intensive tasks on the login node; doing so will negatively impact performance for all other users on the system. See the Slurm section for instructions on how to run such tasks on compute nodes.
File Transfers
MsStateHPC#Ptolemy-dtn
For small amounts of data that need to be transferred to user home directories, the scp command can be used. This command copies files between hosts on a network. It uses ssh for data transfer, uses the same authentication, and provides the same security as ssh. scp will ask for passwords as well as two-factor authentication codes.
To copy a file from a remote host to the local host:
$ scp <username>@<remotehost>:/path/to/file.txt /local/directory/
To copy a file from the local host to a remote host:
$ scp /path/to/file.txt <username>@<remotehost>:/remote/directory/
To copy a directory from a remote host to the local host:
$ scp -r <username>@<remotehost>:/remote/directory /local/directory
To copy a directory from the local host to a remote host:
$ scp -r /local/directory <username>@<remotehost>:/remote/directory
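For larger directories, or transfers that may need to be restarted, rsync over ssh can be used in place of scp. This is a general sketch; adjust the paths and host names to your own transfer:
$ rsync -avP /local/directory/ <username>@<remotehost>:/remote/directory/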
Internet Connectivity
Modules
Ptolemy uses a module hierarchy based on compilers and MPI implementations. Software in the Core tree is built using the default system compilers. Software built against a specific compiler is available only after that compiler module has been loaded, and software built against a specific MPI implementation is available only after that MPI module has been loaded. Information on available modules can be found with the "module avail" and "module spider" commands:
$ module spider quantum-espresso-gpu
------------------------------------------------------------------------------------------------------------------
quantum-espresso-gpu:
------------------------------------------------------------------------------------------------------------------
Versions:
quantum-espresso-gpu/develop
------------------------------------------------------------------------------------------------------------------
For detailed information about a specific "quantum-espresso-gpu" package (including how to load the modules) use the module's full name.
Note that names that have a trailing (E) are extensions provided by other modules.
For example:
$ module spider quantum-espresso-gpu/develop
------------------------------------------------------------------------------------------------------------------
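As a sketch of how the hierarchy behaves in practice (the compiler and MPI module names below are placeholders; use "module avail" to see what is actually installed), loading a compiler exposes the software built with it, and loading an MPI module then exposes the MPI-dependent software:
$ module avail        # shows the Core tree and the available compilers
$ module load gcc     # placeholder compiler module
$ module avail        # now also lists software built with this compiler, including MPI modules
$ module load openmpi # placeholder MPI module
$ module avail        # now also lists software built against this compiler + MPI combination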
Storage Space
Data in this location is not backed up and cannot be restored if deleted!
It is important that users submit and run jobs from their respective /work directories instead of their home directories. The /home filesystem is not designed or configured for high-performance use, nor does it have much space. Home directories will run out of space quickly on parallel jobs and cause those jobs to fail. After useful data is generated from compute jobs, it is recommended that users transfer this data to a more long-term storage location.
/reference/ptolemy is the location for reference datasets. It is populated by HPC2 upon request.
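As an illustrative workflow (the per-user path under /work/$CLUSTER and the directory names are assumptions; adjust them to your own layout), create a project directory under /work, run jobs from there, and copy final results to longer-term storage afterwards:
$ mkdir -p /work/$CLUSTER/$USER/myproject
$ cd /work/$CLUSTER/$USER/myproject
$ sbatch runscript.sh
$ cp -r results /home/$USER/myproject-results/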
Local Scratch Space
TMPDIR=/local/scratch/$SLURM_JOB_USER/$SLURM_JOB_ID
You can use this for any scratch disk space you need. If you plan to compute on an existing large data set (such as a sequence assembly job), it might be beneficial to copy all your input data to this space at the beginning of your job and then do all your computation in $TMPDIR. You must copy any output data you need to keep back to permanent storage before the job ends, since $TMPDIR is erased upon job exit. The following example shows how to copy data in, and then run from $TMPDIR:
#!/bin/bash -l
#SBATCH --job-name="TMPDIR example"
#SBATCH --partition=ptolemy
#SBATCH --account=AccountName
#SBATCH --nodes=1
#SBATCH --ntasks=48
#SBATCH --time=08:00:00

# Always good practice to reset environment when you start
module purge

# start staging data to the job temporary directory in $TMPDIR
MYDIR=`pwd`
/bin/cp -r $MYDIR $TMPDIR/
cd $TMPDIR

# add regular job commands like module load
# and commands to launch scientific software

# copy output data off of local scratch
/bin/cp -r output $MYDIR/output

$TMPDIR is defined as the above directory at the beginning of every job, before the job script is executed. Users must overwrite this definition inside the batch script itself if TMPDIR needs to be set to a different location.
Arbiter
Arbiter limits the CPU and memory that each user may consume on the login nodes. Users who exceed the Normal caps are placed into escalating penalty states, with the reduced caps and timeout periods listed below.
| Status | CPU Cap | Memory Cap | Penalty Timeout |
|---|---|---|---|
| Normal | 4 Cores | 50 GB | N/A |
| Penalty1 | 3 Cores | 40 GB | 30 Minutes |
| Penalty2 | 2 Cores | 25 GB | 1 Hour |
| Penalty3 | 1 Core | 15 GB | 2 Hours |
| Penalty4 | 0.2 Cores | 5 GB | 4 Hours |
Slurm
Slurm has three primary job allocation commands which accept almost identical options:
- sbatch: Submits a job runscript for later execution (batch mode)
- salloc: Creates a job allocation and starts a shell to use it (interactive mode)
- srun: Creates a job allocation and launches a job step (typically an MPI job)
Example salloc usage:
user_name@Ptolemy-login-1 ~$ salloc -A account_name
salloc: Pending job allocation 527990
salloc: job 527990 queued and waiting for resources
salloc: Granted job allocation 527990
salloc: Waiting for resource configuration
salloc: Nodes Ptolemy-gpu-02 are ready for job
user_name@Ptolemy-login-1 ~$ srun hostname
Ptolemy-gpu-02.arc.MsState.Edu
The srun command can be used to launch an interactive shell on an allocated node or set of nodes. Simply specify the --pty option while launching a shell (such as bash) with srun. It is also recommended to set the wallclock limit along with the number of nodes and processors needed for the interactive shell.
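For example, a request that sets these limits explicitly might look like the following (the account name and resource values are placeholders):
$ srun -A account_name --nodes=1 --ntasks=4 --time=01:00:00 --pty bash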
Example interactive shell:
user_name@Ptolemy-login-1 ~$ srun -A account_name --pty --preserve-env bash
srun: job 527987 queued and waiting for resources
srun: job 527987 has been allocated resources
user_name@Ptolemy-gpu-02 ~$ hostname
Ptolemy-gpu-02.arc.MsState.Edu
When running batch jobs, it is necessary to interact with the job queue. It is usually helpful to be able to see information about the system, the queue, the nodes, and your jobs. This can be accomplished with a set of important commands (examples follow the list):
- squeue: Displays information about jobs in the scheduling queue.
- sjstat: Displays a short summary of running jobs and scheduling pool data.
- showuserjobs: Displays a short summary of jobs by user and account, along with a summary of node state.
- showpartitions: Displays a short summary and current state of the available partitions.
- sstat: Displays information about specific jobs.
- sinfo: Reports system status (nodes, queues, etc.).
- sacct: Displays accounting information from the Slurm database.
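For example, two common uses of these commands are listing your own jobs and checking the state of a particular partition:
$ squeue -u $USER
$ sinfo -p gpu-a100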
The default walltime is 15 minutes. Any jobs that do not specify a walltime will be terminated 15 minutes after starting.
The default allocation is 1 node. Any jobs that do not specify the number of nodes will run on one node.
The default number of tasks is 1 core. Any jobs that do not specify the number of tasks will run on only 1 core.
When submitting jobs, all users must specify a valid account that they are associated with.
To see which accounts you are on, along with valid QoS's for that account, use the following command:
$ sacctmgr show associations where user=$USER format=account%20,qos%50
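Putting the defaults and the account requirement together, a minimal batch script might look like the following sketch (the account name, partition, module, and resource values are placeholders to adjust for your own work):
#!/bin/bash -l
#SBATCH --job-name=example
#SBATCH --account=AccountName
#SBATCH --partition=gpu-a100-mig7   # optional; a default partition is applied if omitted
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --time=02:00:00

module purge
# module load <software>
srun ./example_program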
Nodesharing
Slurm allocates all of a node's memory by default, so in order to take advantage of nodesharing, users must specify the memory required per node for their jobs using the --mem option in their runscript or srun command. Specifying a memory limit with the --mem option ensures that user jobs are allocated the amount specified. For example, if a user's job only needs 150 GB of memory per node, the user must request that amount explicitly:
$ srun -n 10 -N 2 --mem=150G ./example_program
If a user requests 10 cores and 50 GB of memory for one job, along with 10 cores and 50 GB of memory for a second job, then both of these jobs may run on the same node. The same principle also applies to jobs owned by two different users.
To prevent other jobs from sharing the remaining cores when using fewer than a full node's cores, users must specify the "--exclusive" option in their runscripts or in their salloc/srun commands:
$ srun -n 10 -N 1 --exclusive ./example_program
The gpu and bigmem partitions will give users exclusive nodes by default.
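The same requests can also be given as directives inside a batch runscript rather than on the command line; a minimal sketch (values are illustrative):
#SBATCH --mem=150G      # request only the memory needed, allowing the rest of the node to be shared
#SBATCH --exclusive     # alternatively, reserve the whole node and disable sharing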
Job Dependencies and Pipelines
sbatch --dependency=<type:job_id[:job_id][,type:job_id[:job_id]]> ...
Dependency types:
| Dependency Type | Description |
|---|---|
| after:jobid[:jobid...] | job can begin after the specified jobs have started |
| afterany:jobid[:jobid...] | job can begin after the specified jobs have terminated |
| afternotok:jobid[:jobid...] | job can begin after the specified jobs have failed |
| afterok:jobid[:jobid...] | job can begin after the specified jobs have run to completion with an exit code of zero |
| singleton | jobs can begin execution after all previously launched jobs with the same name and user have ended |
$ sbatch job1.sh
11254323
$ sbatch --dependency=afterok:11254323 job2.sh
Now when job1 ends with an exit code of zero, job2 will become eligible for scheduling. However, if job1 fails (ends with a non-zero exit code), job2 will not be scheduled but will remain in the queue and needs to be canceled manually. As an alternative, the afterany dependency can be used, and checking for successful execution of the prerequisites can be done in the job script itself.
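When chaining several jobs into a pipeline, the --parsable option makes sbatch print only the job ID, which is convenient for capturing it in a script. The script names below are placeholders:
#!/bin/bash
jobid1=$(sbatch --parsable job1.sh)
jobid2=$(sbatch --parsable --dependency=afterok:$jobid1 job2.sh)
# run a cleanup step once both jobs have terminated, regardless of exit status
sbatch --dependency=afterany:$jobid1:$jobid2 cleanup.sh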
Container Notes
module load apptainer/1.0.2
Apptainer is configured on Ptolemy such that users do not have to define additional environment variables to have access to their working folders in the container. However, users wishing to utilize the "remote build" features will need to unset the APPTAINER_BIND variable.
Many containers are available on Ptolemy, and inquiries about accessing existing containers or adding new containers may be submitted by emailing help@hpc.msstate.edu.
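As a general sketch of pulling and running a containerized application with the standard Apptainer commands (the image name and source URI below are placeholders):
$ module load apptainer/1.0.2
$ apptainer pull myimage.sif docker://ubuntu:22.04
$ apptainer exec myimage.sif cat /etc/os-release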
Python/Miniconda Notes
Create an Environment in Python:
python3 -m venv $MYDIR/python-env
source $MYDIR/python-env/bin/activate
pip3 install --upgrade pip
pip3 install matplotlib scikit-learn torch keras tensorflow
Accessing the Environment:
source /reference/ptolemy/class/me8213/python-env/bin/activate
Creating an Environment in Conda:
module purge
module load miniconda3/24.3.0
export CONDA_PKGS_DIRS=/tmp/$USER
mkdir $CONDA_PKGS_DIRS
conda create --prefix $MYDIR/conda-environments/env-name numpy pandas matplotlib seaborn scikit-learn ipykernel
source activate $MYDIR/conda-environments/env-name
python -m ipykernel install --name env-name --display-name "env-name" --user
conda deactivate
Accessing the Environment:
source /reference/ptolemy/class/me8213/python-env/bin/activate
Do not run 'conda init'. It changes your shell behavior in a way that
is not desirable.
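To use such an environment inside a batch job, activate it in the runscript before launching Python. This is a sketch: the environment path, account name, and script name are placeholders.
#!/bin/bash -l
#SBATCH --account=AccountName
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=01:00:00

module purge
source $MYDIR/python-env/bin/activate   # path to the environment created above
python3 my_script.py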
Open OnDemand
OnDemand is a user-friendly front-end interface for access to the Ptolemy Cluster resources. Review the Ptolemy Open OnDemand documentation for more information.