Direct SSH (Terminal / Batch Submission)¶
The browser portal is the supported path for actually getting work done. Direct SSH exists for terminal-driven batch operations only: writing and submitting sbatch scripts, checking the queue (`squeue`, `sacct`), and looking at log files on lab storage. All compute work must go through Slurm (`sbatch` for batch jobs, `srun --pty` for an interactive allocation). A bare `python train.py` or `matlab` in your SSH shell bypasses the scheduler and will get killed without warning.
For VS Code / Cursor: use the OOD tile, not Remote-SSH¶
Do not point VS Code Remote-SSH at hpc.eng.umd.edu. That host
is the Slurm controller and OOD portal; Remote-SSH would start a
Node.js server, language servers, file watchers, and indexing
processes on it — degrading the scheduler for the entire cluster.
The supported path is the VS Code (code-server) OOD app:
- Open https://hpc.eng.umd.edu → Interactive Apps → VS Code (code-server).
- Pick the lab profile (your lab's node), resources, and optionally a `workdir` to open. Submit.
- The session card gives you a Connect to VS Code link: full VS Code in your browser, running on your lab's GPU node under a Slurm allocation, with `/opt/sw` modules and your lab share available.
Prerequisites¶
- A UMD Directory ID and AD password.
- Membership in the lab's AD group (your PI / lab admin manages this).
- AD password or Kerberos ticket only. SSH keys are not accepted on these hosts. Don't paste your public key into `~/.ssh/authorized_keys` and expect it to work.
The one SSH target you should use¶
`hpc.eng.umd.edu` is the Slurm submit host; use it only for the small,
cheap operations a submit host is meant for: writing/editing
sbatch scripts (a few minutes of vim), submitting them, and
checking on them with squeue / sacct. To do real work on a GPU,
use srun --pty bash -l from this host — Slurm will drop you into
an interactive shell inside an allocation, on hardware that can
actually run your job.
Things to keep off the submit host:
- Anything Node-shaped or IDE-shaped (VS Code Remote-SSH, language servers, `npm install`, big builds).
- Sustained compilation, training, simulation, data crunching.
- Long-running `tmux` sessions doing real work.
The submit host has no GPUs, none of your lab's compute capacity, and is shared by every lab. It's a thin shell for talking to Slurm, not a workstation.
SSH — password auth¶
You'll be prompted for your AD password. That's it. (SSH keys are not accepted — see "Prerequisites" above.)
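As a sketch, connecting looks like this (assuming your Directory ID is your SSH username, matching the Kerberos principal used later on this page):

```shell
# Connect to the submit host; replace <directoryID> with your UMD Directory ID
ssh <directoryID>@hpc.eng.umd.edu
# You'll be prompted for your AD password
```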
Submitting a batch job from hpc.eng.umd.edu¶
Once connected to the submit host:
```shell
cd /mnt/lab-research/jobs                  # cifscreds first if needed (below)
vi run.sh                                  # write your sbatch script
sbatch run.sh                              # → Submitted batch job 123
squeue -u $USER                            # check status
sacct -j 123 --format=JobID,State,Elapsed  # after it finishes
```
The submit host doesn't run your job — Slurm dispatches it to a compute node. You can log out; the job keeps running. See slurm-cli.md for sbatch directives and patterns.
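As a sketch, a minimal `run.sh` for the workflow above might look like the following. The partition name, resource values, and the `python train.py` step are placeholder assumptions; see slurm-cli.md for the directives your lab actually uses:

```shell
#!/bin/bash
#SBATCH --partition=<your-lab>    # your lab's partition (placeholder)
#SBATCH --time=01:00:00           # wall-clock limit
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --gres=gpu:1              # request one GPU
#SBATCH --output=%x-%j.out        # log file: <jobname>-<jobid>.out

# Everything below runs on the allocated compute node, not the submit host
python train.py
```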
Interactive shell on a GPU node¶
From the submit host, ask Slurm for an interactive allocation:
```shell
srun --partition=<your-lab> --time=01:00:00 \
     --cpus-per-task=4 --mem=16G --gres=gpu:1 \
     --pty bash -l
```
Slurm picks a node in your lab's set, allocates the resources to you, and drops you into a shell on it. Exit to release the allocation.
SSH with Kerberos (no password prompt, optional)¶
If you SSH frequently and prefer not to type a password each time, get
a Kerberos ticket against AD.UMD.EDU:
macOS / Linux¶
```shell
# Get a Kerberos ticket
kinit <directoryID>@AD.UMD.EDU

# Verify
klist

# Configure SSH to use GSSAPI
cat >> ~/.ssh/config <<'EOF'
Host *.ad.umd.edu *.eng.umd.edu
    GSSAPIAuthentication yes
    GSSAPIDelegateCredentials yes
EOF
```
`kinit` is built in on macOS; on Ubuntu install it with `apt install krb5-user`.
Windows¶
PowerShell with OpenSSH for Windows works for password auth out of the box. For Kerberos, MIT Kerberos for Windows is the closest parallel; WSL2 Ubuntu is the easier path — do the Linux setup inside WSL.
Mount your lab storage¶
When you SSH to the submit host, lab shares are mounted there but your user's credentials aren't in the kernel keyring yet. Once per SSH session:
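A sketch of that step, assuming `cifscreds` from cifs-utils is installed on the submit host; the NAS hostname is a placeholder (how to find yours is described below):

```shell
# Cache your AD credentials in the kernel keyring for the lab NAS
# (<lab-nas-hostname> is a placeholder)
cifscreds add <lab-nas-hostname>
# Prompts for your AD password; the lab shares become readable afterward
```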
To find your lab's NAS hostname, run mount | grep cifs inside an
OOD desktop session (the share's UNC path is the server), or ask
eit-help@umd.edu.
If your lab has multiple NAS servers, repeat cifscreds add for each.
Clean up when done:
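One way to do that, assuming the same `cifscreds` tool: `clearall` drops every cached CIFS credential in your session keyring, while `clear <host>` removes just one.

```shell
# Drop all cached CIFS credentials from the kernel keyring
cifscreds clearall
```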
Running Slurm from SSH¶
`sbatch`, `squeue`, and `sacct` are all available from SSH; that's what this SSH path is for. See slurm-cli.md for how to submit jobs.
When SSH is the right tool¶
- Submitting and managing batch jobs from your laptop's terminal without launching a desktop.
- Quick log inspection on lab storage after a job finishes.
- Editing small sbatch scripts in vim/nano.
Prefer the portal for everything else — graphical apps, 3D viz, JupyterLab notebooks, VS Code, interactive GPU work, new users. The portal handles storage credentials, GPU rendering, lab-specific software menus, and Slurm allocation automatically.
If you specifically want VS Code or Cursor, use the VS Code
(code-server) OOD app — full VS Code in your browser, running on
your lab's GPU node through Slurm. See the warning at the top of
this page for why Remote-SSH against hpc.eng.umd.edu is harmful.
Limits¶
- Your home on the submit host is separate from the home directories inside OOD sessions: files in `~` over SSH are not the same files you see in `~` inside a desktop session. Keep work on lab storage and use it from either path.
- If you leave a long-running process via `srun --pty bash -l`, use `tmux` inside the allocation to protect it from your network dropping.
- The submit host is not where your job actually runs. Anything that would peg a CPU or use real RAM belongs in `sbatch`/`srun`, not in your SSH shell on `hpc.eng.umd.edu`. This includes IDE language servers, watch processes, big builds, and conda solves.
Troubleshooting¶
| Symptom | Cause / fix |
|---|---|
| "Permission denied (publickey,gssapi-keyex,password)" | Wrong password, or your AD account isn't in the lab's AD group. Confirm your password works for the OOD portal first; then ask your PI / lab admin about group membership. |
| "Permission denied" after pasting an SSH key | SSH keys aren't accepted on these hosts. Use AD password or Kerberos. |
| "Permission denied (gssapi-keyex, ...)" only | Kerberos ticket expired; kinit <user>@AD.UMD.EDU again. (Password auth still works.) |
| "Host key verification failed" | First time connecting, or the node was reinstalled. Remove old line from ~/.ssh/known_hosts and retry. |
| Lab shares are empty after SSH | You didn't run cifscreds add yet, or AD password was wrong. |
| `ssh` hangs on connect | Probably a VPN / network issue. Confirm you're on a UMD network or the UMD VPN. |