Python: Local vs. Cluster

Understanding how Python workflows change when moving from your laptop to a cluster.


Running Python Locally (Interactive)

On your laptop, you might work like this in Jupyter or VSCode:

# You run this cell and see output immediately
import numpy as np
data = np.random.rand(1000, 1000)
result = np.mean(data)
print(f"Mean: {result}")
# Output appears right away: Mean: 0.4998234...

Running Python on the Cluster (Batch)

Instead, you create files and submit them as a job.

Common Mistake for New Users
When you SSH into the cluster and see a command prompt, you might think: “Great! I’m on the cluster, let me just run python my_script.py!”
❌ This is wrong! You’re on a shared login node. Running code directly impacts all other users.
✅ The right way: Create your script, create a submission file, and submit it as a job.

Step 1: Create your Python script

File: my_analysis.py

#!/usr/bin/env python3
import numpy as np

# Your computation here
data = np.random.rand(1000, 1000)
result = np.mean(data)

# Save results to file instead of printing to screen
with open('results.txt', 'w') as f:
    f.write(f"Mean: {result}\n")

print("Analysis complete!")

Step 2: Create a job submission script

For Zest (Slurm):

File: submit_job.sh

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --time=00:10:00
#SBATCH --mail-type=ALL
#SBATCH --mail-user=netid@syr.edu

# Load Python environment
module load anaconda3

# Run your script
python my_analysis.py

For OrangeGrid (HTCondor):

File: job.sub

executable = /usr/bin/python3
arguments = my_analysis.py
output = job.$(cluster).$(process).out
error = job.$(cluster).$(process).err
log = job.$(cluster).log
queue 1

OrangeGrid Note: Your home directory is mounted on compute nodes, so no file transfer is needed!

Step 3: Submit the job

On Zest:

sbatch submit_job.sh

On OrangeGrid:

condor_submit job.sub

Step 4: Check status and retrieve results

On Zest:

squeue                 # Check job status
cat results.txt        # View your results

On OrangeGrid:

condor_q               # Check job status
cat results.txt        # View your results

Key Difference: Instead of seeing output immediately in a notebook, you submit the job, it runs when resources are available, and you retrieve results from output files later.