Simple CUDA development setup for Lambda Labs instances. No Docker required: Lambda instances come with CUDA pre-installed.
# Clone to persistent storage
cd /home/ubuntu/userspace
git clone https://github.com/5arast1/lambda-dev-env.git
cd lambda-dev-env
# Setup (one time)
chmod +x *.sh
./setup.sh
# Open new terminal or reload bashrc
source ~/.bashrc
# Validate
./validate.sh
cd /home/ubuntu/userspace/lambda-dev-env/workspace
# Write your CUDA code
vim kernel.cu
# Compile (GH200/H100 = sm_90, A100 = sm_80)
nvcc -O3 -arch=sm_90 -o kernel kernel.cu
# Run
./kernel
# Profile
ncu --set full ./kernel
lambda-dev-env/
├── setup.sh # First-time setup
├── validate.sh # Test environment
├── workspace/ # Your code (persisted)
└── scripts/ # Examples
    └── vector_add.cu
# Auto-detect GPU architecture
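# (head -n 1 keeps only the first GPU, so multi-GPU instances still yield a single value)
ARCH=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader | head -n 1 | tr -d ".")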
nvcc -O3 -arch=sm_${ARCH} -o out in.cu
# Common architectures
nvcc -arch=sm_90 ... # GH200, H100 (Hopper)
nvcc -arch=sm_89 ... # L40 (Ada)
nvcc -arch=sm_86 ... # A10 (Ampere)
nvcc -arch=sm_80 ... # A100 (Ampere)
# With debug symbols
nvcc -G -g -arch=sm_90 -o out_debug in.cu
# Generate PTX
nvcc -ptx -arch=sm_90 in.cu
# Nsight Compute (kernel profiling)
ncu ./kernel
ncu --set full ./kernel
ncu -o report --set full ./kernel # Save report
# Nsight Systems (timeline)
nsys profile ./kernel
nsys profile -o timeline ./kernel
- Persistent storage: /home/ubuntu/userspace survives instance stops
- CUDA version: 12.8 (pre-installed)
- GPU: GH200 480GB (sm_90, 132 SMs, 94.5GB HBM3)
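The CUDA version in the notes above can be checked against whatever is actually on the instance. The following is a minimal sketch, not part of the repository's scripts: `parse_cuda_release` is a hypothetical helper, and it assumes `nvcc --version` prints a line containing `release MAJOR.MINOR`.

```shell
#!/usr/bin/env bash
# Sketch: compare the installed CUDA toolkit release with the one noted in this README.

# Extract "MAJOR.MINOR" from `nvcc --version` style text read on stdin.
# (Hypothetical helper; assumes the text contains "release X.Y".)
parse_cuda_release() {
    sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

expected="12.8"   # version listed in the notes
if command -v nvcc >/dev/null 2>&1; then
    found=$(nvcc --version | parse_cuda_release)
    if [ "$found" = "$expected" ]; then
        echo "CUDA $found matches"
    else
        echo "CUDA ${found:-unknown} found, README expects $expected"
    fi
else
    echo "nvcc not on PATH; try: source ~/.bashrc"
fi
```

A mismatch is not necessarily a problem; it only signals that the notes may be out of date for the image in use.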
Lambda's NFS-backed persistent storage doesn't support Docker's overlayfs operations. Since Lambda already provides a complete CUDA environment, Docker adds complexity without benefit. Use Docker only when you need reproducibility across different machines.
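To see whether a given directory is affected, its filesystem type can be inspected. This is a rough sketch assuming GNU coreutils `stat`; `fs_type` and `is_nfs_type` are hypothetical helper names, and the default path is the persistent storage directory mentioned in this README.

```shell
#!/usr/bin/env bash
# Sketch: report whether a directory lives on an NFS mount.
# Assumes GNU coreutils `stat`; helper names are illustrative only.

# Print the filesystem type name of the given path (e.g. "nfs", "ext2/ext3").
fs_type() {
    stat -f -c %T "$1"
}

# Succeed if the given filesystem type name looks like NFS.
is_nfs_type() {
    case "$1" in
        nfs*) return 0 ;;
        *)    return 1 ;;
    esac
}

dir="${1:-/home/ubuntu/userspace}"
if [ -d "$dir" ]; then
    t=$(fs_type "$dir")
    if is_nfs_type "$t"; then
        echo "$dir is on $t: overlayfs-backed Docker storage is not expected to work here"
    else
        echo "$dir is on $t"
    fi
else
    echo "$dir not found; pass a directory as the first argument"
fi
```

Saved as a script, it can also be pointed at any other directory by passing that path as the first argument.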
cd workspace
cp ../scripts/vector_add.cu .
nvcc -O3 -arch=sm_90 -o vector_add vector_add.cu
./vector_add
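The quick test hard-codes `-arch=sm_90`. On a different GPU type, the architecture can be derived from the compute capability instead. A minimal sketch follows, assuming `nvidia-smi` supports the `compute_cap` query used earlier in this README; `cap_to_arch` and `detect_cap` are hypothetical helpers, and the `9.0` fallback is an assumption based on the GH200 listed in the notes.

```shell
#!/usr/bin/env bash
# Sketch: build the nvcc command line with an -arch value matching the first GPU.

# Convert a compute capability such as "9.0" into "sm_90".
# Only the first line is used, so multi-GPU output is handled.
cap_to_arch() {
    local cap
    cap=$(printf '%s\n' "$1" | head -n 1 | tr -d '. ')
    printf 'sm_%s\n' "$cap"
}

# Print the first GPU's compute capability; print nothing if nvidia-smi is unavailable.
detect_cap() {
    command -v nvidia-smi >/dev/null 2>&1 || return 0
    nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n 1
}

cap=$(detect_cap)
arch=$(cap_to_arch "${cap:-9.0}")   # assumed fallback: Hopper (sm_90)
# Print the command instead of running it, so the sketch is safe without nvcc.
echo "nvcc -O3 -arch=${arch} -o vector_add vector_add.cu"
```

The script only prints the compile command; copy the output into the terminal, or replace the `echo` with the command itself once the detected value looks right.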