CUDA Programmer
Location: Waukesha, WI
We are seeking a skilled CUDA Programmer to design, develop, and optimize high-performance applications on NVIDIA GPUs. The role focuses on accelerating compute-intensive workloads, optimizing memory usage, and collaborating with system and application teams to maximize GPU performance.
Key Responsibilities
· Profile and tune GPU applications for performance, memory efficiency, and scalability.
· Work with CPU–GPU parallel programming models and optimize host-device data transfers.
· Leverage the CUDA Toolkit and NVIDIA libraries (cuBLAS, cuDNN, NCCL) as applicable.
· Collaborate with system, compute, or AI/ML teams to integrate GPU-accelerated components.
· Debug GPU kernels and address performance bottlenecks using NVIDIA profiling tools.
· Ensure portability and performance across different NVIDIA GPU architectures.
Required Skills
· Strong experience in CUDA programming and parallel computing concepts.
· In-depth understanding of NVIDIA GPU architecture (threads, warps, SMs, memory hierarchy).
· Proficiency in C/C++ for high-performance computing.
· Experience with CUDA profiling and debugging tools (Nsight Systems, Nsight Compute, or the legacy nvprof).
· Solid understanding of multi-threading, memory optimization, and performance tuning.
Preferred Skills
· Experience with AI/ML, HPC, or graphics workloads on GPUs.
· Familiarity with multi-GPU programming and communication frameworks (NCCL, MPI).
· Exposure to Python bindings (CUDA Python, PyTorch extensions).
· Experience with Linux-based development environments.