CUDA Expert

2 weeks ago


Ramat Gan, Tel Aviv, Israel | AAI Technologies | Full time | ₪120,000 - ₪180,000 per year

Lead the design and optimization of state-of-the-art GPU kernels that power advanced deep learning applications and large-scale workloads.

What you'll do

  • Drive cutting-edge GPU kernel development and optimization across diverse, high-impact use cases
  • Work at the intersection of hardware and algorithms, accelerating complex workloads and shaping the future of high-performance AI computation
  • Design, implement, and optimize GPU kernels for diverse deep learning workloads
  • Collaborate with research and engineering teams to deliver state-of-the-art performance on modern GPU architectures
  • Profile, debug, and fine-tune large-scale training and inference pipelines
  • Stay ahead of the curve on GPU trends and emerging CUDA technologies

What you bring

  • Proven experience in GPU programming and parallel computing
  • Hands-on experience in GPU kernel development for deep learning workloads (within the past 3 years)
  • Strong understanding of modern GPU architectures, particularly NVIDIA Blackwell (B200) or Hopper (H100)
  • Familiarity with CUTLASS (CUDA framework)
  • Proficiency with GPU profiling, debugging, and performance optimization tools
  • Excellent analytical and problem-solving abilities

Bonus points for

  • Experience optimizing large-scale workloads
  • Familiarity with Torch
  • Familiarity with modern CUDA frameworks (CuTe, CUB)
  • Background in AI model training and inference optimization