CUDA Expert
2 weeks ago
Ramat Gan, Tel Aviv, Israel
AAI Technologies
Full time
₪120,000 - ₪180,000 per year
Lead the design and optimization of state-of-the-art GPU kernels powering advanced deep learning applications and large-scale workloads.
What you'll do
- You will drive cutting-edge GPU kernel development and optimization across diverse, high-impact use cases.
- You'll work at the intersection of hardware and algorithms – accelerating complex workloads and shaping the future of high-performance AI computation.
- Design, implement, and optimize GPU kernels for diverse deep learning workloads
- Collaborate with research and engineering teams to deliver state-of-the-art performance on modern GPU architectures
- Profile, debug, and fine-tune large-scale training and inference pipelines
- Stay ahead of the curve on GPU trends and emerging CUDA technologies
What you bring
- Proven experience in GPU programming and parallel computing
- Hands-on experience in GPU kernel development for deep learning workloads (within the past 3 years)
- Strong understanding of modern GPU architectures, particularly NVIDIA Blackwell (B200) or Hopper (H100)
- Familiarity with CUTLASS (CUDA framework)
- Proficiency with GPU profiling, debugging, and performance optimization tools
- Excellent analytical and problem-solving abilities
Bonus points for
- Experience optimizing large-scale workloads
- Familiarity with Torch
- Familiarity with modern CUDA frameworks (CuTe, CUB)
- Background in AI model training and inference optimization