Senior ML Engineer, AI Infrastructure
2 weeks ago
The Senior Machine Learning Engineer – GenAI is responsible for designing, implementing, and operating large-scale systems and tools for AI model benchmarking, optimization, and validation. Unlike a traditional ML Engineer focused primarily on model training, this role centers on building the infrastructure, automation, and services that enable systematic evaluation and performance tuning of LLMs at scale.
This position combines deep understanding of model serving frameworks, GPU optimization, and benchmarking methodologies with strong software engineering skills to deliver reliable, reproducible, and production-grade evaluation pipelines. The engineer will design and maintain validation-as-a-service platforms that allow internal and external stakeholders to assess models across latency, throughput, accuracy, and cost dimensions, integrating seamlessly with Red Hat's AI ecosystem and industry-standard GenAI tooling.
A core aspect of this role is creating a robust, extensible benchmarking and validation framework capable of running across diverse inference engines, hardware configurations, and deployment environments, while providing actionable insights for model selection, optimization, and integration.
What You Will Do
- Benchmarking Platform Development: Design and implement scalable benchmarking pipelines for LLM performance measurement (latency, throughput, accuracy, cost) across multiple serving backends and hardware types.
- Optimization Tooling: Build utilities and automation to profile, debug, and optimize inference performance (GPU utilization, memory footprint, CUDA kernels, parallelism strategies).
- Validation-as-a-Service: Develop APIs and self-service platforms for model evaluation, enabling teams to run standardized benchmarks on demand.
- Serving Integration: Integrate and operate high-performance serving frameworks (vLLM, TGI, LMDeploy, Triton) with cloud-native deployment patterns.
- Dataset & Scenario Management: Create reproducible workflows for dataset preparation, augmentation, and scenario-based testing to ensure robust evaluation coverage.
- Observability & Diagnostics: Implement real-time monitoring, logging, and metrics dashboards (Prometheus, Grafana) for benchmark and inference performance.
- Cloud-Native Orchestration: Deploy and manage benchmarking workloads on Kubernetes (Helm, Argo CD, Argo Workflows) across AWS/GCP GPU clusters.
- Integration with GenAI Tooling: Leverage Hugging Face Hub, OpenAI SDK, LangChain, LlamaIndex, and internal frameworks for streamlined evaluation workflows.
- Performance Engineering: Identify bottlenecks, apply targeted optimizations, and document best practices for inference scalability.
- Ecosystem Leadership: Track emerging frameworks, benchmarks, and optimization techniques to continuously improve the evaluation platform.
What You Will Bring
- Advanced Python for backend development, data processing, and ML/GenAI pipelines.
- Kubernetes (Deployments, Services, Ingress) and Helm for large-scale distributed training and inference workloads.
- LLM training, fine-tuning, and optimization (PyTorch, DeepSpeed, HF Transformers, LoRA/PEFT).
- GPU optimization expertise: CUDA, mixed precision, tensor/sequence parallelism, memory management, and throughput tuning.
- High-performance model serving with vLLM, TGI, LMDeploy, Triton, and API-based serving (OpenAI, Mistral, Anthropic).
- Benchmarking and evaluation pipelines: dataset preparation, accuracy/latency/throughput measurement, and cost–performance tradeoffs.
- Multi-model, multi-engine comparative testing for optimal deployment decisions.
- Hugging Face Hub for model/dataset management, including private hosting and pipeline integration.
- GenAI development tools: OpenAI SDK, LangChain, LlamaIndex, Cursor, Copilot.
- Argo CD & Argo Workflows for reproducible, automated ML pipelines.
- CI/CD (GitHub Actions, Jenkins) for ML lifecycle automation.
- Cloud (AWS/GCP) for provisioning, running, and optimizing GPU workloads (A100, H100, etc.).
- Monitoring & observability (Prometheus, Grafana) and database tooling (PostgreSQL, SQLAlchemy).
Nice To Have
- Distributed training across multi-node, multi-GPU clusters.
- Advanced model evaluation: bias/fairness, robustness, and domain-specific benchmarks.
- Experience with OpenShift/RHOAI for enterprise AI deployments.
- Benchmarking frameworks: GuideLLM, HELM, Eval Harness.
- Security scanning for artifacts/containers (Trivy, Grype).
- Tradeoff-analysis tooling for model selection and deployment.
About Red Hat
Red Hat is the world's leading provider of enterprise open source software solutions, using a community-powered approach to deliver high-performing Linux, cloud, container, and Kubernetes technologies. Spread across 40+ countries, our associates work flexibly across work environments, from in-office, to office-flex, to fully remote, depending on the requirements of their role. Red Hatters are encouraged to bring their best ideas, no matter their title or tenure. We're a leader in open source because of our open and inclusive environment. We hire creative, passionate people ready to contribute their ideas, help solve complex problems, and make an impact.
Inclusion at Red Hat
Red Hat's culture is built on the open source principles of transparency, collaboration, and inclusion, where the best ideas can come from anywhere and anyone. When this is realized, it empowers people from different backgrounds, perspectives, and experiences to come together to share ideas, challenge the status quo, and drive innovation. Our aspiration is that everyone experiences this culture with equal opportunity and access, and that all voices are not only heard but also celebrated. We hope you will join our celebration, and we welcome and encourage applicants from all the beautiful dimensions that compose our global village.
Equal Opportunity Policy (EEO)
Red Hat is proud to be an equal opportunity workplace and an affirmative action employer. We review applications for employment without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, ancestry, citizenship, age, veteran status, genetic information, physical or mental disability, medical condition, marital status, or any other basis prohibited by law.
Red Hat does not seek or accept unsolicited resumes or CVs from recruitment agencies. We are not responsible for, and will not pay, any fees, commissions, or any other payment related to unsolicited resumes or CVs except as required in a written contract between Red Hat and the recruitment agency or party requesting payment of a fee.
Red Hat supports individuals with disabilities and provides reasonable accommodations to job applicants. If you need assistance completing our online job application, email application- General inquiries, such as those regarding the status of a job application, will not receive a reply.