Thoughtworks Singapore is seeking a Lead AI Infrastructure Engineer to design, maintain, and scale high-performance AI infrastructure. This role involves partnering with ML engineers and clients to deliver optimized solutions for AI workloads, focusing on throughput, latency, availability, and compliance. The ideal candidate will have deep expertise in GPU-based inference, DevOps practices, and platform engineering.
Requirements
- Expertise in GPU-based infrastructure for AI workloads
- Strong knowledge of orchestration frameworks (Kubernetes, Ray, Slurm)
- Experience with inference-serving frameworks (vLLM, NVIDIA Triton, DeepSpeed)
- Proficiency in infrastructure automation (Terraform, Helm, CI/CD pipelines)
- Experience building resilient, high-throughput, low-latency systems for AI inference
- Solid background in observability and monitoring (Prometheus, Grafana, OpenTelemetry)
- Familiarity with security, compliance, and governance concerns in AI infrastructure
- Understanding of DevOps, cloud-native architectures, and Infrastructure as Code
- Exposure to multi-cloud and hybrid deployments (AWS, GCP, Azure, sovereign/private cloud)
- Experience with benchmarking and cost/performance tuning for AI systems
- Background in MLOps, or experience collaborating with ML teams on large-scale AI production systems