This is an incredible opportunity to be part of a company that has been at the forefront of AI and high-performance data storage innovation for over two decades.
Requirements
- Bachelor’s or Master’s degree in Computer Science, Data Science, Machine Learning, or a related field.
- 4+ years of experience in machine learning operations (MLOps) or related roles.
- Proven expertise in building and scaling AI/ML pipelines.
- Strong understanding of machine learning frameworks and libraries (TensorFlow, PyTorch, NVIDIA NeMo, vLLM, TensorRT-LLM).
- Experience deploying open-source vector databases at scale.
- Solid understanding of cloud infrastructure (AWS, GCP, Azure) and distributed computing.
- Proficiency with containerization and orchestration tools (Docker, Kubernetes) and infrastructure as code.
- Excellent problem-solving and troubleshooting skills, with attention to detail and performance optimization.
- Strong communication and collaboration skills.
Benefits
- Participation in a team on-call rotation that provides out-of-hours coverage seven days a week, including after-hours and weekend support when required.