PsiQuantum is a company on a mission to build the first real, useful quantum computers. They are looking for a Site Reliability Engineer to join their OS/Platform team and keep their services healthy, observable, and fast.
Requirements
- Bachelor’s degree or higher in Computer Science, Engineering, or another related technical field.
- 5+ years in an SRE, DevOps, or Production Engineering role supporting distributed systems in production.
- Hands‐on expertise with observability tools: Grafana, Prometheus, Loki, Tempo (or equivalent).
- Proven track record of designing dashboards and alerts around golden signals and the USE (Utilization, Saturation, Errors) and RED (Rate, Errors, Duration) methodologies.
- Solid scripting/automation skills in Python and Bash; familiarity with GitLab CI pipelines.
- Operational experience with Kubernetes and containerized workloads.
- Working knowledge of AWS services, networking fundamentals, and load balancing.
- Experience running incident response and writing actionable post‐mortems.
- Familiarity with Infrastructure as Code (Terraform, Ansible) and configuration management.
- Exposure to regulated environments and multi‐region architectures is a plus.
- Strong communication and collaboration skills; comfortable acting as a generalist across infrastructure, application, and data layers.
Benefits
- Full‐time roles are eligible for equity and benefits.