The Senior Systems Reliability Engineer is responsible for ensuring the stability, scalability, and performance of mission-critical systems that support Disney's innovative entertainment experiences.
Requirements
- Administer Windows and Linux servers supporting automation and industrial applications
- Collaborate closely with engineering and project teams to implement CI pipeline automation
- Develop tools or scripts to automate documentation generation
- Define, measure, and monitor service-level indicators/objectives (SLIs/SLOs) and manage error budgets for critical services
- Manage Kubernetes clusters and Helm chart deployments for automation and monitoring applications
- Identify and automate manual operational processes ('toil') within project teams to improve reliability
- Ensure high availability, scalability, and disaster recovery readiness for operational technology (OT) systems
Benefits
- Bonus
- Long-term incentive units
- Full range of medical, financial, and/or other benefits