We are seeking an experienced Principal Site Reliability Engineer to join our team. In this role, you will design, implement, and maintain our critical cloud infrastructure, ensuring scalability, reliability, and performance at enterprise scale.
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
- 7+ years of hands-on experience as a Cloud Engineer, DevOps Engineer, or SRE with a strong focus on infrastructure.
- Proven experience deploying and managing Kubernetes clusters in production environments at scale.
- Extensive experience with Linux system administration, configuration, and automation.
- Solid understanding of cloud computing principles and experience with at least one major cloud provider (AWS, GCP, Azure).
- Solid understanding of SRE principles and practices, including common pitfalls.
- Deep knowledge of TCP/IP networking, including routing, firewalls, and load balancing.
- Expertise in DNS configuration and management.
- Demonstrated experience with monitoring, logging, and alerting tools and best practices.
- Proficiency in scripting and programming languages (e.g., Python, Bash, Go).
- Experience with Infrastructure as Code (IaC) tools (e.g., Terraform, CloudFormation, Pulumi).
Benefits
- Company Stock Options
- Comprehensive Health Benefits
- Pension Scheme with Employer Matching
- Life Insurance and Income Protection
- Supplementary Maternity & Paternity Pay and Caregiver’s Leave
- Employee Assistance Program
- Generous Vacation Policy
- Home Office Stipend
- Four Company Wellness Days annually
- Continuous learning opportunities via LinkedIn Learning, Workday Learning, and a dedicated Career Growth Portal