We are seeking a Director of Site Reliability Engineering to design and implement scalable, reliable, and efficient systems that support our software applications and services. As a key technical leader, you will work closely with development, operations, and product teams to ensure that systems are designed with reliability, performance, and scalability in mind.
Requirements
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 8+ years of experience in software engineering, systems engineering, or site reliability engineering.
- Strong understanding of cloud computing platforms (e.g., AWS, Azure, Google Cloud) and container and orchestration technologies (e.g., Docker, Kubernetes).
- Experience with infrastructure-as-code and configuration management tools (e.g., Terraform, Ansible, Puppet).
- Proficiency in programming and scripting languages (e.g., Python, Go, Bash) for automation and tool development.
- Extensive knowledge of monitoring and logging practices and tools (e.g., Prometheus, Grafana, ELK Stack).
- Solid understanding of networking concepts, distributed systems, and microservices architecture.
- Excellent problem-solving skills and the ability to work effectively under pressure.
Benefits
- Unlimited PTO
- Paid Holidays
- Onsite Fitness Center
- Company-Paid Life Insurance
- Casual Dress Code
- Competitive Pay
- Health, Vision, and Dental Insurance
- 401(k) Match