We are looking for a Staff Site Reliability Engineer to focus on Developer Experience and to design, build, and maintain high-performance, scalable, and reliable services. We believe in a DevOps philosophy in which every engineering team is responsible for the software it builds and deploys.

Requirements

Ensure high availability, performance, and scalability of mission-critical systems and services.
Lead the design and implementation of resilient and fault-tolerant infrastructure.
Drive incident response, root cause analysis, and postmortem culture.
Mentor others in incident practices.
Write and maintain operational documentation, runbooks, and architecture diagrams.
Drive and promote protocols for production readiness and operational excellence.
Own and evolve infrastructure automation using Terraform or similar tools to remove human intervention wherever possible.
Help automate infrastructure provisioning and other engineering processes by building automations on top of an engineering platform written in GitHub Actions.
Build internal platforms, tools, and frameworks to improve developer productivity and service reliability.
Work closely with software engineers, platform teams, and product managers to align on company goals.
Coach and up-skill other engineering team members.
Plan for the growth of Talkdesk's infrastructure.

Benefits

Competitive salary and benefits package
Opportunity to work with a leading cloud contact center provider
Chance to be part of a dynamic and innovative team
Professional growth and development opportunities
Flexible work arrangements
Employee recognition and reward programs
