This Site Reliability Engineer role focuses on ensuring the reliability, scalability, and performance of systems and services. The engineer will collaborate with development and operations teams to build, maintain, and improve infrastructure, tools, and monitoring capabilities. Responsibilities include incident response, root cause analysis, and tool development.
Requirements
- 1-3 years’ experience in software development, DevOps, or SRE.
- Degree in Electrical / Electronics / Computer Engineering / Computer Science or a relevant discipline.
- Basic understanding of Linux/Unix systems and shell scripting.
- Familiarity with cloud platforms (e.g., AWS, Azure, GCP).
- Experience with monitoring tools (e.g., Prometheus, Grafana, ELK).
- Knowledge of CI/CD and related collaboration tools (e.g., Jenkins, GitLab, Bitbucket, Jira).
- Programming/scripting skills in Python, Java, or Bash.
- Understanding of networking fundamentals and system security.
- Good written and verbal communication skills.
- Self-motivated, able to work independently, and a good team player.
- Able to work under pressure in a fast-paced environment.
- Innovative, proactive mindset with a focus on continuous improvement.
- Strong analytical and problem-solving skills.