A Senior Site Reliability Engineer is a key member of the team responsible for administering, maintaining, and troubleshooting complex infrastructure systems. They collaborate with managers, architects, and engineers to achieve successful outcomes, focusing on implementing best practices in automation, quality assurance, and teamwork.
Requirements
- 7+ years of experience as a System Administrator, DevOps Engineer, SRE, or similar role.
- Deep knowledge of Linux administration, including performance monitoring, tuning, and troubleshooting.
- Experience with cloud network design (Azure preferred, AWS or GCP also considered).
- Proficiency in scripting (e.g., Bash, Python) for automation.
- Experience with version control software (preferably Git).
- Experience with configuration management tools (e.g., Puppet, Foreman, Ansible, or similar).
- Knowledge of container orchestration tools (e.g., Kubernetes, Docker Swarm).
- In-depth knowledge of monitoring and logging solutions for cloud infrastructure (e.g., Prometheus, Grafana).
- Bachelor’s degree in Computer Science or a related field.
- Excellent time management, organizational, crisis management, and problem-solving skills.
- Self-starter, able to work independently without direct supervision.
- Willingness to innovate, learn, and share knowledge.
- Excellent verbal and written communication skills.
- Experience developing and implementing IT security best practices and procedures.
- Willingness to participate in on-call rotations and respond to incidents in a timely and effective manner.
- Excellent command of the English language.
- Understanding of core AI concepts and the ability to apply them ethically to enhance productivity, insights, and decision-making.
- Ability to craft effective prompts that optimize the quality and relevance of AI-generated outputs.
- Interest in exploring and applying agentic AI systems, including using or managing autonomous agents to streamline workflows and automate tasks.
- Ability to leverage AI tools to boost efficiency, creativity, and innovation in daily work.
- Curiosity and adaptability, with a habit of continuously experimenting with AI-driven solutions to elevate team performance and customer impact.
Benefits
- Generous Paid Time Off
- 401(k) Matching
- Retirement Plan
- Relocation Assistance