Site Reliability Engineer (Dublin, Hybrid) - Pioneer and scale GTreasury's system and application observability efforts and reduce toil across operational workstreams. Work across global engineering, support, and technical operations teams in a fast-paced, collaborative environment.
Requirements
- Lead the system observability effort using New Relic and other tools
- Serve as in-house expert on site reliability practices and tooling
- Implement new tooling, features, and processes, and cascade standards
- Participate in 24x7 operational support and on-call rotation shifts
- Monitor, analyze, automate, and improve the reliability, performance, and availability of software systems
- Collaborate with cross-functional teams to create, monitor, and troubleshoot system infrastructure
- Collaborate with the engineering team on projects as the expert on reliability, performance, and efficiency
- Ensure that all system designs and procedures are clearly documented and up to date
- Monitor and stress test systems to collect metrics for tuning and capacity planning
- Work to automate detection and resolution of recurring issues
- Measure and optimize system performance
- Provide training and education to engineers on infrastructure and internal tooling
- Conduct recurring infrastructure and application-level audits
- Contribute to the improvement of our infrastructure, application, and security service levels
- Deliver articulate and effective presentations
- Actively participate in Agile engineering processes
- Ensure all work aligns with quality, operational, and architectural standards
Benefits
- Great benefits
- Culture of open collaboration and problem solving
- Empowered role on the Engineering team
- High impact, high visibility role at a growing SaaS company
- Opportunity to work in a fast-paced and collaborative environment
- Win as a team to scale a growing business
- Ability to work remotely