As a Site Reliability Engineer at CADDi, you will build and secure the infrastructure that supports our AI platform, with special attention to safeguarding US customer data and supporting the Aerospace and Defense Industrial Base.
Requirements
- 4+ years in Site Reliability Engineering, DevOps, or Systems Engineering with cloud-based SaaS platforms
- Deep Terraform and Infrastructure as Code expertise with security best practices
- Proficiency in Python and other scripting/programming languages
- Modern CI/CD experience (GitHub Actions, GitLab CI, Jenkins, ArgoCD, Spinnaker), including AI/ML workloads
- Strong cloud platform experience, preferably GCP (AWS and Azure experience is also valuable for future multi-cloud deployments)
- Experience building and optimizing containers (Docker) and configuring orchestration (Kubernetes)
- Monitoring tools experience (Datadog, Prometheus, Grafana, etc.)
- Regulated-industry experience (Aerospace & Defense, Finance, Healthcare), including building secure platforms
- DevSecOps principles and security integration experience
- Security-first development mindset with understanding of secure infrastructure practices
- Strong problem-solving and communication skills for distributed team environments
Benefits
- Company-paid healthcare benefits
- 401(k) matching
- Generous time off
- Work/life balance