Responsibilities
Own and strengthen the AWS cloud infrastructure supporting a production AI-powered platform.
Design, implement, and maintain infrastructure running on AWS, including ECS and EKS.
Build, manage, and optimise Infrastructure as Code using Terraform.
Deploy, manage, and troubleshoot Kubernetes-based container orchestration platforms.
Develop and maintain Docker-based containerised application environments.
Automate infrastructure provisioning, configuration management, and operational workflows using Ansible.
Design, build, and maintain automated CI/CD pipelines that support reliable software delivery.
Implement and maintain monitoring, alerting, and observability using Prometheus and Grafana.
Own automated security and compliance validation processes supporting frameworks such as SOC 2 and ISO 27001.
Identify and surface compliance or security issues while working closely with engineering leadership.
Embed secure-by-design practices throughout the software development lifecycle.
Support the reliability, scalability, and security of the production platform.
Collaborate closely with software engineers to continuously improve platform operations.
Make technical and architectural decisions independently with minimal supervision.
Deliver infrastructure improvements in a fast-paced product engineering environment.
Requirements
5+ years of professional experience in DevOps, DevSecOps, or a closely related role.
5+ years of hands-on experience managing production AWS environments, including ECS and EKS.
5+ years of experience using Terraform for Infrastructure as Code.
5+ years of experience managing Kubernetes or other container orchestration platforms in production.
5+ years of experience working with Docker in production environments.
5+ years of experience using Ansible for infrastructure automation and configuration management.
5+ years of experience implementing and maintaining Prometheus and Grafana monitoring solutions.
5+ years of experience designing, building, and maintaining CI/CD pipelines and deployment automation.
Strong experience operating production cloud infrastructure.
Ability to independently own infrastructure projects from design through implementation.
Experience embedding security practices into engineering workflows.
Strong problem-solving and technical decision-making skills.
Comfortable working with minimal supervision in a fast-paced engineering environment.
Product-focused mindset with the ability to balance speed, scalability, and reliability.
Nice To Have
Experience working with security compliance frameworks such as SOC 2 and ISO 27001.
General security awareness and experience supporting compliance initiatives.
Experience working within SaaS or technology product companies.
Specific experience with Amazon EKS.
Previous ownership of infrastructure initiatives in product engineering environments.