Senior SRE - Leading Online Retailer
Staff Platform Engineer, Cloud Infrastructure & Developer Experience

We are looking for an experienced Staff Platform Engineer to help shape and scale a modern cloud platform powering high-impact software teams. This role is built for someone who thrives in a broad technical scope, enjoys solving complex infrastructure challenges, and is passionate about creating an exceptional developer experience.

About The Opportunity

This is a key role on a small but rapidly growing Platform Engineering team. Currently led by a Staff Engineer and reporting directly to the Director of Platform Engineering, the team is expected to expand significantly over the next year. You will have meaningful ownership, strong influence over technical direction, and the chance to work across reliability, automation, and cloud infrastructure at scale.

The ideal candidate brings a strong software engineering foundation, deep AWS expertise, and hands-on experience building Kubernetes-based platforms with Infrastructure as Code. This is not a traditional systems administrator position. We are seeking someone who codes, builds, and leads.

What You Will Do
- Design and evolve cloud infrastructure on AWS to support scalable, reliable production systems
- Build and maintain Kubernetes platforms that enable teams to deploy and operate services efficiently
- Develop Infrastructure as Code using Terraform, ensuring repeatability, security, and consistency
- Create and improve CI/CD pipelines using tools such as GitHub Actions, Jenkins, or similar frameworks
- Drive reliability initiatives including observability, incident response, and operational excellence
- Partner cross-functionally with engineering, product, and security teams to deliver impactful platform solutions
- Champion Developer Experience by reducing friction, improving tooling, and accelerating delivery cycles
- Contribute directly through coding in languages such as Python, Go, Ruby, JavaScript, Bash, or others
- Support security-focused efforts, including response readiness and platform-level risk mitigation

Qualifications
- 8 to 15 years of experience in platform engineering, site reliability, or cloud infrastructure roles
- Strong and recent hands-on AWS expertise in production environments
- Deep experience with Kubernetes operations and ecosystem tooling
- Advanced Terraform and Infrastructure as Code proficiency
- Proven CI/CD engineering experience with modern automation pipelines
- A software engineering mindset, with the ability and desire to code regularly
- Familiarity with reliability practices such as observability, monitoring, and incident response
- Security experience is a plus, especially in operational response or infrastructure hardening

Interview Process
- Hiring Manager conversation and Terraform-focused technical assessment (60 minutes)
- Site Reliability coding challenge (60 minutes) and systems design interview (60 minutes)
- Culture and values discussion (60 minutes) plus project deep dive (60 minutes)