Senior Director of Platform Engineering

Remote Full-time

DeepSee delivers an open and flexible agentic platform to accelerate AI adoption for financial services in front, middle, and back-office operations. Our cloud-based platform seamlessly integrates with existing bank architectures, whether they're just starting their AI transformation journey or looking to enhance existing in-house capabilities with Agentic AI solutions. With DeepSee's pre-trained & pre-configured agents, banking and capital markets firms can automate and orchestrate manual, repetitive tasks, freeing domain experts for strategic work, reducing risk, and streamlining operations to drive greater efficiency.

We are looking for a Senior Director of Platform Engineering to lead our backend, frontend, infrastructure, and MLOps/DevOps/CI/CD teams. You'll scale our Kubernetes platform across AKS, EKS, and on-prem, ensure high availability and performance, and evolve our agentic AI and MCP-based integrations for bank-grade reliability. You'll partner tightly with the Chief Architect and our Product team to deliver a secure, observable, auditable platform for regulated clients.

Job Responsibilities:

  • Own and drive the platform roadmap and strategy for multi-cloud/on-prem Kubernetes (AKS, EKS, vanilla K8s), compute, data, networking, ML serving, and high availability/performance.
  • Lead, build, and develop multiple teams (Backend, Frontend, Infrastructure, MLOps/DevOps), including leadership, career ladders, and operational rhythms.
  • Scale Kubernetes reliably: capacity planning, autoscaling (HPA/VPA/Cluster Autoscaler/KEDA), cost controls for mixed CPU/GPU workloads.
  • Advance and mature GitOps, IaC, and observability practices (Argo CD, Terraform, Helm, OpenTelemetry, Datadog, Prometheus), including rollout strategies, standardization, monitoring, incident response, and post-mortems.
  • Advance MLOps for LLMs/SLMs/ML/DL (KServe, MLflow pipelines, model governance, inference patterns, GPU scheduling, canary rollouts).
  • Evolve and operate eventing and stateful architecture at scale (Kafka/ZooKeeper/KRaft, Postgres, S3/Blob, protobuf, schema evolution/versioning, resilient data planes).
  • Directly contribute technically via coding, reviews, and debugging distributed systems.
  • Partner closely with Chief Architect, Principal AI, Product, and other leads to deliver secure, observable, auditable, regulated banking solutions, supporting agentic AI and workflow automation.

Must Haves:

  • Significant leadership experience: 10 years on distributed platforms and 5 years leading multi-disciplinary platform teams.
  • Deep, hands-on Kubernetes expertise (networking, security, tenancy, upgrades; AKS/EKS operations).
  • Proven hands-on expertise with GitOps, IaC, change management, rollout safety, and production observability (Argo CD, Terraform, Helm, OpenTelemetry, Datadog/Prometheus, SLOs/on-call).
  • Advanced MLOps experience (KServe, MLflow, model registry/governance, GPU scheduling, cost tuning, canary rollouts, safe rollouts).
  • Experience with designing/operating event streaming, stateful data, and resilient architecture at scale (Kafka/ZooKeeper/KRaft, Postgres, S3/Blob, protobuf, schema/versioning).
  • Deep proficiency in core languages (Java, Python, Go) and cloud SDKs, plus strong architectural communication with executives and clients.
  • Regulated FinServ experience (SOC 2/ISO 27001, SR 11-7, SEC/FINRA, model governance, OpenTelemetry, trace-driven perf, KServe ModelMesh or similar tools).

Nice to Haves:

  • Hands-on skills with most listed technologies: Kubernetes (vanilla, AKS, EKS), Docker, Argo CD, Helm, Terraform, Kafka/ZooKeeper/KRaft, KServe, MLflow, OpenTelemetry, Datadog, Prometheus, protobuf, HPA, VPA, Karpenter or Cluster Autoscaler, LightRAG, Milvus, Postgres, S3/Blob, Redis, Airflow/dbt, Java, Python, Go.
  • Experience working alongside a variety of engineering leaders and principal engineers (Chief Architect, CISO, Principal Knowledge Graph Engineer, AI Engineer, Lead BE, Principal FE, Product).
  • Platform-as-a-product advocacy and developer experience focus, CNCF platform engineering guidance.

Finally, it is important that you align with our Stuff That Matters:

  • Knowledge Over Noise: We prioritize actionable insights
  • One Team, One Dream: We collaborate seamlessly across functions
  • Be a Seeker: We constantly pursue innovation and learning
  • Stay Human: We keep our solutions people-centric
  • Act Boldly: We take calculated risks to drive progress
  • Believe: We're passionate about our mission
  • Own It: We take responsibility for our work and its impact

Why DeepSee.ai?

  • Competitive compensation package including equity, with remote work options
  • 100% company-paid premiums on health, dental, and vision insurance
  • Opportunity to work on cutting-edge AI technology with real impact
  • Collaborative and innovative work environment

Join us in shaping the future of AI-powered automation and make a significant impact in a rapidly growing startup. If you're a hands-on problem solver who thrives in fast-paced environments and is excited about leveraging AI to solve complex problems, we want to hear from you!
