Senior Infrastructure Engineer - Observability - Remote from Portugal

Remote, Full-time

Aircall is a unicorn AI-powered customer communications platform used by 22,000+ companies worldwide to drive revenue, resolve issues faster, and scale. We’re redefining what a customer communications platform can be by combining voice, SMS, WhatsApp, and AI into one seamless workspace.

Our momentum comes from a simple but powerful idea: help every customer-facing team work smarter, not harder. Aircall’s AI Voice Agent automates routine calls, AI Assist streamlines post-call tasks, and AI Assist Pro delivers real-time guidance that helps people do their best work. The result: companies grow revenue, deliver faster resolutions, and scale service.

We’ve built a product customers love and a business that scales fast. Aircall operates in nine global offices (Paris, New York, San Francisco, Sydney, Madrid, London, Berlin, Seattle, and Mexico City) and is backed by world-class investors. Our teams are shipping AI innovation faster than ever and expanding across new product lines and markets. At Aircall, you’ll join a company in motion (ambitious, profitable, and product-driven) where impact is visible, decisions are fast, and growth is real.

How We Work at Aircall:

At Aircall, we believe in customer obsession, continuous learning, and delivering extraordinary outcomes. We value open collaboration, taking ownership, and making smart, informed decisions with speed and precision. If you thrive in a fast-paced, team-driven environment where curiosity, trust, and impact matter, you’ll fit right in.

We’re looking for an Observability Engineer to own and evolve Aircall’s monitoring, alerting, and observability stack. You’ll work cross-functionally with backend, frontend, and infrastructure teams to ensure our systems are transparent, measurable, and continuously improving in reliability and performance. This role is ideal for someone passionate about observability-as-code, metric design, and helping engineering teams gain meaningful visibility into their systems.

Key Responsibilities:

  • Develop comprehensive observability best practices: Define and standardize guidelines for metrics, traces, and logs, ensuring consistent implementation and adoption across all engineering teams. This includes establishing naming conventions, data collection methodologies, and retention policies to ensure high-quality and actionable observability data while controlling cost and minimizing waste.
  • Collaborate strategically with engineering teams: Partner closely with various engineering teams to enhance overall system reliability and performance. This involves actively participating in architectural reviews, defining clear Service Level Indicators (SLIs) and Service Level Objectives (SLOs), and seamlessly integrating observability practices into continuous integration and continuous deployment (CI/CD) pipelines to promote a culture of "observability by design."
  • Automate monitoring setup and provisioning: Drive the automation of monitoring infrastructure through Infrastructure-as-Code (e.g., leveraging the Terraform Datadog provider) and develop intuitive self-service observability tools. This empowers engineering teams to rapidly provision and manage their monitoring resources, reducing manual overhead and accelerating time to insight.
  • Improve alerting hygiene and effectiveness: Continuously refine and optimize alerting mechanisms by meticulously tuning thresholds, implementing intelligent noise reduction strategies, and ensuring all alerts are directly aligned with potential business impact. The goal is to deliver timely, relevant, and actionable alerts that enable proactive incident response and minimize service disruption.
  • Train and empower product teams: Provide comprehensive training and ongoing support to product teams, enabling them to effectively utilize observability tools. This includes guiding them in building insightful dashboards that visualize key performance indicators and creating robust alerts that proactively detect issues within their respective services.
  • Evaluate and integrate advanced observability tools: Proactively research, evaluate, and integrate new and emerging observability tools and technologies as needed. This may include exploring solutions for OpenTelemetry adoption, advanced log aggregation platforms, distributed tracing systems, and other tools that enhance our overall observability capabilities and support the evolving needs of our infrastructure and applications.

Qualifications:

  • 3-5 years of experience in observability within SRE, DevOps, or platform engineering roles.
  • Strong hands-on experience with Datadog (dashboards, monitors, synthetics, logs, APM, RUM).
  • Proficiency with Terraform or other Infrastructure-as-Code tools.
  • Solid understanding of Kubernetes, microservices, and cloud infrastructure (EKS, Lambda, RDS, S3, AWS networking).
  • Familiarity with distributed tracing and OpenTelemetry concepts.
  • Strong scripting skills (Python, Bash, or similar).
  • Experience defining and managing SLIs/SLOs and service-level observability frameworks.
  • Excellent collaboration and communication skills; you can work with both engineers and non-technical stakeholders.

Nice to Have:

  • Experience with incident management and on-call processes.
  • Exposure to data visualization or analytics tools beyond Datadog.
  • Knowledge of logging pipelines (e.g., FluentBit, Logstash).
  • Experience working in high-scale SaaS environments.
  • Previous experience in developer enablement or platform teams.
