[Remote] Infrastructure Data & Analytics
Note: This is a remote position open to candidates in the USA. Microsoft is seeking experienced Infrastructure Data & Analytics Engineers to join its AI team, focusing on infrastructure analytics that provide decision-quality insights. The role involves leading technical efforts to design and build scalable data pipelines, ensuring metrics reflect operational realities, and maintaining data quality and governance.
Responsibilities
- Act as the technical lead and owner for infrastructure analytics across compute, storage, and networking
- Design and build durable, scalable data pipelines that ingest telemetry from clusters, schedulers, health systems, and capacity trackers into a data warehouse
- Define and standardize core metrics and semantics (e.g., utilization, occupancy, MFU, goodput, capacity readiness, delivery-to-production)
- Architect and maintain self-service dashboards and APIs for fleet, cluster, and squad-level visibility
- Partner closely with stakeholders across Supercomputing Infra, Researchers, Strategy and Executives to ensure metrics reflect operational and business reality
- Implement robust and fault-tolerant systems for data ingestion and processing
- Lead data architecture and engineering decisions, applying strong technical judgment to proactively shape executive-level discussions and decisions
- Identify data gaps and instrumentation issues; drive fixes by influencing upstream engineering teams
- Establish data quality, validation, documentation, and governance so metrics are trusted and repeatable
Skills
- Bachelor's degree in Computer Science or a related technical field AND 8+ years of technical engineering experience in data engineering, analytics, or data science, with increasing technical ownership in a startup environment, AND 6+ years of experience with distributed data processing frameworks and large-scale data systems, OR equivalent experience
- Master's degree in Computer Science or a related technical field AND 12+ years of technical engineering experience in data engineering, analytics, or data science, with increasing technical ownership in a startup environment, AND 10+ years of experience with distributed data processing frameworks and large-scale data systems, OR equivalent experience
- Proven technical leadership in data engineering, analytics platforms, or large-scale telemetry systems
- Hands-on experience with ETL orchestration frameworks such as Airflow, Dagster, or similar
- Strong communication skills; can explain complex systems clearly to senior leaders
Benefits
- Certain roles may be eligible for benefits and other compensation. Find additional benefits and pay information here: https://careers.microsoft.com/us/en/us-corporate-pay