AI Engineer III
- Architect and maintain the LLMOps/GenAIOps toolchain, including model registries, prompt version control, and reproducible training pipelines.
- Implement and manage the Azure AI Foundry environment, configuring model routers, quota management, and private endpoints for secure inferencing.
- Develop comprehensive observability dashboards to track model latency, token costs, hallucination rates, and drift.
- Automate "Policy-as-API" controls within the orchestration layer to enforce governance guardrails (e.g., PII filtering) at runtime.
- Collaborate with the Platform SRE team to ensure high availability and disaster recovery for mission-critical clinical agents.
- Manage the "Model Registry," ensuring all deployed models have associated version history, performance metrics, and rollback targets.
- Configure and maintain "Vector Databases" and RAG pipelines, optimizing retrieval performance and index freshness.
- Implement "Prompt Filtering" and content moderation gateways to prevent jailbreaks and enforce safety standards at the infrastructure level.
- Develop "Blue/Green" or "Canary" deployment strategies for AI agents to safely test new model versions in production.
- Manage the "API Gateway" for all AI services, ensuring authentication, rate limiting, and usage logging are enforced.
- Optimize "GPU/CPU Orchestration" to control compute costs while maintaining performance SLAs for high-volume inference.
- Build automated "Drift Detection" alerts that trigger retraining or human review when model performance degrades below a set threshold.
- Perform any other job-related duties as requested.
- Bachelor's degree in Computer Science, Engineering, or related technical field required
- Equivalent years of relevant work experience may be accepted in lieu of required education
- Five (5) years of IT engineering experience, with at least three (3) years specialized in DevOps, MLOps, or Cloud Infrastructure required
- Experience with Azure AI Services (Azure OpenAI, AI Search, Azure ML) and container orchestration (Kubernetes/AKS) required
- Experience building and maintaining CI/CD pipelines for machine learning models or complex software applications required
- Mastery of Python and scripting languages for automation and infrastructure-as-code (Terraform, Bicep, ARM templates)
- Deep understanding of LLMOps principles: Prompt versioning, model registry management, and evaluation pipelines (e.g., MLflow, Prompt Flow)
- Proficiency in Azure Networking and Security, including Private Endpoints, VNET integration, and API Management (APIM) configuration
- Knowledge of Vector Databases and RAG (Retrieval Augmented Generation) infrastructure requirements
- Strong observability skills, utilizing tools like Azure Monitor or App Insights to track token usage, latency, and drift
- Microsoft Certified: Azure AI Engineer Associate or Azure DevOps Engineer Expert preferred
- CKA (Certified Kubernetes Administrator) preferred
- General office environment; may be required to sit or stand for extended periods of time
- Travel is not typically required
Compensation Range:
$94,100.00 - $164,800.00
CareSource takes into consideration a combination of a candidate’s education, training, and experience as well as the position’s scope and complexity, the discretion and latitude required for the role, and other external and internal data when establishing a salary level. In addition to base compensation, you may qualify for a bonus tied to company and individual performance. We are highly invested in every employee’s total well-being and offer a substantial and comprehensive total rewards package.
Compensation Type (hourly/salary):
Salary
Organization Level Competencies
Fostering a Collaborative Workplace Culture
Cultivate Partnerships
Develop Self and Others
Drive Execution
Influence Others
Pursue Personal Excellence
Understand the Business