[Remote] Data Scientist Quality Assurance Lead (QAL)
Note: This is a remote position open to candidates in the USA. SME Careers is a fast-growing AI Data Services company and subsidiary of SuperAnnotate, delivering training data for many of the world’s largest AI companies and foundation-model labs. They are seeking a Data Scientist Quality Assurance Lead (QAL) to oversee quality, consistency, and trainer performance across data science AI training projects, ensuring that training data is analytically sound and aligned with client expectations.
Responsibilities
- Quality monitoring: Spot-check data science items, identify quality issues, provide feedback through DMs, and escalate recurring or critical issues
- Technical review: Evaluate AI-generated data science explanations, Python/R/SQL snippets, modeling workflows, statistical interpretations, dashboards, experiment designs, and step-by-step reasoning
- Trainer and QA communication: Update trainers/QAs on Discord about guideline changes, workflow updates, and data-science-specific quality expectations
- Question handling: Respond to questions around statistical assumptions, metrics, model selection, data leakage, validation, coding choices, reproducibility, and rubric interpretation
- Trainer/QA activation management: DM inactive contributors, encourage activation, track follow-ups, and flag availability issues
- Documentation: Create and maintain data science style guides, trackers, FAQs, examples, honeypots, calibration tasks, and onboarding materials
- Onboarding and training: Schedule and run onboarding/training calls with contributors to explain project expectations, workflows, rubrics, and data science review standards
- Risk review: Flag misleading, overconfident, statistically invalid, or non-reproducible data science outputs
- Process improvement: Identify recurring quality gaps and help build scalable QA processes
Skills
- Bachelor's, Master's, or PhD degree in Data Science, Statistics, Computer Science, Machine Learning, Mathematics, Economics, Engineering, or a closely related quantitative field
- Strong grasp of English to follow guidelines, communicate with teams, and provide clear technical feedback
- 3+ years of professional experience in data science, analytics, machine learning, statistical modeling, experimentation, data engineering, technical review, or data science education
- Strong understanding of statistics, probability, data cleaning, exploratory data analysis, feature engineering, supervised/unsupervised learning, model evaluation, experimentation, regression, classification, clustering, and validation methods
- Ability to evaluate data science content against detailed rubrics and identify issues such as data leakage, flawed assumptions, incorrect metrics, weak methodology, non-reproducible code, hallucinated libraries/APIs, or misleading conclusions
- Comfortable using Discord, Google Sheets, Google Docs, trackers, dashboards, GitHub, and project management systems
- Highly organized and able to maintain style guides, trackers, FAQs, onboarding materials, honeypots, calibration tasks, and quality documentation
- Familiarity with tools such as Python, pandas, NumPy, scikit-learn, SQL, Jupyter, matplotlib, R, Spark, Git, MLflow, notebooks, dashboards, and cloud/data platforms
- Experience leading or supporting remote teams of trainers, annotators, analysts, data scientists, engineers, educators, or QAs
- Experience with AI training, data annotation, LLM evaluation, data science QA, or rubric-based technical review
Company Overview