Master Data Management & Data Quality Intern
The Revenue Operations & Data Governance team is building a Master Data Management (MDM) foundation to establish a trusted, unified view of our customers and accounts. As our MDM & Data Quality Intern, you will work directly alongside a Solution Architect to design and validate our MDM Proof of Concept, focused on a single, well-scoped entity domain (Accounts). Your work will lay the analytical and documentary groundwork that carries this initiative from exploration into a production-ready recommendation. This is a hands-on, high-impact role where your findings and deliverables will directly shape GitHub's long-term data strategy.
This is a remote summer internship lasting 12 consecutive weeks, with start dates between May 18 and June 15, 2026.
Responsibilities
- Partner with the Solution Architect to audit and map Account data across source systems (CRM, billing, product), documenting field-level lineage, ownership, and quality gaps.
- Support the design and testing of match/merge and survivorship rules for the Account entity, defining which source system wins for each attribute and why.
- Assist in building and validating a sandbox POC Golden Record for the Account domain, including deduplication logic, confidence scoring, and a sample output dataset.
- Measure and report on baseline data quality metrics (duplicate rates, completeness scores, and field-level accuracy) to establish a benchmark for MDM impact.
- Document POC outcomes, key decisions, edge cases, and a clear handoff package to guide the production engineering team.
- Develop a draft data stewardship process, including how records get flagged, reviewed, and approved, in collaboration with the Solution Architect and business stakeholders.
Required Qualifications
- Currently pursuing a Master's Degree in Data Management, Information Systems, Data Analytics, or a related field, with at least one quarter/semester remaining after the internship.
- Expected conferral date between December 2026 and August 2027.
- Foundational understanding of data quality, data modeling, or MDM concepts through coursework or project experience.
- Comfortable working with SQL and exploring relational or CRM datasets.
Preferred Qualifications
- Familiarity with CRM data structures (Salesforce or similar) and common data quality challenges like duplicates, incomplete records, or inconsistent formatting.
- Exposure to MDM concepts such as entity resolution, match/merge logic, or survivorship rules, even in an academic context.
- Strong analytical thinking, able to investigate messy data, identify patterns, and form clear recommendations.
- Excellent documentation skills, able to translate technical findings into clear, business-facing write-ups and process guides.
- Collaborative and curious, comfortable asking questions and working within a structured mentorship model.
GitHub values
- Customer-obsessed
- Ship to learn
- Growth mindset
- Own the outcome
- Better together
- Diverse and inclusive
Manager fundamentals
- Model
- Coach
- Care
Leadership principles
- Create clarity
- Generate energy
- Deliver success