Assessment Designer & Learning Analyst

About the role

Mercor · Onsite

About Mercor

Mercor's mission is to organize human intelligence to power the AI economy. We partner with leading AI labs and enterprises to provide the human intelligence essential to AI development. Our vast talent network trains frontier AI models in the same way teachers teach students: by sharing knowledge, experience, and context that can't be captured in code alone. Today, more than 30,000 experts in our network collectively earn over $3 million a day.

Mercor is creating a new category of work where expertise powers AI advancement. Achieving this requires an ambitious, fast-paced and deeply committed team. You’ll work alongside researchers, operators, and AI companies at the forefront of shaping the systems that are redefining society. Mercor is a profitable Series C company valued at $10 billion. We work in-person five days a week in our San Francisco, NYC, or London offices.

We're looking for an Assessment Designer & Learning Analyst who can build rigorous measurement systems and use data to understand what actually drives expert performance.

This is not an instructional design role. You won't be building courses or writing training materials. You will be designing the assessments and certification frameworks that measure whether our talent experts and internal teams are genuinely skilled — and then doing the analytical work to understand what those assessments reveal, what predicts expert effectiveness, and how our programs should evolve based on evidence. You will be working closely with the Learning & Development team to understand the relationship between materials and assessments, and making recommendations to the team based on your analysis.

If you come from an ed school background, have taught in a high-accountability environment, have completed quantitative projects or theses, and are energized by the measurement and data side of education, this role is for you.

What You'll Do

Assessment Design

  • Design and continuously improve assessments and certification frameworks that validly and reliably measure expert readiness for specific project types

  • Build assessments and measurements of skills that are consistent, interpretable, and actually predictive of on-the-job performance — not just checklists.

  • Develop item banks, scoring guides, and inter-rater reliability protocols for evaluating complex human judgment tasks.

  • Run validity studies: do our assessments measure what we think they measure?
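
As a rough illustration of one kind of validity check, the sketch below correlates assessment scores with a later performance measure (criterion-related validity). The data, variable names, and the `pearson_r` helper are hypothetical and chosen for illustration only; they do not describe Mercor's actual data or methods.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape or x.size < 3:
        raise ValueError("need two equal-length sequences with at least 3 values")
    xc = x - x.mean()  # center both variables
    yc = y - y.mean()
    return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

# Hypothetical data: certification scores and later quality ratings for 6 experts.
assessment_scores = [62, 70, 75, 81, 88, 93]
quality_ratings = [3.1, 3.4, 3.2, 4.0, 4.3, 4.6]

validity_coefficient = pearson_r(assessment_scores, quality_ratings)
print(f"criterion validity r = {validity_coefficient:.2f}")
```

A single correlation on a small sample is only a starting point. A real validity study would also consider sample size, range restriction, and the reliability of the performance measure itself.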

Learning Analytics & Impact Analysis

  • Analyze the relationship between instructional materials, assessments, and expert performance, identifying what's working and what isn't, and make recommendations accordingly.

  • Analyze assessment data at the item level — difficulty, discrimination, reliability — and iterate based on findings.

  • Investigate the relationship between assessment performance and real-world expert effectiveness: who performs well on our assessments, and does that predict quality outcomes?

  • Build reports and dashboards that surface actionable insights to program and operations teams.

  • Design and analyze quasi-experimental and mixed-methods (quantitative and qualitative) studies to understand which interventions actually improve expert quality.
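
For readers less familiar with the item-level terms above, here is a minimal sketch of a classical test theory item analysis, assuming items are scored 0/1. The `item_analysis` function and the response matrix are hypothetical examples, not Mercor's tooling or data.

```python
import numpy as np

def item_analysis(responses):
    """Classical test theory statistics for a 0/1 scored response matrix.

    responses: 2-D array-like, rows = test takers, columns = items.
    Returns (difficulty, discrimination, alpha).
    """
    X = np.asarray(responses, dtype=float)
    n_items = X.shape[1]
    total = X.sum(axis=1)

    # Difficulty index: proportion of test takers answering each item correctly.
    difficulty = X.mean(axis=0)

    # Discrimination: correlation of each item with the total score on the
    # remaining items (corrected item-total correlation).
    discrimination = np.empty(n_items)
    for j in range(n_items):
        rest = total - X[:, j]
        if X[:, j].std() == 0 or rest.std() == 0:
            discrimination[j] = np.nan  # undefined when there is no variance
        else:
            discrimination[j] = np.corrcoef(X[:, j], rest)[0, 1]

    # Cronbach's alpha as an internal-consistency reliability estimate.
    item_var = X.var(axis=0, ddof=1).sum()
    total_var = total.var(ddof=1)
    alpha = (n_items / (n_items - 1)) * (1 - item_var / total_var)
    return difficulty, discrimination, float(alpha)

# Hypothetical responses: 6 test takers x 4 items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
difficulty, discrimination, alpha = item_analysis(responses)
print("difficulty:", difficulty.round(2))
print("discrimination:", discrimination.round(2))
print("alpha:", round(alpha, 2))
```

Items with very high or very low difficulty, or with discrimination near zero or negative, are the usual candidates for review. Which cutoffs to use depends on the purpose of the assessment.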

Ongoing Measurement & Improvement

  • Track certification and assessment outcomes over time and flag when programs need revision

  • Partner with learning designers and project teams to translate your findings into program improvements

  • Bring a continuous improvement mindset — ship, measure, learn, iterate

What We're Looking For

Education

  • Master's degree in Learning Sciences, Educational Psychology, Educational Measurement, Psychometrics, or a closely related field — required

  • Coursework in quantitative research methods, psychometrics, and educational statistics — required

  • Familiarity with classical test theory (CTT) and ideally item response theory (IRT)

Quantitative Skills (Required)

This role requires genuine comfort with numbers. We're looking for someone who can do the following and show their work:

  • Item-level analysis: difficulty index, discrimination index, inter-rater reliability (Cohen's kappa, Krippendorff's alpha, ICC)

  • Assess and report on assessment validity and reliability — and know what to do when results look off

  • Analyze relationships between variables: correlation, regression, and basic predictive modeling

  • Work fluently in Excel or Google Sheets for data cleaning and summaries

  • Use Python, Stata, or R for deeper analysis (basic proficiency expected; we'll grow this with you)

  • Translate quantitative findings into plain-language recommendations for non-technical stakeholders

We will ask you to demonstrate this. Finalists will complete a short take-home exercise involving a real assessment dataset — you'll analyze item performance, identify problems, and recommend improvements.
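
To give a sense of the level involved, the following is a small sketch of Cohen's kappa, one of the inter-rater reliability statistics listed above, for two raters labeling the same items. The reviewer labels and the `cohens_kappa` helper are hypothetical and are not taken from the take-home exercise.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters assigning categorical labels to the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)

    # Observed agreement: share of items where the two labels match.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal label proportions.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2

    if p_expected == 1.0:
        # Kappa is undefined (0/0) when both raters use one identical label;
        # returning 1.0 here is a convention chosen for this sketch.
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical pass/fail judgments from two reviewers on 10 task submissions.
reviewer_1 = ["pass", "pass", "fail", "pass", "fail", "pass", "pass", "fail", "pass", "fail"]
reviewer_2 = ["pass", "pass", "fail", "fail", "fail", "pass", "pass", "pass", "pass", "fail"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(f"Cohen's kappa = {kappa:.2f}")
```

Kappa corrects raw percent agreement for the agreement expected by chance, which matters when one label dominates. For more than two raters or for ordinal and interval scales, Krippendorff's alpha or ICC are the more usual choices.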

Experience

  • 1–2 years of experience in assessment design, educational research, learning analytics, or a related role

  • Teaching or similar experience in a high-accountability environment (Teach For America, urban education, or similar) is a strong plus; people who've lived with assessment data in the classroom understand it differently

  • Experience designing assessments with a clear theory of what you're measuring — not just writing questions

  • A portfolio or work samples showing both assessment design and quantitative analysis — we want to see how you think

Skills

  • Deep understanding of measurement: validity, reliability, and what makes an assessment actually good

  • Ability to move between data and meaning — you can run the analysis and explain what it means for the program

  • Strong writing — you can communicate complex findings clearly to non-technical audiences

  • Systems thinker — you see how individual assessments connect to broader operational quality and expert performance

  • Comfortable with ambiguity and rapid iteration — this is a fast-moving environment and you'll need to ship and improve continuously

Nice to Have

  • Experience with item response theory (IRT) or latent variable modeling

  • Familiarity with data annotation, labeling, or AI evaluation workflows

  • Experience in tech, AI/ML, or data operations environments

  • Background in competency-based or mastery learning frameworks

  • Experience building and analyzing assessments

Why This Role

The quality of AI systems depends on the quality of the humans who train them. Your job is to measure that quality rigorously, understand what drives it, and help Mercor build smarter systems for developing expert performance. It's a rare opportunity to apply serious measurement science at a company operating at the frontier of AI development — where the stakes for getting it right are unusually high.
