
About this Senior Software Engineer in Test (AI Agentic Systems) role at Collective Health

Collective Health · Hybrid · Lehi, UT | Plano, TX

At Collective Health, we’re transforming how employers and their people engage with their health benefits by seamlessly integrating cutting-edge technology, compassionate service, and world-class user experience design.

This is not a traditional QA role. You will be the quality owner for an LLM-based multi-agent pipeline that autonomously adjudicates health insurance claims for self-funded plan sponsors. You are building a Three-Tier Evaluation Framework to ensure our Gemini-powered agents reason correctly, call tools accurately, and produce DOL-ready outcomes.

You will work at the intersection of Vertex AI, healthcare compliance, and high-scale data engineering. Your work directly determines whether claims are paid correctly and whether the company can withstand a Department of Labor (DOL) or state DOI audit. The stakes are real, the domain is hard, and the problems are genuinely novel.

What you'll do:

  • Outcome Evaluation (The "What")
    • Golden Set Governance: Build and maintain a versioned library of "Grounding Data" results by working with senior claims examiners to define "Ground Truth."
    • Model-as-a-Judge Automation: Design automated "LLM-grading-LLM" workflows using custom rubrics to score factual grounding and policy compliance.
    • Semantic Assertion Framework: Develop testing libraries that move beyond string matching to validate semantic equivalence and numerical accuracy in agent outputs.
  • Trajectory Evaluation (The "How")
    • Function-Call Auditing: Use Vertex AI traces to programmatically verify that mandatory tools (via MCP) were invoked with correct arguments.
    • Orchestration Logic Validation: Assert that agents respect defined priorities across the four architectural layers: Data & Knowledge, Orchestration, Agentic Reasoning, and Tooling.
    • Reasoning Trace Auditing: Ensure every autonomous decision is traceable to a specific SOP sentence and a live API data point.
  • Continuous Automated Regression (The "Always")
    • CI/CD Integration: Ensure that every prompt or model update in Vertex AI Prompt Management triggers an automated regression run.
    • Auto-SxS: Own the automated pairwise comparison process to detect logic drift between "New" and "Production" agent versions.
    • Mocking & Resilience: Build a Vertex AI/ADK mocking layer to simulate model responses, allowing for thousands of logic tests in seconds with zero API costs.
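To give a feel for the trajectory and outcome checks described above, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not our actual framework: `FakeAgentRun`, `fake_adjudicate`, the tool names, and the claim ID are all hypothetical, and a canned trace stands in for a real Vertex AI/ADK response so the checks run with no API calls.

```python
import math
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    name: str
    args: dict


@dataclass
class FakeAgentRun:
    """Hypothetical stand-in for a recorded agent trace (not a Vertex AI type)."""
    tool_calls: list = field(default_factory=list)
    paid_amount: float = 0.0


def fake_adjudicate(claim_id: str) -> FakeAgentRun:
    """Mocked agent: returns a canned trace, so no model API is called."""
    return FakeAgentRun(
        tool_calls=[
            ToolCall("check_eligibility", {"claim_id": claim_id}),
            ToolCall("lookup_plan_benefit", {"claim_id": claim_id}),
        ],
        paid_amount=80.00,
    )


def missing_tools(run: FakeAgentRun, required: set) -> set:
    """Trajectory check: which mandatory tools were never invoked?"""
    return required - {call.name for call in run.tool_calls}


def amounts_match(actual: float, expected: float, tol: float = 0.005) -> bool:
    """Outcome check: compare amounts numerically, not as strings."""
    return math.isclose(actual, expected, abs_tol=tol)


run = fake_adjudicate("CLM-001")
# Every mandatory tool (hypothetical names) appears in the trace.
assert missing_tools(run, {"check_eligibility", "lookup_plan_benefit"}) == set()
# "80.0" and 80.00 are the same amount even though the strings differ.
assert amounts_match(run.paid_amount, float("80.0"))
```

In a real suite these helpers would likely be wrapped in pytest fixtures and fed from recorded traces; the half-cent tolerance is an assumption chosen only for the example.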

To be successful in this role, you'll need:

  • Required Skills (The Core Bar)
    • Python SDET Expertise: Expert in Python and pytest, specifically building custom mocking frameworks for external APIs (Vertex AI/ADK).
    • AI/LLM Observability: Hands-on experience with Vertex AI Experiments, Auto-SxS, and Cloud Logging for trace analysis.
    • Data Literacy: Expert-level SQL (BigQuery) and Pandas skills to "diff" massive datasets and identify adjudication discrepancies.
    • Prompt Engineering for QA: Ability to analyze "System Instructions" and refine prompts based on failed test cases to close logic gaps.
    • Architectural Testing: Experience testing multi-layer systems involving RAG (Vertex AI Search), state management (LangGraph), and function calling.
  • Preferred Skills (The "Nice-to-Haves")
    • Healthcare/Claims Domain: Familiarity with claims adjudication concepts (pend reason codes, COB, eligibility, stop-loss).
    • Compliance Knowledge: Understanding of HIPAA/PHI handling and writing test evidence for regulatory bodies (DOL/DOI).
    • Human-in-the-Loop Testing: Experience in "Shadow Mode" monitoring—comparing agent decisions against human expert (MCA) baselines.
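As a rough picture of the "diff" and Shadow Mode work listed above, the sketch below compares agent decisions with human examiner decisions for the same claims. It is only an illustration: the claim IDs and decision labels are made up, and plain dictionaries stand in for the BigQuery tables or Pandas DataFrames a real comparison would likely use.

```python
def diff_decisions(agent: dict, human: dict) -> dict:
    """Return claims where the agent and the human examiner disagree.

    Maps claim ID -> (agent decision, human decision). Only claims present
    in both inputs are compared.
    """
    mismatches = {}
    for claim_id in sorted(agent.keys() & human.keys()):
        if agent[claim_id] != human[claim_id]:
            mismatches[claim_id] = (agent[claim_id], human[claim_id])
    return mismatches


def agreement_rate(agent: dict, human: dict) -> float:
    """Share of jointly reviewed claims where both decisions match."""
    shared = agent.keys() & human.keys()
    if not shared:
        return 0.0
    return 1 - len(diff_decisions(agent, human)) / len(shared)


# Hypothetical shadow-mode sample: four claims reviewed by both sides.
agent_decisions = {"CLM-001": "pay", "CLM-002": "deny", "CLM-003": "pend", "CLM-004": "pay"}
human_decisions = {"CLM-001": "pay", "CLM-002": "pay", "CLM-003": "pend", "CLM-004": "pay"}

print(diff_decisions(agent_decisions, human_decisions))
print(agreement_rate(agent_decisions, human_decisions))
```

Each mismatch is a candidate for examiner review, and the agreement rate gives a single number to track across agent versions.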

Pay Transparency Statement 

This is a hybrid position based out of our Lehi office, with the expectation of being in office at least two weekdays per week.

The actual pay rate offered within the range will depend on factors including geographic location, qualifications, experience, and internal equity. In addition to the salary, you will be eligible for 115,000 stock options and benefits like health insurance, 401k, and paid time off. Learn more about our benefits at https://jobs.collectivehealth.com/benefits/.

Lehi, UT Pay Range
$99,200–$124,000 USD
Plano, TX Pay Range
$109,120–$136,400 USD

Why Join Us?

  • Mission-driven culture that values innovation, collaboration, and a commitment to excellence in healthcare
  • Impactful projects that shape the future of our organization
  • Opportunities for professional development through internal mobility opportunities, mentorship programs, and courses tailored to your interests
  • Flexible work arrangements and a supportive work-life balance

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status. Collective Health is committed to providing support to candidates who require reasonable accommodation during the interview process. If you need assistance, please contact recruiting-accommodations@collectivehealth.com.

Privacy Notice

For more information about why we need your data and how we use it, please see our privacy policy: https://collectivehealth.com/privacy-policy/.


How this Software Engineer salary compares

This role pays $111,600/yr, below the typical range for Software Engineer roles.

[Salary scale chart: $111,480 · median $192,000 · $267,950]

Typical range $155,000–$225,000/yr, from 6,653 comparable Software Engineer listings on JobsRadar (pay annualized to USD).

About Collective Health

Work at Collective Health!

We’re a technology company working to create the healthcare experience we all deserve. Our team brings together a unique mix of experience from technology, design, product, finance, and healthcare to solve the hardest problems that get in the way of today’s healthcare experience. We’re always on the lookout for curious, smart and optimistic new members to join the Collective. 

The Collective Health Story

For a good overview of why we started Collective Health, read our founders’ blog post and watch their video. In a nutshell, we all deserve better so we’re rebuilding everything about how health benefits work for employers and their people.

Founded in 2013, our team of engineers, designers, product managers, and actuaries is redefining the $1 trillion market of employer-sponsored health insurance with data-driven and people-focused products. Some of the forward-thinking companies already using our product include Activision Blizzard, Palantir, and Zendesk. We’re backed by some of the best investors in Silicon Valley, including Google Ventures, Founders Fund, NEA, and Redpoint Ventures.

Who We Are

We believe in transparency, trust, and balance—across the company. We are inclusive, empathetic, and playful in how we approach our work. We put the overall customer experience first: from intuitive product design, to considered engineering, to caring customer support, to explaining things in plain English, to developing data and analytics that actually mean something to our customers and members. 

What We're Building

We are rethinking an industry that hasn’t seen innovation in 30 years—this means our problems are complex and everyone here has interesting challenges to solve. You might be working on building distributed systems up and down our tech stack, creating data analytics and actuarial predictions, managing a product roadmap, designing a part of our service, improving system reliability, building our brand in the market, maintaining our security, or protecting the privacy of our members. We work on big problems with great passion.

 

Sound like a company you’d like to join? Apply today!
