Infrastructure Engineer

About the role

People Culture Talent

Note: We are recruiting on behalf of our valued client. This opportunity is for a position with their organization, not with People Culture Talent. We're excited to help connect talented professionals with this exceptional team!

The Role

Arena Intelligence is the open platform for evaluating how AI models perform in the real world. Born out of UC Berkeley’s SkyLab and the team behind Arena.ai, our leaderboards are the industry’s gold standard for AI model evaluation — trusted by researchers, developers, and enterprises shaping the future of AI.

We’re building API-based services for enterprise customers. We want to provide enterprises with the best of our ML products (like Arena Max), customized to their business.

You’ll own the core infrastructure that turns our research advantage into enterprise products. This is a founding role on the developer/enterprise team. You’ll work directly with the founders and early customers to define what we build and how we build it.

What You’ll Do

• Build API-based products from the ground up. Design and implement low-latency, high-reliability APIs for leaderboards, models, and arenas.

• Solve hard streaming problems. Handle SSE/streaming responses across heterogeneous providers, including partial failure recovery, mid-stream fallback, and consistent response normalization.

• Ship enterprise-grade infrastructure. Build the systems enterprise customers expect: rate limiting, authentication, usage metering, cost attribution, audit logging, and SOC 2 compliance.

• Build deep observability. Instrument infrastructure with distributed tracing, latency breakdowns, token-level usage tracking, and real-time dashboards so customers (and we) can see exactly what’s happening.

• Build AI-centered products. Integrate with our core evaluation platform, Arena data, and customer-specific benchmarks. Collaborate with the research team to turn novel ideas into full-featured products.

• Flex across the stack. Contribute to the backend of our Leaderboards and Evals platforms when needed, helping unify our public and private data architectures.

What We’re Looking For

• 4+ years of backend engineering experience, with meaningful time spent on distributed systems, infrastructure, or developer-facing platforms.

• Strong proficiency in Go and/or Rust, with hands-on experience building high-throughput APIs or proxy/gateway systems.

• Experience with LLM provider APIs (OpenAI, Anthropic, Google, etc.) and a working understanding of the challenges: streaming, token management, rate limits, model-specific quirks.

• Solid cloud infrastructure skills — you’re comfortable with AWS or GCP, Kubernetes, Terraform, and database systems like Postgres and Redis.

• A product-oriented mindset. You think about the developer experience of your APIs, not just the implementation. You ask “why” before “how.”

• Comfort with ambiguity. We’re a startup. Scope is fluid, context shifts, and you’ll wear many hats. That should sound exciting, not stressful.

Nice to Have

• Experience building API gateways, proxies, or developer tools (Bifrost, Kong, Envoy, Tyk, or custom).

• Background in ML infrastructure, model serving, or evaluation frameworks.

• Experience building enterprise-ready features: SSO, RBAC, audit logs, multi-tenancy.

• Familiarity with the modern AI infra stack (vLLM, LiteLLM, LangChain, etc.).


Compensation Band

Their openings span more than one career level. The salary for this role starts at $200k and can range up to $350k USD, plus equity. The salary offered depends on many factors, such as work experience and transferable skills, business needs and impact, and market demands.

Benefits

  • Comprehensive health, dental, vision, and additional support programs.

  • The opportunity to work on cutting-edge AI with a small, mission-driven team.

  • A culture that values transparency, trust, and community impact.

  • Visa sponsorship available.

About Our Client:

Arena Intelligence is redefining what "better" means in AI. Built by researchers from UC Berkeley's SkyLab and backed by Felicis, Andreessen Horowitz, Kleiner Perkins, Lightspeed, and the University of California, this open evaluation platform has become the definitive source for understanding how AI models actually perform in the real world.

With over a million daily users and the trust of every major AI lab — including OpenAI, Google, and Anthropic — their crowdsourced benchmarks and human preference data power the decisions shaping the future of artificial intelligence. Their leaderboards aren't just influential; they're the industry standard.

Behind the platform is a team of researchers, engineers, and builders from UC Berkeley, Google, Stanford, DeepMind, and beyond — people who seek truth, move fast, and care deeply about craftsmanship and impact. They're building a company where deep expertise meets curiosity, and where the work genuinely matters.
