Member of Technical Staff - Compilers

Gimlet Labs · Onsite

About Us

Gimlet is building the next generation of AI infrastructure: large-scale AI datacenters and the orchestration platform that coordinates them.

The future of AI will require vastly more compute than exists today. But as AI workloads become more complex and new hardware architectures emerge, simply deploying more GPUs isn't enough. The challenge is making increasingly diverse compute work together.

Gimlet's platform intelligently partitions and routes workloads across heterogeneous hardware, enabling step-function improvements in performance and efficiency. Customers deploy through production-grade APIs without needing to think about hardware selection, placement, or optimization.

We work with foundation labs, hyperscalers, and AI-native companies to power production workloads at massive scale and help define the infrastructure layer for the future of AI.

About the role

At Gimlet, we believe every hire changes the company.

As a Series A company, talent density matters more than headcount. The engineers we hire today will shape the systems, culture, and standards that define Gimlet for years to come.

The future of AI infrastructure will not be built on a single hardware platform. It will be built on software capable of intelligently orchestrating increasingly heterogeneous compute at unprecedented scale.

Compilers sit at the center of that challenge. The performance gains unlocked at this layer compound across every workload that runs on the platform.

This role is an opportunity to help build the execution stack that transforms modern AI workloads into efficient programs running across diverse hardware architectures.

You will work across compiler infrastructure, runtime systems, scheduling, memory movement, kernel orchestration, and serving optimization to improve how AI workloads are executed in production.

This is not a traditional compiler role.

We are not building a language compiler in isolation.

We are building the systems that determine how AI workloads are partitioned, optimized, scheduled, and executed across the next generation of AI infrastructure.

You'll work on MLIR transformations, execution planning, speculative decoding optimization, heterogeneous scheduling, runtime optimization, and serving infrastructure that powers production AI workloads at scale.

To learn more about the kinds of systems we build, see our work on Corsair and low-latency speculative decoding:

https://gimletlabs.ai/blog/low-latency-spec-decode-corsair

What success looks like

In your first 12-18 months, you will help:

  • Build compiler and runtime infrastructure that improves latency, throughput, and efficiency for large-scale AI inference workloads.

  • Design execution strategies that intelligently partition and coordinate workloads across heterogeneous hardware.

  • Develop compiler optimizations spanning IR transformations, scheduling, memory movement, and kernel orchestration.

  • Enable new model architectures and serving techniques to run efficiently in production environments.

  • Influence the architecture of an execution platform that will help define how AI workloads are deployed over the next decade.

You may be a good fit if you have

  • Strong systems and performance engineering fundamentals

  • Experience building compiler systems, compiler-adjacent infrastructure, or execution/runtime systems

  • Experience implementing IR transformations, compiler passes, lowering logic, or code generation systems

  • Ability to reason about execution behavior, memory systems, scheduling, and hardware efficiency

  • Strong software engineering skills in C++ and/or Python

Strong candidates may also have

  • Experience with MLIR, LLVM, XLA, TVM, Triton, or similar compiler/runtime infrastructure

  • Experience optimizing ML inference or serving workloads

  • Familiarity with runtime systems, kernel dispatch, launch APIs, or memory allocators

  • Experience working with GPUs, AI accelerators, or heterogeneous hardware systems

  • Experience profiling and debugging performance-critical systems

  • Familiarity with scheduling, partitioning, or kernel-level optimizations

What makes Gimlet different

Most AI infrastructure companies are focused on deploying more compute.

We are focused on making increasingly diverse compute work together.

We are not building another cloud platform.

We are building the orchestration layer for the future of AI infrastructure.

We believe the next decade of AI will be defined not only by better hardware, but by the software systems that determine how workloads are executed across that hardware.

The compiler, runtime, and orchestration systems we build today will help define how AI workloads are deployed for years to come.

As an early member of our team, you will have significant ownership, work alongside highly technical engineers, and help shape both the systems we build and how we scale the company.
