AI Inference Engineer

About the role

quadric, Inc

Quadric has created an innovative general purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are designed to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today, which can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.

Role

The AI Inference Engineer at Quadric is the key bridge between AI/LLM models and Quadric's unique platforms. The engineer will (1) port AI models to the Quadric platform; (2) optimize model deployment for efficient inference; and (3) profile and benchmark model performance. This senior technical role demands deep knowledge of AI model algorithms, system architecture, and AI toolchains and frameworks.

This California Bay Area based role follows a hybrid schedule, with at least two in-office days per week at our Burlingame office, the ability to commute regularly, and occasional additional onsite days as needed based on team and business priorities. The team and company also gather periodically for onsite meetings and offsite events, which are valued opportunities to connect, collaborate, and align.

Responsibilities

  • Quantize, prune and convert models for deployment
  • Port models to the Quadric platform using the Quadric toolchain
  • Optimize inference deployment for latency and speed
  • Benchmark and profile model performance and accuracy
  • Collaborate across related areas of the AI inference stack to support team and business priorities
  • Develop tools to scale and speed up deployment
  • Improve the SDK and runtime
  • Provide technical support and documentation to customers and the developer community

Requirements

  • Bachelor’s or Master’s in Computer Science and/or Electrical Engineering
  • 5+ years of experience in AI/LLM model inference and deployment frameworks/tools
  • Experience with model quantization (PTQ, QAT) and related tools
  • Experience with model accuracy measures
  • Experience with model inference performance profiling
  • Experience with at least one of the following frameworks: ONNX Runtime, PyTorch, vLLM, Hugging Face Transformers, Neural Compressor, llama.cpp
  • Proficiency in C/C++ and Python
  • Demonstrated strong problem-solving, debugging, and communication skills

Benefits

At Quadric, we value Integrity, Humility, and Happiness. What we expect from one another is simple and clear: Initiative, Collaboration, and Completion. We are a collaborative team focused on building something extraordinary in the edge computing space.

  • Competitive salary and meaningful equity
  • Medical, dental, and vision plans starting on day one
  • 401(k) retirement plan
  • Flexible paid time off (unlimited, non-accrual) to support work-life balance
  • When working in-office, enjoy company-provided lunches and a stocked kitchen
  • Convenient office location within walking distance of the Caltrain station
  • Support for commuting, including monthly parking or Caltrain passes
  • Downtown Burlingame office location, close to shops, cafes, and local amenities
  • A politics-free, highly collaborative environment where talented people can do their best work and make an immediate impact
  • The opportunity to build long-term career relationships in a company that values strong personal connections alongside professional excellence

The base salary range for this position is $110,000 to $270,000. This range reflects the full span of levels and geographies at which Quadric hires for this role. The actual base salary offered will depend on a number of factors, including the specific level of the role, years and depth of relevant experience, technical skills and competencies, the criticality of the role to the business, internal equity, and work location. In addition to base salary, this role is eligible for equity and a discretionary annual performance bonus as applicable to the role and level. 

Quadric also offers the generous benefits package outlined above and other programs designed to support your health and wellbeing.

Founded in 2016 and based in downtown Burlingame, California, Quadric is building the world’s first supercomputer designed for the real-time needs of edge devices. Quadric aims to empower developers in every industry with superpowers to create tomorrow’s technology, today. The company was co-founded by technologists from MIT and Carnegie Mellon, who were previously the technical co-founders of the Bitcoin computing company 21.

Quadric is proud to be an equal opportunity employer. We are committed to creating an inclusive environment where people from all backgrounds can do their best work. We consider all qualified applicants without regard to race, color, religion, sex, gender identity or expression, sexual orientation, national origin, age, disability, veteran status, or any other protected characteristic under applicable law.

If this role resonates with you, we encourage you to apply even if your experience does not perfectly match every qualification. We value potential, curiosity, and a willingness to learn just as much as direct experience. Skills and growth come in many forms, and we would love to hear your story.

By submitting an application, you acknowledge that Quadric will collect and process your personal information as part of the hiring process. Please review our Privacy Policy to understand how we handle your data.


