Navan: Senior Machine Learning Engineer - LLMs & Self-Hosted AI

About the role

Navan

We are looking for a highly skilled Senior ML Engineer to lead our transition from third-party LLM APIs to a fully self-hosted ecosystem by fine-tuning high-performance, domain-specific models.

Our core product is an advanced, agentic support chatbot capable of complex reasoning, API tool calling, database lookups, and orchestrating specialized LLMs for specific tasks. 

What You’ll Do:

  • Model Fine-Tuning: Design and execute fine-tuning strategies to improve model accuracy on specific domain tasks and tool-calling execution.
  • Agentic Workflows: Develop and refine the chatbot's agentic capabilities, ensuring reliable tool use, routing, and interaction between large LLMs and specialized small language models (SLMs).
  • Inference Optimization: Deploy and manage large-scale models using high-performance inference engines (like vLLM) to ensure low latency and high throughput for our agentic chatbot.
  • Rigorous Evaluation: Build comprehensive offline and online evaluation frameworks to continuously measure model performance, and quantify business impact through structured A/B testing.

What We’re Looking For:

Core Engineering & AI Frameworks

  • Deep experience with PyTorch and the Hugging Face ecosystem.
  • Strong data engineering skills: data manipulation, synthetic data generation, and active learning (e.g., margin sampling).
  • High proficiency with AI-assisted development workflows (e.g., Claude Code, Cursor, Codex) to accelerate development.

LLMs & Agents

  • Strong fundamental understanding of LLM architectures, attention mechanisms, and generation parameters.
  • Hands-on experience building agentic systems (ReAct, function/tool calling, RAG).
  • Expertise in fine-tuning strategies (e.g., SFT, RLHF, DPO) and parameter-efficient techniques (PEFT/LoRA).

Bonus Points

  • Alignment Techniques: Experience with RLHF and DPO strategies for future reasoning-model development.
  • Containerization & Orchestration: Experience with Ray for orchestrating large-scale model deployments across multi-GPU clusters.
  • Model Quantization: Experience with quantization methods and formats such as AWQ, GPTQ, or GGUF to reduce memory footprint and fit 70B-parameter models onto available hardware.
  • API Development: Proficiency in building robust, asynchronous microservices using FastAPI to serve model requests.
  • MLOps: Experience with core MLOps practices, including dataset versioning (e.g., DVC), experiment tracking (e.g., Weights & Biases, MLflow), and model registries.