AI Engineer

About the role

Cube · Onsite

As one of our early AI Engineering hires, you'll help define what AI at Cube looks like. You'll build the AI features people actually use, from our self-hosted chat interface and MCP server to retrieval pipelines, prompts, evaluations, and integrations with internal systems. You'll work closely with our Infrastructure and Data Engineering teams to design architecture, connect systems, and turn emerging AI capabilities into practical products and tools that solve real problems every day.

  • Maintain and tune our self-hosted chat interface, including model connections, MCP integration, and RAG/knowledge base setup
  • Build the RAG pipeline: ingestion, chunking, embeddings, vector store, retrieval, reranking, and evaluation
  • Integrate LiteLLM or OpenRouter as the gateway; handle routing, fallbacks, rate limits, and cost tracking
  • Maintain and configure the MCP server and the tools it exposes to the model
  • Write prompts and evaluations, and iterate on them based on real usage and failure cases
  • Monitor the logging, tracing, and guardrails of our AI platforms and models
  • Good to have: exposure to working with an MLOps/Platform team to deploy self-hosted models (vLLM, TGI, Ollama) and keep them healthy
  • Ship features end-to-end: API, retrieval, prompt, evaluation, and rollout

Requirements

  • 4+ years of software engineering experience
  • Familiarity with containerized technologies and orchestration platforms such as Kubernetes
  • Strong interest in AI, LLMs, and the rapidly evolving model ecosystem
  • 1+ years of experience building, deploying, or supporting production LLM systems (RAG, agents, or fine-tuned models)
  • Experience deploying and configuring self-hosted LLM chat interfaces (Open WebUI preferred; similar platforms are acceptable)
  • Hands-on experience with retrieval and RAG systems, including embeddings, vector databases, chunking strategies, hybrid search, and evaluation methodologies
  • Experience working with LLM gateways or routing layers such as LiteLLM, OpenRouter, Portkey, or similar solutions
  • Experience serving open-weight models using tools such as vLLM, TGI, or SGLang
  • Experience designing and implementing secure integrations between LLMs and internal business systems
  • Nice to have: Experience with or understanding of MCP servers, agent frameworks, or tool-calling architectures
  • Nice to have: Experience with or understanding of LLM observability and monitoring platforms such as LangSmith, Langfuse, or similar tools