About the role

Makro PRO · Onsite

The Senior AI Engineer owns the enterprise LLM platform substrate that powers every generative-AI consumer across the organisation. This role designs, builds, and operates the LLM Gateway, the evaluation framework, and the AI best-practices playbook that engineers, business teams, and product squads depend on to ship trustworthy AI features at scale.

Key Responsibilities

Design, build, and operate the enterprise LLM Gateway — provider abstraction, authentication, rate limiting, per-use-case cost tracking, prompt logging for audit, and model routing across major providers (Azure OpenAI, AWS Bedrock, Anthropic, Vertex AI).
Operationalise an evaluation framework (Langfuse or equivalent) — tracing, eval scores, human feedback loops — as a platform service consumed by every production GenAI consumer.
Define and enforce evaluation rubrics (accuracy, groundedness, hallucination rate, latency, cost per inference) and embed regression gates in CI so no GenAI consumer ships without a working eval harness.
Ship and operate the GenAI cost dashboard with per-use-case attribution and quarterly forecasts to leadership; drive cost-optimisation initiatives (caching, prompt compression, model routing).
Partner with Data Governance on AI-model governance evidence (audit-log schema, PII redaction proofs, model routing controls) to support legal and compliance approvals for GenAI at scale (Thailand PDPA, regional AI frameworks).
Partner with platform engineering on Vector Search and embedding model selection, retrieval relevance tuning, chunking strategies, and reranker layers.
Author and publish the AI Best Practices Playbook, the standard that engineering teams across the company use when shipping LLM features; mentor engineers and review their LLM work against it.
Own GenAI platform service-level objectives — availability, latency, cost ceilings — and lead incident response for production GenAI consumers.

Requirements

Bachelor's or Master's degree in Computer Science, AI/ML, Data Science, or a related discipline.
6+ years of software engineering experience, including at least 2 years shipping production LLM platform systems (gateways, evaluations, cost metering, multi-model routing).
Strong Python skills for production-service development (FastAPI, async, observability, tests).
Hands-on production experience with at least one major LLM provider (Azure OpenAI, AWS Bedrock, Anthropic, Vertex AI), including cost and latency optimisation.
Eval-driven LLM development discipline — golden sets, LLM-as-judge, regression gates in CI, multi-step conversation replay.
Solid grounding in prompt-injection defence, data-leakage prevention, and PII handling for LLM systems.
Cloud production experience (Azure preferred; AWS/GCP transferable) and Git-based CI/CD; comfortable owning service contracts and SLOs.
Excellent written and verbal communication; able to author technical design documents and influence engineering peers and business stakeholders.

Preferred Qualifications

Experience with LLM gateways (Portkey, LiteLLM, Kong AI) or having built one in-house.
Fine-tuning experience (LoRA / QLoRA) and open-weight model deployment (vLLM, TGI).
Thai-language NLP exposure (PyThaiNLP, WangchanBERTa, SEA-LION, Typhoon) and retail / commerce data context (POS, catalog, CRM, loyalty).
Vendor or industry certifications such as Databricks Generative AI Engineer, Azure AI Engineer Associate, or a comparable credential.
