Cast AI is the leading Application Performance Automation (APA) platform, enabling customers to cut cloud costs, improve performance, and boost productivity – automatically.
Built originally for Kubernetes, Cast AI goes beyond cost and observability by delivering real-time, autonomous optimization across any cloud environment. The platform continuously analyzes workloads, rightsizes resources, and rebalances clusters without manual intervention, ensuring applications run faster, more reliably, and more efficiently.
Headquartered in Miami, Florida, Cast AI has employees in more than 32 countries worldwide and supports some of the world’s most innovative teams running their applications on all major cloud, hybrid, and on-premises environments. Over 2,100 companies already rely on Cast AI, from BMW and Akamai to Hugging Face and NielsenIQ.
What’s next? Backed by our $108M Series C, we’re doubling down on making APA the new standard for DevOps and MLOps, and everything in between.
Cast AI is an automation platform that operates cloud-native and AI infrastructure at scale. By embedding autonomous decision-making directly into Kubernetes and cloud environments, Cast AI continuously optimizes performance, reliability, and efficiency in production.
The old way doesn't work. As Kubernetes and AI environments grow, manual decisions don’t. Cast AI replaces tickets, alerts, and manual tuning with continuous automation that adapts infrastructure as conditions change. Efficiency and cost savings follow naturally from that automation.
Over 2,100 companies already rely on Cast AI, including Akamai, BMW, Cisco, FICO, Hugging Face, NielsenIQ, Swisscom, and TGS.
Global team, diverse perspectives
We're headquartered in Miami, but our impact is international. We take a global and intentional approach to diversity. Today, Cast AI operates across 34 countries spanning Europe, North America, Latin America, and APAC, bringing a wide range of perspectives into how we build and lead.
Unicorn momentum
In January 2026, we achieved unicorn status with a strategic investment from Pacific Alliance Ventures, the corporate venture arm of Shinsegae Group (a $50+ billion Korean conglomerate). Our valuation now exceeds $1 billion, and we're just getting started.
Join us as we build the future of autonomous infrastructure.
Throughput. Latency. KV cache utilization.
Move those three numbers in the right direction, and two things happen: customers get faster, cheaper inference, and our margins improve. That's the entire thesis of this role. Every kernel you tune, every quantization scheme you ship, every scheduler tweak you land shows up directly in a customer's p99 and on our P&L.
This is a high-impact seat. It is also a high-autonomy seat: you'll have room to lead the technical direction of inference optimization at Kimchi rather than execute someone else's roadmap.
The problem: running LLMs in production is a moving target. The "right" model and serving configuration for a workload depend on traffic shape, sequence-length distribution, batch dynamics, GPU SKU, memory bandwidth, quantization tolerance, and a dozen other variables that shift week to week. Most teams pick a model once, over-provision GPUs, and absorb the cost. Kimchi is the system that makes that decision automatically - continuously matching workloads to the most cost-efficient, best-performing LLM and serving configuration on a customer's infrastructure. We're building the optimization layer between the model and the hardware, and we need engineers who understand both sides deeply.
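As a rough illustration of the matching decision described above, the sketch below picks the cheapest serving configuration that still meets a latency target. The `ServingConfig` fields, the candidate numbers, and `pick_config` are hypothetical; this is not Kimchi's actual data model or algorithm, which would weigh many more variables.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServingConfig:
    # All fields are illustrative assumptions, not a real Kimchi schema.
    name: str
    p99_latency_ms: float      # measured p99 latency for the workload
    tokens_per_second: float   # measured throughput on one GPU
    gpu_cost_per_hour: float   # hourly price of the GPU SKU


def cost_per_million_tokens(cfg: ServingConfig) -> float:
    """Hourly GPU cost divided by tokens served per hour, scaled to 1M tokens."""
    tokens_per_hour = cfg.tokens_per_second * 3600
    return cfg.gpu_cost_per_hour / tokens_per_hour * 1_000_000


def pick_config(candidates: list[ServingConfig], p99_slo_ms: float) -> Optional[ServingConfig]:
    """Return the cheapest candidate whose p99 latency meets the SLO, or None."""
    eligible = [c for c in candidates if c.p99_latency_ms <= p99_slo_ms]
    return min(eligible, key=cost_per_million_tokens, default=None)


# Made-up measurements for three hypothetical configurations.
candidates = [
    ServingConfig("fp16-large-gpu", p99_latency_ms=400, tokens_per_second=2000, gpu_cost_per_hour=4.0),
    ServingConfig("int8-small-gpu", p99_latency_ms=700, tokens_per_second=1500, gpu_cost_per_hour=1.5),
    ServingConfig("int4-small-gpu", p99_latency_ms=1200, tokens_per_second=2500, gpu_cost_per_hour=1.5),
]
best = pick_config(candidates, p99_slo_ms=800)
print(best.name if best else "no config meets the SLO")
```

With an 800 ms target, the int4 configuration is filtered out on latency even though it has the lowest cost per token, which is the kind of trade-off that shifts as traffic shape and hardware change.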
Stack
Python; vLLM; SGLang; TensorRT-LLM; PyTorch; CUDA-adjacent tooling; Kubernetes; gRPC; ClickHouse; PostgreSQL; GCP Pub/Sub; AWS / GCP / Azure; GitLab CI; ArgoCD; Prometheus; Grafana; Loki; Tempo.
As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
Please note that Cast AI does not provide any form of visa sponsorship/work permit.
#LI-Remote
Kimchi is an open-source AI inference platform built for teams running serious agentic coding workloads. We're building the harness, the routing, and the infrastructure layer that makes running your own AI coding stack possible - and affordable. We're early, moving fast, and the developer community is central to how we grow.
*As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
*Please note that Cast AI does not provide any form of visa sponsorship/work permit.
#LI-Remote
This is a location-specific opportunity. We are currently accepting applications from candidates residing in the following European countries: Bulgaria, Croatia, Estonia, Greece, Hungary, Latvia, Lithuania, Poland, Romania, Slovakia, Slovenia, and Ukraine.
You'll join one of several teams building the low-level systems behind Cast AI's multi-cloud automation platform. The work sits at the intersection of Kubernetes, cloud infrastructure, and Linux.
Depending on your strengths, you'll work in one of these areas:
Common across all teams: Go, deep Kubernetes, multi-cloud, building beyond what the ecosystem offers.
How we build
We invest heavily in agentic development and AI-powered tooling. Engineers work with code agents and automated workflows daily. We expect you to shape how these tools evolve.
What would make you stand out
*As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
*Please note that Cast AI does not provide any form of visa sponsorship/work permit.
#LI-Remote
Kimchi is the AI platform inside Cast AI. We started by helping companies run LLMs on their own Kubernetes clusters - now we're building the execution layer where agents do real work.
Our infrastructure today: multi-model inference (MiniMax, Kimi, GLM-5, Nemotron, DeepSeek) with intelligent routing, an OpenAI-compatible API, and deployment flexibility from our GPUs to your VPC. The inference layer is the foundation. What we're hiring for sits on top of it: coding agents, agent runtimes, orchestration systems, and the reliability engineering that makes them actually finish things.
Tech Stack: TypeScript, Go, Kubernetes, AWS/GCP/Azure, MCP, Prometheus/Grafana/Loki, GitLab CI, ArgoCD.
Why harness engineering matters here
OpenAI and Anthropic ship models. They also ship one harness each - the scaffolding that turns a raw model into something that can plan, execute, recover, and complete work. We ship a different kind of harness: one built for cost-conscious, long-horizon autonomy, running on inference infrastructure we control end-to-end.
A decent model with a great harness beats a great model with a bad harness. We've watched this play out. The gap between what today's models can do and what you see them doing is largely a harness gap - and that gap is where we operate.
What you'll build
The ratchet.
Every time our agent makes a mistake, we engineer a solution so it never makes that mistake again. That means hooks that enforce constraints the model "knows" but forgets: pre-commit lint checks, permission gates, context compaction before the window fills. Success is silent, failures are verbose.
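One way to picture such hooks is as small checks that run before an agent action and report only when something is wrong. The sketch below is a minimal illustration under that assumption; the hook names, the action dictionary shape, and the 90% threshold are all hypothetical and not taken from the Kimchi harness.

```python
from typing import Callable, Optional

# A hook inspects a proposed action and returns an error message, or None if it passes.
Hook = Callable[[dict], Optional[str]]


def no_force_push(action: dict) -> Optional[str]:
    # Hypothetical permission gate: block a destructive git command.
    if action.get("tool") == "shell" and "push --force" in action.get("command", ""):
        return "permission gate: force-push is not allowed; push to a branch instead"
    return None


def context_budget(action: dict) -> Optional[str]:
    # Hypothetical compaction trigger: stop when the context window is nearly full.
    if action.get("context_tokens", 0) > 0.9 * action.get("context_limit", 1):
        return "context is over 90% full; compact before continuing"
    return None


def run_hooks(action: dict, hooks: list[Hook]) -> list[str]:
    """Run every hook and return only the failure messages (success is silent)."""
    return [msg for hook in hooks if (msg := hook(action)) is not None]


hooks = [no_force_push, context_budget]
ok = run_hooks(
    {"tool": "shell", "command": "git status", "context_tokens": 1_000, "context_limit": 100_000},
    hooks,
)
bad = run_hooks(
    {"tool": "shell", "command": "git push --force", "context_tokens": 95_000, "context_limit": 100_000},
    hooks,
)
print(ok)
for message in bad:
    print(message)
```

Each new failure mode becomes one more function in the list, so the set of enforced constraints only grows.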
Long-horizon execution.
Our harness is built around spec-driven autonomy: meta-prompting, fresh context per task, worktree-per-slice git strategy, automatic replanning, crash recovery, stuck detection. We're implementing Ralph loops - when the model tries to exit, we intercept and reinject the goal into a fresh context. The agent reads state from disk and continues. Multi-session, multi-day work, without context rot.
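The loop described above can be sketched roughly as follows: state lives on disk, every session starts from a fresh context, and an early exit simply leads to another session with the same goal. `run_session`, the state file layout, and the stub agent are hypothetical stand-ins, not the actual harness.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

# `run_session` stands in for one model session with a fresh context (hypothetical).
# It receives the goal and the persisted state and returns the updated state.
Session = Callable[[str, dict], dict]


def load_state(path: Path) -> dict:
    """Read state from disk, or start empty if nothing has been saved yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"done": False, "completed": []}


def ralph_loop(goal: str, state_path: Path, run_session: Session, max_sessions: int = 10) -> dict:
    """Re-run fresh sessions against on-disk state until the goal is marked done."""
    for _ in range(max_sessions):
        state = load_state(state_path)
        if state["done"]:
            return state
        # The session may stop early; persist what it did, then loop again with
        # the same goal injected into a new context.
        state = run_session(goal, state)
        state_path.write_text(json.dumps(state))
    return load_state(state_path)


def fake_session(goal: str, state: dict) -> dict:
    # Stub agent: completes one task per session, then "exits".
    tasks = ["plan", "implement", "test"]
    remaining = [t for t in tasks if t not in state["completed"]]
    completed = state["completed"] + remaining[:1]
    return {"done": len(completed) == len(tasks), "completed": completed}


with tempfile.TemporaryDirectory() as tmp:
    final = ralph_loop("ship the feature", Path(tmp) / "state.json", fake_session)
    print(final)
```

Because nothing is carried over in memory between iterations, a crash between sessions loses at most the work of the session in progress.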
Planner/executor splits.
Planning with a reasoning model, executing with a fast one, evaluating with a third. Separating generation from evaluation beats self-verification because agents reliably skew positive when grading their own work.
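A minimal sketch of that split, assuming each model is just a function from prompt to text: the planner runs once, the executor produces a result, and a separate evaluator decides whether it passes. The `Model` type, the prompt wording, and the stub models are illustrative assumptions.

```python
from typing import Callable

Model = Callable[[str], str]  # hypothetical interface: prompt in, text out


def run_task(goal: str, planner: Model, executor: Model, evaluator: Model,
             max_attempts: int = 3) -> tuple[str, int]:
    """Plan once, then execute and have a separate model grade each attempt."""
    plan = planner(f"Write a step-by-step plan for: {goal}")
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        result = executor(f"Plan:\n{plan}\nFeedback from last attempt:\n{feedback}")
        verdict = evaluator(f"Goal: {goal}\nResult:\n{result}\nReply PASS or FAIL: <reason>")
        if verdict.startswith("PASS"):
            return result, attempt
        feedback = verdict  # the executor never grades its own output
    raise RuntimeError(f"no passing result after {max_attempts} attempts: {feedback}")


# Stub models so the sketch runs without any API calls.
def stub_planner(prompt: str) -> str:
    return "1. write code\n2. run tests"


def stub_executor(prompt: str) -> str:
    # Produces a better result once it has seen evaluator feedback.
    return "fixed" if "FAIL" in prompt else "draft"


def stub_evaluator(prompt: str) -> str:
    return "PASS" if "Result:\nfixed" in prompt else "FAIL: result is still a draft"


result, attempts = run_task("add a health check endpoint", stub_planner, stub_executor, stub_evaluator)
print(result, attempts)
```

In practice the three callables would wrap different models, for example a reasoning model for `planner` and a faster one for `executor`.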
The harness surface.
CLI, TUI, MCP integration, sandboxed execution, telemetry. Our AGENTS.md is short - every line traces to a specific thing that went wrong. TypeScript on the surface, Go where it matters.
Memory and context.
Moving agents off laptops, giving them state that survives across sessions, managing context so information lands where it's actionable. Compaction, tool-call offloading, progressive skill disclosure.
What makes this different (with receipts)
You've seen the pitch: "we route to the best model." Everyone says that. Here's what we actually have:
What success looks like (after 6 months):
This is a location-specific opportunity. We are currently accepting applications from candidates residing in the following European countries: Bulgaria, Croatia, Estonia, Greece, Hungary, Latvia, Lithuania, Poland, Romania, Slovakia, Slovenia, and Ukraine.
*As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
*Please note that Cast AI does not provide any form of visa sponsorship/work permit.
#LI-Remote
This is a location-specific opportunity. We are currently accepting applications from candidates residing in the following European countries: Bulgaria, Croatia, Estonia, Greece, Hungary, Latvia, Lithuania, Poland, Romania, Slovakia, Slovenia, and Ukraine.
As a Senior Software Engineer, you will have the opportunity to work on different key features of our product. We are currently hiring Senior Software Engineers for the following teams:
- Workload Optimization - Automates workload resource management by dynamically adjusting resource allocations, helping developers significantly reduce costs and improve application reliability.
- Karpenter - The Karpenter team powers the integration between Karpenter and Cast AI, bringing enterprise capabilities to the most popular open source Kubernetes autoscaler. We enhance Karpenter with advanced features that improve application reliability and performance while optimizing costs. By joining the team, you’ll bridge open source innovation with enterprise requirements, directly impacting how organizations run Karpenter at scale.
- Reporting - Builds a scalable reporting system that ingests millions of rows per second into our time-series databases, providing insights into cost savings, workload efficiencies, and Cast AI automation impact.
- Pricing - Drives the synchronization of public and customer cloud resources, availability, and dynamic pricing across all major cloud providers. Empowers autoscaling by leveraging discounts, commitments, and cross-cluster tracking to maximize savings. Provides a reliable source of truth for node pricing, resources, components, discounts, and commitments.
- Autoscaler - Automates Kubernetes node autoscaling to optimize clusters, balance workloads, remove underutilized nodes, and dynamically allocate capacity in real time, thereby reducing cluster costs by half.
- Identity - Builds and maintains the trust and access foundation for the entire platform, ensuring every user, service, and workload authenticates and interacts securely and seamlessly at scale.
- Billy - Powers the critical day-2 operations layer of the platform - from billing and audit trails to notifications and feature flags - ensuring the platform runs reliably, transparently, and at scale for every customer, every day.
*As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
*Please note that Cast AI does not provide any form of visa sponsorship/work permit.
#LI-Remote
As Product Marketing Manager for Cast AI’s Core Kubernetes Optimization Platform, you’ll lead the go-to-market strategy for our flagship offering. You’ll translate powerful technology into clear value for technical audiences, helping DevOps, SREs, and platform engineers understand how Cast AI automatically improves performance, resilience, and cost efficiency.
This is a highly technical and strategic role where you’ll act as the voice of the product in the market and the voice of the customer internally.
*As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
*Please note that Cast AI does not provide any form of visa sponsorship/work permit.
#LI-Remote
Databases are notoriously hard to optimize: with 350+ configuration parameters in PostgreSQL alone, even experienced engineers struggle to find the right settings. Have you ever dreamed of a database that continuously optimizes itself in the background, so you can focus on what really matters: building great products?
Sometimes your queries slow down and it's not immediately clear why. Imagine a system that proactively detects performance degradation, analyzes the root cause, and automatically improves your queries and indexes for you, creating pull requests with the exact optimizations needed.
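The detection step could, in its simplest form, compare recent query latencies against a baseline. The sketch below assumes per-query latency samples are already available (for example, collected from PostgreSQL's `pg_stat_statements` extension); the function name, the 1.5x threshold, and the sample data are hypothetical and do not describe the product's actual method.

```python
from statistics import median


def find_regressions(baseline: dict[str, list[float]],
                     recent: dict[str, list[float]],
                     threshold: float = 1.5) -> dict[str, float]:
    """Return {query: slowdown ratio} for queries whose median latency grew past the threshold."""
    regressions = {}
    for query, recent_ms in recent.items():
        base_ms = baseline.get(query)
        if not base_ms or not recent_ms:
            continue  # no history to compare against
        ratio = median(recent_ms) / median(base_ms)
        if ratio >= threshold:
            regressions[query] = ratio
    return regressions


# Made-up latency samples in milliseconds.
baseline = {
    "SELECT * FROM orders WHERE user_id = $1": [12, 11, 13, 12],
    "SELECT count(*) FROM events": [80, 85, 82],
}
recent = {
    "SELECT * FROM orders WHERE user_id = $1": [36, 35, 37, 36],
    "SELECT count(*) FROM events": [84, 81, 86],
}
print(find_regressions(baseline, recent))
```

Using the median rather than the mean keeps a single slow outlier from flagging a query; root-cause analysis and index suggestions would build on top of a signal like this.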
Cast AI is seeking a skilled senior software engineer with expertise in databases and an interest in understanding the underlying technology that makes them work. Knowledge of implementation details around transactions, wire protocols, WAL, indexes, query planning, and other fundamental database features will enable you to build a product that automates what's traditionally been manual and complex.
You'll work on a greenfield project with high impact, creating intelligent systems that make database optimization effortless for our customers.
This is a location-specific opportunity. We are currently accepting applications from candidates residing in the following European countries: Bulgaria, Croatia, Estonia, Greece, Hungary, Latvia, Lithuania, Poland, Romania, Slovakia, Slovenia, and Ukraine.
#LI-Remote