Senior ML Engineer - Kimchi (LLM Inference Optimization)

Cast AI · Austria; France; Germany; Italy; Netherlands; Poland; Spain; United Kingdom

Technology European Union Posted May 8, 2026

Why Cast AI?

Cast AI is an automation platform that operates cloud-native and AI infrastructure at scale. By embedding autonomous decision-making directly into Kubernetes and cloud environments, Cast AI continuously optimizes performance, reliability, and efficiency in production.
The old way doesn't work. As Kubernetes and AI environments grow, manual decisions don’t. Cast AI replaces tickets, alerts, and manual tuning with continuous automation that adapts infrastructure as conditions change. Efficiency and cost savings follow naturally from that automation.
Over 2,100 companies already rely on Cast AI, including Akamai, BMW, Cisco, FICO, HuggingFace, NielsenIQ, Swisscom, and TGS.

Global team, diverse perspectives

We're headquartered in Miami, but our impact is international. We take a global and intentional approach to diversity. Today, Cast AI operates across 34 countries spanning Europe, North America, Latin America, and APAC, bringing a wide range of perspectives into how we build and lead.

Unicorn momentum

In January 2026, we achieved unicorn status with a strategic investment from Pacific Alliance Ventures, the corporate venture arm of Shinsegae Group (a $50+ billion Korean conglomerate). Our valuation now exceeds $1 billion, and we're just getting started.

Join us as we build the future of autonomous infrastructure.

About the role

Throughput. Latency. KV cache utilization.

Move those three numbers in the right direction, and two things happen: customers get faster, cheaper inference, and our margins improve. That's the entire thesis of this role. Every kernel you tune, every quantization scheme you ship, every scheduler tweak you land shows up directly in a customer's p99 and on our P&L.
This is a high-impact seat. It is also a high-autonomy seat: you'll be given room to lead the technical direction of inference optimization at Kimchi, not execute someone else's roadmap.

The problem: running LLMs in production is a moving target. The "right" model and serving configuration for a workload depend on traffic shape, sequence-length distribution, batch dynamics, GPU SKU, memory bandwidth, quantization tolerance, and a dozen other variables that shift week to week. Most teams pick a model once, over-provision GPUs, and absorb the cost. Kimchi is the system that makes that decision automatically - continuously matching workloads to the most cost-efficient, best-performing LLM and serving configuration on a customer's infrastructure. We're building the optimization layer between the model and the hardware, and we need engineers who understand both sides deeply.

Stack

Python; vLLM; SGLang; TensorRT-LLM; PyTorch; CUDA-adjacent tooling; Kubernetes; gRPC; ClickHouse; PostgreSQL; GCP Pub/Sub; AWS / GCP / Azure; GitLab CI; ArgoCD; Prometheus; Grafana; Loki; Tempo.

Requirements:

5+ years building real ML systems, with a portfolio that shows depth in inference or training infrastructure (not just model training notebooks).
Strong Python - production services, not scripts.
Hands-on experience with at least one of vLLM, SGLang, or TensorRT-LLM, and a working mental model of why an inference engine performs the way it does on a given GPU.
Fluency with quantization tradeoffs - you've measured quality regressions, not just compression ratios.
Comfort with distributed systems: collective communication, sharding strategies, and the practical failure modes of multi-GPU and multi-node setups.
A bias toward measurement. You instrument before you optimize, and you can tell the difference between a real win and a benchmark artifact.
Self-direction. This role comes with a wide mandate; you should be excited by that, not unsettled by it.

Responsibilities:

Push throughput. Continuous batching, speculative decoding, chunked prefill, kernel-level tuning across vLLM, SGLang, and TensorRT-LLM. Find the ceiling on each GPU SKU, then raise it.
Cut latency. Attack TTFT and TPOT separately. Profile, identify the actual bottleneck (compute, memory bandwidth, scheduling, networking), and fix it - not the bottleneck you assumed.
Get more out of the KV cache. Paged attention, prefix caching, eviction policies, cache reuse across requests, quantized KV. This is where a lot of the unrealized throughput lives, and it's an area you'll own.
Quantize without regressing quality. INT8, INT4, FP8 across weights, activations, and KV. Empirical work: measure quality on real workloads, not just perplexity benchmarks.
Shrink cold starts and memory footprint. Faster init, smarter weight loading, tighter memory accounting - the difference between a model that scales and one that doesn't.
Scale across nodes. Distributed inference topologies, network-aware placement, checkpointing strategies that don't bottleneck on storage or interconnect.
Set the technical direction. Decide what we benchmark, what we adopt, and what we build ourselves. Bring the team along with strong writeups and reproducible experiments.
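
To make the TTFT/TPOT split in the latency item above concrete, here is a minimal measurement sketch. It assumes you have logged the time a request was sent and the arrival time of each streamed output token. The helper name `latency_metrics` and its inputs are hypothetical, not part of vLLM, SGLang, or TensorRT-LLM, and TPOT is taken here as the mean time per output token after the first.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class LatencyMetrics:
    ttft_s: float  # time to first token, in seconds
    tpot_s: float  # mean time per output token after the first, in seconds


def latency_metrics(request_sent_s: float, token_times_s: Sequence[float]) -> LatencyMetrics:
    """Compute TTFT and TPOT from per-token arrival timestamps (hypothetical helper)."""
    if not token_times_s:
        raise ValueError("need at least one output token")
    # TTFT covers queueing plus prefill: request sent until the first token arrives.
    ttft = token_times_s[0] - request_sent_s
    n_decode = len(token_times_s) - 1
    # TPOT covers the decode phase only; with one token there is nothing to average.
    tpot = (token_times_s[-1] - token_times_s[0]) / n_decode if n_decode else 0.0
    return LatencyMetrics(ttft_s=ttft, tpot_s=tpot)


# Five tokens: the first after 0.5 s, then one every 0.25 s.
m = latency_metrics(0.0, [0.5, 0.75, 1.0, 1.25, 1.5])
print(m)
```

Keeping the two numbers separate matters because they tend to point at different bottlenecks: TTFT is usually dominated by queueing and prefill, TPOT by the decode loop.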

What’s in it for you?

Competitive salary (depending on the level of experience).
Enjoy a flexible, remote-first global environment.
Collaborate with a global team of cloud experts and innovators, passionate about pushing the boundaries of Kubernetes technology.
Equity options.
Get quick feedback with a fast-paced workflow. Most feature projects are completed in 1 to 4 weeks.
Spend 10% of your work time on personal projects or self-improvement.
Learning budget for professional and personal development - including access to international conferences and courses that elevate your skills.
Annual hackathon to spark new ideas and strengthen team bonds.
Team-building budget and company events to connect with your colleagues.
Equipment budget to ensure you have everything you need.
Extra days off to help maintain a healthy work-life balance.

Hiring process

Screening call with Recruiter
Hiring Manager interview
Technical interview (system design)
Live coding
Culture Check interview with an executive

As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
Please note that Cast AI does not provide any form of visa sponsorship/work permit.

Developer Relations Engineer - Kimchi

Cast AI · France; Greece; Hungary; Poland; Romania; Slovakia; United Kingdom

Business Development European Union Posted May 8, 2026

Kimchi is an open-source AI inference platform built for teams running serious agentic coding workloads. We're building the harness, the routing, and the infrastructure layer that makes running your own AI coding stack possible - and affordable. We're early, moving fast, and the developer community is central to how we grow.

Responsibilities:

Build things publicly with Kimchi - tutorials, demos, integrations - that show developers how to solve real problems. Think live build logs, OSS integrations, or benchmarks you'd want to read yourself.
Be the voice of the developer community: Discord, GitHub discussions, Reddit, HN - be present where developers are already talking about the problems we solve.
Be the structured feedback loop between the developer community and product - synthesize what you're hearing into actionable signals for the product team.
Write technical content that ranks and spreads: setup guides, benchmark breakdowns, comparison posts, integration walkthroughs.
Speak at conferences and meetups, and run workshops - representing Kimchi as a technical peer.
Identify where new users get stuck and advocate hard for fixes.
Help shape Kimchi's voice in the developer community - what we stand for, and what we say when news is trending in the market.

Requirements:

You've built real software and can talk about it credibly - ideally you've used agentic coding tools and have opinions about them
You're a strong technical writer - you can explain complex things clearly and make them resonate with the developer audience
You're comfortable being the public face of a technical product: blog posts, talks, live demos, community threads
Huge plus: you had already tried Kimchi and found the problems it solves interesting - before this job posting even existed.
Bonus: you've worked in or around Kubernetes, LLM inference, or developer tooling

What’s in it for you?

Enjoy a flexible, remote-first global environment.
Collaborate with a global team of cloud experts and innovators, passionate about pushing the boundaries of Kubernetes technology.
Equity options.
Get quick feedback with a fast-paced workflow. Most feature projects are completed in 1 to 4 weeks.
Spend 10% of your work time on personal projects or self-improvement.
Learning budget for professional and personal development - including access to international conferences and courses that elevate your skills.
Team-building budget and company events to connect with your colleagues.
Equipment budget to ensure you have everything you need.
Extra days off to help maintain a healthy work-life balance.

Hiring process

Screening call with Recruiter
Hiring Manager interview
1-2 additional interviews based on the role
Culture Check interview with an executive

*As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
*Please note that Cast AI does not provide any form of visa sponsorship/work permit.

Senior Software Engineer - Infrastructure

Cast AI · Bulgaria; Croatia; Estonia; Greece; Hungary; Latvia; Lithuania; Poland; Romania; Slovakia; Slovenia; Ukraine

Technology European Union Posted May 7, 2026

This is a location-specific opportunity. We are currently accepting applications from candidates residing in the following European countries: Bulgaria, Croatia, Estonia, Greece, Hungary, Latvia, Lithuania, Poland, Romania, Slovakia, Slovenia, and Ukraine.

About the role

You'll join one of several teams building the low-level systems behind Cast AI's multi-cloud automation platform. The work sits at the intersection of Kubernetes, cloud infrastructure, and Linux.

Depending on your strengths, you'll work in one of these areas:

Cluster and node lifecycle - custom node bootstrapping across AWS, GCP, and Azure. Controllers and operators managing full node lifecycle. Bridging the gap between autoscaler decisions and running VMs - including working around limitations of managed Kubernetes platforms like EKS, GKE, and AKS.
Cross-cloud provisioning - orchestrating infrastructure across regions and clouds. Workflow orchestration, provisioning integrations, networking and identity across cloud boundaries.
Live container migration - zero-downtime relocation of running containers using CRIU checkpoint/restore, custom CRI-O and runc forks, and forked cloud CNI plugins. Memory page transfer, TCP session migration, process-level checkpoint/restore. Kernel-level Linux engineering applied to Kubernetes.
Storage optimization - dynamic volume rightsizing, BtrFS compression, custom CSI drivers. Full storage stack from cloud disk APIs through filesystem internals to Kubernetes PV/PVC lifecycle.

Common across all teams: Go, deep Kubernetes, multi-cloud, building beyond what the ecosystem offers.

How we build

We invest heavily in agentic development and AI-powered tooling. Engineers work with code agents and automated workflows daily. We expect you to shape how these tools evolve.

Requirements:

Strong backend fundamentals and production-grade coding, preferably Go.
Hands-on Kubernetes internals, cloud infrastructure, or large-scale distributed systems experience.
Clear English communication, collaborative and self-directed.
Bias toward action and ownership.

What would make you stand out

Deep knowledge of EKS, GKE, or AKS internals.
Low-level Linux experience: process internals, networking (NAT, iptables, conntrack, SDN), filesystems, storage.
Kubernetes or cloud-native OSS contributions.
Cloud storage depth - block storage, CSI, volume management, filesystem internals.
Container runtime experience (CRI-O, runc, containerd) or CRIU.
Experience with AI coding agents and agentic development workflows.
Comfort building where there's no playbook.

Responsibilities:

Design, build, and operate backend services, Kubernetes operators, and controllers in Go across AWS, GCP, and Azure.
Solve infrastructure-layer problems - networking, scheduling, storage, compute, or container runtime internals.
Build and maintain high-throughput services (gRPC/REST) and async orchestration workflows.
Shape public and internal APIs.
Leverage and improve AI-powered development workflows.
Participate in design reviews, code reviews, on-call, and customer deep-dives.

What’s in it for you?

Competitive salary (€6,500 - €9,000 gross, depending on the level of experience).
Enjoy a flexible, remote-first global environment.
Collaborate with a global team of cloud experts and innovators, passionate about pushing the boundaries of Kubernetes technology.
Equity options.
Get quick feedback with a fast-paced workflow. Most feature projects are completed in 1 to 4 weeks.
Spend 10% of your work time on personal projects or self-improvement.
Learning budget for professional and personal development - including access to international conferences and courses that elevate your skills.
Annual hackathon to spark new ideas and strengthen team bonds.
Team-building budget and company events to connect with your colleagues.
Equipment budget to ensure you have everything you need.
Extra days off to help maintain a healthy work-life balance.

Hiring process

Screening call with Recruiter
Hiring Manager interview
Technical interview (system design)
Live coding
Culture Check interview with an executive

*As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
*Please note that Cast AI does not provide any form of visa sponsorship/work permit.

Senior AI Engineer - Harness Engineering (Kimchi)

Cast AI · Bulgaria; Croatia; Estonia; Greece; Hungary; Latvia; Lithuania; Poland; Romania; Slovakia; Slovenia; Ukraine

Technology European Union Posted May 7, 2026

Why Kimchi?

Kimchi is the AI platform inside Cast AI. We started by helping companies run LLMs on their own Kubernetes clusters - now we're building the execution layer where agents do real work.

Our infrastructure today: multi-model inference (MiniMax, Kimi, GLM-5, Nemotron, DeepSeek) with intelligent routing, an OpenAI-compatible API, and deployment flexibility from our GPUs to your VPC. The inference layer is the foundation. What we're hiring for sits on top of it: coding agents, agent runtimes, orchestration systems, and the reliability engineering that makes them actually finish things.

Tech Stack: TypeScript, Go, Kubernetes, AWS/GCP/Azure, MCP, Prometheus/Grafana/Loki, GitLab CI, ArgoCD.

Why harness engineering matters here
OpenAI and Anthropic ship models. They also ship one harness each - the scaffolding that turns a raw model into something that can plan, execute, recover, and complete work. We ship a different kind of harness: one built for cost-conscious, long-horizon autonomy, running on inference infrastructure we control end-to-end.
A decent model with a great harness beats a great model with a bad harness. We've watched this play out. The gap between what today's models can do and what you see them doing is largely a harness gap - and that gap is where we operate.

What you'll build
The ratchet.
Every time our agent makes a mistake, we engineer a solution so it never makes that mistake again. That means hooks that enforce constraints the model "knows" but forgets: pre-commit lint checks, permission gates, context compaction before the window fills. Success is silent, failures are verbose.

Long-horizon execution.
Our harness is built around spec-driven autonomy: meta-prompting, fresh context per task, worktree-per-slice git strategy, automatic replanning, crash recovery, stuck detection. We're implementing Ralph loops - when the model tries to exit, we intercept and reinject the goal into a fresh context. The agent reads state from disk and continues. Multi-session, multi-day work, without context rot.
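
The loop shape described above (intercept the exit, reinject the goal into a fresh context, resume from state on disk) can be sketched in a few lines. This is a toy illustration under stated assumptions, not Kimchi's harness: `ralph_loop`, `toy_agent`, the `done` flag, and the JSON state file are all hypothetical names chosen for the example, and the "agent" is a plain function standing in for a model session.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable

# One agent session: receives the goal and the persisted state, returns new state.
AgentStep = Callable[[str, dict], dict]


def ralph_loop(goal: str, state_path: Path, run_agent: AgentStep,
               max_iterations: int = 10) -> dict:
    """Re-run the agent in a fresh context until persisted state says the goal is done."""
    state: dict = {"done": False}
    for _ in range(max_iterations):
        # Fresh context each iteration: the only carry-over is what is on disk.
        if state_path.exists():
            state = json.loads(state_path.read_text())
        if state.get("done"):
            break
        # The session ends when run_agent returns; instead of stopping, we persist
        # its state and loop, handing the same goal to the next session.
        state = run_agent(goal, state)
        state_path.write_text(json.dumps(state))
    return state


def toy_agent(goal: str, state: dict) -> dict:
    # Stand-in for a model call: completes one step per session, then exits.
    steps = state.get("steps", 0) + 1
    return {"steps": steps, "done": steps >= 3}


with tempfile.TemporaryDirectory() as tmp:
    final = ralph_loop("ship the feature", Path(tmp) / "state.json", toy_agent)
    print(final)
```

The `max_iterations` cap is a crude stand-in for stuck detection; a real harness would need something more discriminating than a counter.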

Planner/executor splits.
Planning with a reasoning model, executing with a fast one, evaluating with a third. Separating generation from evaluation beats self-verification because agents reliably skew positive when grading their own work.
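
A minimal sketch of that split, assuming each model is just a function from prompt to text. The names `run_task`, `toy_planner`, `toy_executor`, and `toy_evaluator`, and the `PASS`/`FAIL` verdict convention, are hypothetical and chosen only for illustration. The point is that the grader is a separate callable from the one that produced the work.

```python
from typing import Callable, List, Tuple

Model = Callable[[str], str]  # prompt in, text out; stand-in for a real model client


def run_task(task: str, planner: Model, executor: Model, evaluator: Model,
             max_attempts: int = 3) -> Tuple[bool, List[str]]:
    """Plan with one model, execute with another, grade with a third."""
    feedback = ""
    outputs: List[str] = []
    for _ in range(max_attempts):
        # The planner returns one step per line; blank lines are ignored.
        plan = [s for s in planner(f"{task}\n{feedback}").splitlines() if s.strip()]
        outputs = [executor(step) for step in plan]
        # Evaluation is done by a different model than the one that did the work.
        verdict = evaluator("\n".join(outputs))
        if verdict.startswith("PASS"):
            return True, outputs
        feedback = f"Previous attempt failed: {verdict}"  # replan with the grader's notes
    return False, outputs


def toy_planner(prompt: str) -> str:
    # Adds a test step only after being told the previous attempt failed.
    steps = ["write code"]
    if "failed" in prompt:
        steps.append("write tests")
    return "\n".join(steps)


def toy_executor(step: str) -> str:
    return f"done: {step}"


def toy_evaluator(work: str) -> str:
    return "PASS" if "tests" in work else "FAIL: no tests"


ok, outputs = run_task("add feature", toy_planner, toy_executor, toy_evaluator)
print(ok, outputs)
```

In this toy run the first attempt fails evaluation, the evaluator's verdict is fed back into planning, and the second attempt passes.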

The harness surface.
CLI, TUI, MCP integration, sandboxed execution, telemetry. Our AGENTS.md is short - every line traces to a specific thing that went wrong. TypeScript on the surface, Go where it matters.

Memory and context.
Moving agents off laptops, giving them state that survives across sessions, managing context so information lands where it's actionable. Compaction, tool-call offloading, progressive skill disclosure.

What makes this different (with receipts)
You've seen the pitch: "we route to the best model." Everyone says that. Here's what we actually have:

GPU infrastructure we own. Not just an API reseller. From GPU placement across clouds to the inference endpoint your agent calls - we control the cost curve.
A harness-first thesis. We treat agent failures as configuration problems, not model problems. When we moved from a stock harness to our own, completion rates on internal benchmarks improved by 40%+ on the same model.
An AGENTS.md that earns every line. No brainstormed rules - every constraint in our system prompt traces to a real failure we saw and fixed.

Requirements:

You've used AI coding agents in anger. Not demos - real work. You have opinions about Claude Code, Codex, OpenCode, Cursor. You know what it feels like when an agent gets stuck and why.
Strong TypeScript or Go in production. Comfort moving between them. Our surface is TypeScript; our core is Go.
You think in harness terms. You read "the agent hallucinated" and your first instinct is to ask what context it was missing, what hook should have caught it, what constraint should exist.
You drive features end-to-end. Design → build → ship → measure → iterate. We don't have layers that absorb ambiguity for you.

Responsibilities:

Build and evolve the agent harness - ship hooks, permission gates, and context compaction. Every AGENTS.md constraint traces to a failure you personally diagnosed.
Own long-horizon execution - multi-session task completion via spec-driven prompting, worktree-per-slice git, Ralph loop recovery, and stuck detection. Completion rate is your metric.
Architect planner/executor/evaluator pipelines - planning with a reasoning model, execution with a fast one, evaluation with a third. No self-verification.
Manage agent memory and context - state persistence across sessions, context compaction, tool-call offloading. Zero context rot on multi-day work.
Own the harness surface - CLI, TUI, MCP integrations, sandboxed execution, telemetry. TypeScript on the surface, Go where it matters.

What success looks like (after 6 months):

You've shipped at least one major harness feature end-to-end: designed it, built it, measured it, iterated.
You've added constraints to our AGENTS.md based on failures you personally observed and diagnosed.
You've improved a measurable reliability metric - completion rate, context efficiency, or cost per successful task.
You've formed strong opinions about where our harness is load-bearing and where it's dead weight.

What’s in it for you?

Competitive salary (€6,500 - €9,000 gross, depending on the level of experience).
Enjoy a flexible, remote-first global environment.
Collaborate with a global team of cloud experts and innovators, passionate about pushing the boundaries of Kubernetes technology.
Equity options.
Get quick feedback with a fast-paced workflow. Most feature projects are completed in 1 to 4 weeks.
Spend 10% of your work time on personal projects or self-improvement.
Learning budget for professional and personal development - including access to international conferences and courses that elevate your skills.
Annual hackathon to spark new ideas and strengthen team bonds.
Team-building budget and company events to connect with your colleagues.
Equipment budget to ensure you have everything you need.
Extra days off to help maintain a healthy work-life balance.

This is a location-specific opportunity. We are currently accepting applications from candidates residing in the following European countries: Bulgaria, Croatia, Estonia, Greece, Hungary, Latvia, Lithuania, Poland, Romania, Slovakia, Slovenia, and Ukraine.

*As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
*Please note that Cast AI does not provide any form of visa sponsorship/work permit.

Senior Software Engineer

Cast AI · Bulgaria; Croatia; Estonia; Greece; Hungary; Latvia; Lithuania; Poland; Romania; Slovakia; Slovenia; Ukraine

Technology European Union Posted May 7, 2026

This is a location-specific opportunity. We are currently accepting applications from candidates residing in the following European countries: Bulgaria, Croatia, Estonia, Greece, Hungary, Latvia, Lithuania, Poland, Romania, Slovakia, Slovenia, and Ukraine.

We are hiring across multiple teams!

As a Senior Software Engineer, you will have the opportunity to work on different key features of our product. We are currently hiring Senior Software Engineers for the following teams:

- Workload Optimization - Automates workload resource management by dynamically adjusting resource allocations, helping developers significantly reduce costs and improve application reliability.

- Karpenter - The Karpenter team powers the integration between Karpenter and Cast AI, bringing enterprise capabilities to the most popular open source Kubernetes autoscaler. We enhance Karpenter with advanced features that improve application reliability and performance while optimizing costs. By joining the team, you’ll bridge open source innovation with enterprise requirements, directly impacting how organizations run Karpenter at scale.

- Reporting - Builds a scalable reporting system that ingests millions of rows per second into our time-series databases, providing insights into cost savings, workload efficiencies, and Cast AI automation impact.

- Pricing - Drives the synchronization of public and customer cloud resources, availability, and dynamic pricing across all major cloud providers. Empowers autoscaling by leveraging discounts, commitments, and cross-cluster tracking to maximize savings. Provides a reliable source of truth for node pricing, resources, components, discounts, and commitments.

- Autoscaler - Automates Kubernetes node autoscaling to optimize clusters, balance workloads, remove underutilized nodes, and dynamically allocate capacity in real-time, thereby reducing cluster costs by half.

- Identity - Builds and maintains the trust and access foundation for the entire platform, ensuring every user, service, and workload authenticates and interacts securely and seamlessly at scale.

- Billy - Powers the critical day-2 operations layer of the platform - from billing and audit trails to notifications and feature flags - ensuring the platform runs reliably, transparently, and at scale for every customer, every day.

Here are some of the tools we use daily:

Programming Languages: Go
Cloud & Orchestration: Kubernetes, AWS, GCP, Azure
Databases & Storage: PostgreSQL, Cloud Object Storage
Messaging & APIs: GCP Pub/Sub, gRPC for internal communication, REST for public APIs
Observability: Prometheus, Grafana, Loki, Tempo
CI/CD & GitOps: GitLab CI with ArgoCD.

Requirements:

Production experience with Go is strongly preferred; candidates without Go should demonstrate strong systems programming skills in a comparable language.
Deep hands-on experience with cloud platforms (AWS, GCP, or Azure) - including real understanding of how compute, networking, and storage work under the hood.
Understanding of Kubernetes internals - autoscaling and networking.
You've personally driven a complex project end-to-end.
Strong debugging, optimization, and performance-tuning skills - including query profiling, index design, and database performance tuning beyond ORM usage.
You've run observability tooling (Prometheus, Grafana, OpenTelemetry) in production.
CI/CD and DevOps practices experience.
Strong English skills, both verbal and written.
Startup mindset: adaptable, proactive, and comfortable with ambiguity.

Responsibilities:

Design and build distributed systems that operate Kubernetes infrastructure autonomously at scale.
Write production Go services that interact with AWS, GCP, and Azure APIs for real-time cloud resource management.
Own features end-to-end: from design through implementation, testing, and production rollout (most projects ship in 1-4 weeks).
Debug complex production issues across cloud providers, Kubernetes clusters, and distributed services.
Collaborate with product and other engineering teams to solve problems that don't have textbook solutions.
Work with time-series data, cloud provider APIs, and Kubernetes control plane internals.

What’s in it for you?

Competitive salary (€6,500 - €9,000 gross, depending on the level of experience).
Enjoy a flexible, remote-first global environment.
Collaborate with a global team of cloud experts and innovators, passionate about pushing the boundaries of Kubernetes technology.
Equity options.
Get quick feedback with a fast-paced workflow. Most feature projects are completed in 1 to 4 weeks.
Spend 10% of your work time on personal projects or self-improvement.
Learning budget for professional and personal development - including access to international conferences and courses that elevate your skills.
Annual hackathon to spark new ideas and strengthen team bonds.
Team-building budget and company events to connect with your colleagues.
Equipment budget to ensure you have everything you need.
Extra days off to help maintain a healthy work-life balance.

Hiring process

Screening call with Recruiter
Hiring Manager interview
Technical interview (system design)
Live coding
Culture Check interview with an executive

*As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
*Please note that Cast AI does not provide any form of visa sponsorship/work permit.

Senior Product Marketing Manager - Core

Cast AI · European Union

Marketing European Union Posted Apr 23, 2026

About the role

As Product Marketing Manager for Cast AI’s Core Kubernetes Optimization Platform, you’ll lead the go-to-market strategy for our flagship offering. You’ll translate powerful technology into clear value for technical audiences, helping DevOps, SREs, and platform engineers understand how Cast AI automatically improves performance, resilience, and cost efficiency.

This is a highly technical and strategic role where you’ll act as the voice of the product in the market and the voice of the customer internally.

Requirements:

Strong understanding of Kubernetes, cloud infrastructure, and the container ecosystem.
Familiarity with AI infrastructure, data platforms, or developer workflows is a plus.
5+ years in product marketing, technical product management, or solutions engineering.
Track record of working on or launching early-stage B2B infrastructure or developer-focused products.
Experience working in a SaaS company.
Comfortable working independently and building go-to-market foundations from scratch.
Passion for exploring new markets and helping new products find product-market fit.
Strong curiosity and a hands-on mindset, with the drive to test, learn, and iterate.
Comfortable navigating ambiguity and turning chaos into clarity.
BS in Computer Science, Engineering, or a related technical field (or equivalent experience).
English level: Fluent (both written and spoken).

Responsibilities:

Own the messaging, positioning, and value narrative for Cast AI’s core platform.
Define and execute GTM plans for new features and major releases.
Partner closely with Product to translate roadmap into customer-facing narratives and enablement content.
Collaborate with Sales, Customer Success, and Solutions Engineering to enable the field.
Build technical assets (pitch decks, one-pagers, demos, blogs, webinars, technical briefs) that speak directly to hands-on practitioners.
Track launch success and product adoption metrics; continuously optimize GTM strategy.

What’s in it for you?

Competitive salary (depending on the level of experience)
Enjoy a flexible, remote-first global environment.
Collaborate with a global team of cloud experts and innovators, passionate about pushing the boundaries of Kubernetes technology.
Equity options.
Get quick feedback with a fast-paced workflow. Most feature projects are completed in 1 to 4 weeks.
Spend 10% of your work time on personal projects or self-improvement.
Learning budget for professional and personal development - including access to international conferences and courses that elevate your skills.
Team-building budget and company events to connect with your colleagues.
Equipment budget to ensure you have everything you need.
Extra days off to help maintain a healthy work-life balance.

Hiring process

Screening call with Recruiter
Hiring Manager interview
1-2 additional interviews based on the role
Culture Check interview with an executive

*As part of our standard hiring process, we would like to inform you that a background check may be conducted at the final stage of recruitment through our third-party provider, Checkr.
*Please note that Cast AI does not provide any form of visa sponsorship/work permit.

Senior Software Engineer – Database Optimizer

Cast AI · Bulgaria; Croatia; Greece; Lithuania; Poland; Romania; Slovakia

Technology European Union Posted Apr 23, 2026

About the role

Databases are notoriously hard to optimize: with 350+ configuration parameters in PostgreSQL alone, even experienced engineers struggle to find the right settings. Have you ever dreamed about a database that continuously optimizes itself in the background, so you can focus on what really matters: building great products?

Sometimes your queries slow down and it's not immediately clear why. Imagine a system that proactively detects performance degradation, analyzes the root cause, and automatically improves your queries and indexes for you—creating pull requests with the exact optimizations needed.

Cast AI is seeking a skilled senior software engineer with expertise in databases and an interest in the underlying technology that makes them work. Knowledge of implementation details around transactions, wire protocols, WAL, indexes, query planning, and other fundamental database features will enable you to build a product that automates what's traditionally been manual and complex.

You'll work on a greenfield project with high impact, creating intelligent systems that make database optimization effortless for our customers.

This is a location-specific opportunity. We are currently accepting applications from candidates residing in the following European countries: Bulgaria, Croatia, Estonia, Greece, Hungary, Latvia, Lithuania, Poland, Romania, Slovakia, Slovenia, and Ukraine.

Requirements:

Experience working with and troubleshooting PostgreSQL and/or MySQL in production environments
Experience troubleshooting database-related issues in one of your previous roles—whether as a DBA, backend engineer, SRE, or any position where you've debugged slow queries, optimized indexes, or resolved production database problems
Strong programming skills in Go and/or C++
Strong problem-solving skills and the ability to troubleshoot complex issues in a production environment
Strong written and verbal communication skills in English
Ability to work independently and collaboratively within a team
Startup mindset: adaptable, proactive, and comfortable with ambiguity
A proactive, problem-solving mindset with a "yes we can" attitude

Here are some of the tools we use daily:

Mostly Go, with C++ in the most critical data paths
Extending Envoy with custom network filters
ClickHouse and PostgreSQL for business metrics
gRPC and REST APIs
GitLab CI with ArgoCD as our GitOps CD engine
Prometheus, Grafana, Loki, and Tempo for observability
Helm for deployment charts

Responsibilities:

Design, build, and operate database optimization services in Go and C++.
Develop features for query analysis, index recommendations, and automated performance tuning that deliver measurable database cost savings.
Extend and maintain integrations with PostgreSQL and MySQL protocols and with cloud database services (RDS, Cloud SQL, Azure Database).
Implement query parsing, plan analysis, and recommendation algorithms that identify optimization opportunities.
Develop monitoring and alerting systems for database performance metrics, connection health, and cost tracking.
Shape public and internal APIs with an eye for simplicity and developer experience.
Participate in design reviews, code reviews, and occasional customer deep‑dives.

What’s in it for you?

Competitive salary (€6,500 - €9,000 gross, depending on the level of experience)
Enjoy a flexible, remote-first global environment.
Collaborate with a global team of cloud experts and innovators, passionate about pushing the boundaries of Kubernetes technology.
Equity options.
Private health insurance.
Get quick feedback with a fast-paced workflow. Most feature projects are completed in 1 to 4 weeks.
Spend 10% of your work time on personal projects or self-improvement.
Learning budget for professional and personal development - including access to international conferences and courses that elevate your skills.
Annual hackathon to spark new ideas and strengthen team bonds.
Team-building budget and company events to connect with your colleagues.
Equipment budget to ensure you have everything you need.
Extra days off to help maintain a healthy work-life balance.
