Senior Platform & Reliability Engineer (SRE)

About the role

Vizcom · Hybrid

Agency Notice: We are not currently working with recruiting agencies for this role. Please do not contact Vizcom employees regarding this position. Any resumes submitted without a prior agreement will be considered unsolicited.

About Vizcom

Vizcom is a visual creation platform that combines modern web tooling with AI-powered workflows. Our stack includes a React/TypeScript frontend, Node/Koa + PostGraphile API services, PostgreSQL, Redis, BullMQ queues, and Kubernetes-based production infrastructure.

We’re hiring a senior owner of stability and infrastructure to ensure the platform is reliable, fast, and resilient as we scale.

Role Mission

Own service reliability end-to-end: prevent incidents, reduce blast radius when failures happen, and lead fast, high-quality recovery when production degrades.

This is a hands-on technical leadership role with authority to set reliability standards and enforce production guardrails.

Compensation

$200,000 – $250,000 base salary + meaningful equity


What You’ll Own

  • Reliability bar: Set and enforce SLIs/SLOs/error budgets for critical user flows.

  • Production architecture resilience: Drive failure isolation across API, workers, queues, and dependencies so one subsystem cannot take down core access.

  • Kubernetes runtime reliability: Define probe contracts, rollout/rollback standards, graceful shutdown behavior, scaling/resource policies, and startup safety.

  • Queue + job safety (BullMQ/Redis): Own poison pill containment and workload isolation.

  • Incident command quality: Lead Sev1/Sev2 response end-to-end (containment, communications, technical direction, RCA, corrective action execution).

  • Reliability operating system: Own observability quality (signals over noise), on-call effectiveness, runbooks, and postmortem discipline.

  • Release safety authority: Gate risky deploys and enforce reliability guardrails when production health is at risk.

Traits We’re Looking For

  • Calm, structured incident commander under pressure.

  • Thinks in failure modes and blast radius by default.

  • Pragmatic: can stabilize quickly, then implement durable fixes.

  • High ownership and strong written communication.

First 90 Days

  • Establish baseline reliability metrics and identify top platform risks.

  • Tighten incident response mechanics (roles, comms cadence, runbooks, status updates).

  • Deliver high-impact hardening fixes across probes/startup paths/queue safety.

  • Publish a prioritized 6–12 month reliability roadmap with clear ownership and milestones.

If possible, please include one incident you personally led and send it to Jordan@vizcom.com:

1) what failed,

2) how you contained it,

3) what permanent fixes you shipped and measured.
