About Crunchyroll
Founded by fans, Crunchyroll delivers the art and culture of anime to a passionate community. We super-serve over 100 million anime and manga fans across 200+ countries and territories, and help them connect with the stories and characters they crave. Whether that experience is online or in-person, streaming video, theatrical, games, merchandise, events and more, it’s powered by the anime content we all love.
Join our team, and help us shape the future of anime!
About the role
Crunchyroll is growing and changing, presenting unique challenges and opportunities to support millions of anime fans around the world. The Database Operations Engineering team provides a seamless infrastructure foundation to our internal stakeholders, ensuring an exceptional experience for all Crunchyroll fans.
As a Senior Database Reliability Engineer, you will be primarily responsible for operating, improving, and maintaining the reliability and operational excellence of our data infrastructure. Your core focus will be to introduce robust best practices for database production support, strengthen our global on-call rotation, and design and build reusable, database-specific Infrastructure as Code (IaC) components to ensure high availability, scalability, and 100% automation.
Key Areas of Responsibility
- Database Operational Excellence & Production Support: Drive, stabilize, and own 24x7 database production support operations, processes, and incident remediation. Responsibly track database alerts, establish clear operational procedures, and bring infrastructure alerts to rapid closure.
- Database Infrastructure as Code (IaC) & Automation: Architect, implement, and maintain reusable database-specific IaC components and configurations using frameworks like Terraform, CloudFormation, or Pulumi. Standardize configurations across multiple datastores to enable automated infrastructure deployment, sizing, and posture management.
- Core Configuration Management: Proactively enable and standardize mission-critical database attributes and configurations by default, including automated backups, failover strategies, timeouts, and lifecycle policies.
- On-Call & Platform Reliability: Strengthen and actively participate in the database on-call rotation, identifying SLAs, system vulnerabilities, and operational gaps to eliminate Single Points of Failure (SPOF).
- Database SRE & Site Operations: Manage large-scale data infrastructure; execute cluster management, capacity planning, data governance, and compliance reviews; and handle complex data store migrations (such as MariaDB to Aurora/DynamoDB) and major version upgrades safely during non-US low-traffic hours.
- Collaborative Growth & Development: Work alongside a seasoned team of database engineering specialists (leveraging existing senior architectural depth on the team) to systematically scale platform features while executing a continuous learning roadmap to expand personal depth in native AWS database services (RDS/Aurora) and complex SQL tuning.
About You
We get excited about candidates, like you, because you possess:
- Bachelor's degree in Computer Science, Information Technology, or a related field.
- 8+ years of experience in database operations, site reliability engineering (SRE), or a related role with a heavy focus on data platforms and core operational infrastructure.
- Strong proficiency in automation and IaC frameworks, with extensive hands-on experience building database IaC.
- Proven track record in database production support and operations, with deep practical experience managing robust, highly available 24x7 runtime systems (prior experience handling database production support at scale is highly valued).
- Extensive experience with the AWS cloud platform and hands-on implementation of CI/CD pipelines and DatabaseOps workflows.
- Proficiency in monitoring and observability tools (e.g., Datadog, CloudWatch, DevOps Guru, Database Performance Insights) to track metrics, latency, throughput, and system errors.
- Strong understanding of various system performance metrics at a low level (such as Disk/IO saturation) and experience identifying or eliminating operational bottlenecks.
- Familiarity with managing large-scale database structures across various systems (SQL and NoSQL).
- Strong problem-solving skills, ownership mentality, proactive communication skills, and a baseline capability to document clear incident response playbooks and operational requirements.
About the Team
The Database Operations Engineering team is dedicated to ensuring the reliability, scalability, and performance of our data infrastructure. We focus on standardizing and implementing monitoring and alerting across all datastores to track key metrics like errors, latency, and throughput, and to ensure critical systems are covered. Our team leads horizontal efforts to keep databases up-to-date, implements Infrastructure as Code (IaC) for high availability and performance, and automates key processes to enhance operational efficiency.
We lead and evangelize the principle of 100% automation. Additionally, we define and document operational requirements, develop incident response processes, and automate monitoring and compliance checks to maintain a secure and reliable data environment. By continuously improving load testing and optimizing data governance practices, we support the overall health and efficiency of our data systems.
Why you will love working at Crunchyroll
In addition to getting to work with fun, passionate and inspired colleagues, you will also enjoy the following benefits and perks:
- Receive a great compensation package including salary plus performance bonus earning potential, paid annually.
- Flexible time off policies allowing you to take the time you need to be your whole self.
- Generous medical, dental, vision, STD, LTD, and life insurance
- Health Savings Account (HSA) program
- Health care and dependent care FSA
- 401(k) plan, with employer match
- Employer paid commuter benefit
- Support program for new parents
- Pet insurance, and some of our offices are pet-friendly!
#LifeAtCrunchyroll
About our Values
We want to be everything for someone rather than something for everyone, and we do this by living and modeling our values in all that we do. We value:
- Courage. We believe that when we overcome fear, we enable our best selves.
- Curiosity. We are curious, which is the gateway to empathy, inclusion, and understanding.
- Kaizen. We have a growth mindset committed to constant forward progress.
- Service. We serve our community with humility, enabling joy and belonging for others.
Our commitment to diversity and inclusion
Our mission of helping people belong reflects our commitment to diversity & inclusion. It's just the way we do business.
We are an equal opportunity employer and value diversity at Crunchyroll. Pursuant to applicable law, we do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
Crunchyroll, LLC is an independently operated joint venture between US-based Sony Pictures Entertainment and Japan's Aniplex, a subsidiary of Sony Music Entertainment (Japan) Inc., both subsidiaries of Tokyo-based Sony Group Corporation.
Questions about Crunchyroll’s hiring process? Please check out our Hiring FAQs: https://help.crunchyroll.com/hc/en-us/articles/360040471712-Crunchyroll-Hiring-FAQs
Please refer to our Candidate Privacy Policy for more information about how we process your personal information, and your data protection rights: https://tbcdn.talentbrew.com/company/22978/v1_0/docs/spe-jobs-privacy-policy-update-for-crpa-dec-21-22.pdf
Please beware of recent scams targeting online job seekers. Those applying to our job openings will only be contacted directly from an @crunchyroll.com email account.