About this Machine Learning Infrastructure Engineer role at UniversalAGI

UniversalAGI · San Francisco

📍 San Francisco | Work Directly with CEO & founding team | Report to CEO | OpenAI for Physics | 🏢 5 Days Onsite

Machine Learning Infrastructure Engineer

Location: Onsite in San Francisco

Compensation: Competitive Salary + Equity

Who We Are

UniversalAGI is building OpenAI for Physics. We are an AI startup based in San Francisco, backed by Elad Gil (#1 Solo VC), Eric Schmidt (former Google CEO), Prith Banerjee (ANSYS CTO), Ion Stoica (Databricks Founder), Jared Kushner (former Senior Advisor to the President), David Patterson (Turing Award Winner), and Luis Videgaray (former Foreign and Finance Minister of Mexico). We're building foundation AI models for physics that enable end-to-end industrial automation from initial design through optimization, validation, and production. We're building a high-velocity team of relentless researchers and engineers that will define the next generation of AI for industrial engineering. If you're passionate about AI, physics, or the future of industrial innovation, we want to hear from you.

About the Role

UniversalAGI is hiring an Infrastructure Engineer to build and own the execution platform powering our research and customer deployments: data generation + simulation orchestration + training/fine-tuning infrastructure + benchmarking pipelines + production deployments in customer environments.

You’ll work closely with the CEO and founding team to turn research into repeatable, scalable, reliable systems - internally and in customer infrastructure. This is a “ship outcomes” role: your work directly determines how fast we can iterate, how reproducible our results are, and how reliably we deliver in production.

What You’ll Do


Build the foundation platform (internal)

  • Build and operate scalable infrastructure for data generation and simulation workflows (job orchestration, scheduling, queues, retries, observability).

  • Build reproducible pipelines for training/fine-tuning and benchmarking (artifact/version management, experiment tracking, dataset lineage).

  • Own cost/performance tradeoffs across compute, storage, networking, and runtime efficiency.

Deploy to customers (external)

  • Lead deployments of our stack into customer cloud/on-prem environments, including secure networking, permissions, and data movement.

  • Build robust deployment patterns: environment provisioning, CI/CD, rollbacks, monitoring, and incident response.

  • Partner with customers to ensure reliability and repeatability under real-world constraints (security, compliance, infra limits, data governance).

Qualifications

  • Strong software engineering skills (clean code, debugging, reliability, reproducibility).

  • Hands-on experience building/operating infrastructure for ML/compute-heavy workflows: pipelines, job orchestration, GPU compute, storage, CI/CD, monitoring.

  • Olympic athlete mindset: You have high standards for yourself and are obsessed with measurable improvement on the metrics you are delivering to customers.

  • Resourcefulness: you know when to do the “quick & correct” fix vs. when to invest in a robust solution, and you can justify the tradeoff with impact/

  • Ownership: Comfortable owning work end-to-end and being accountable for measurable outcomes.

Bonus Qualifications

  • Experience with workflow orchestration (e.g., Ray, Kubernetes, Slurm).

  • Experience with GPU infrastructure and distributed training systems.

  • Experience building evaluation/benchmarking frameworks with strong reproducibility guarantees.

  • Experience deploying into regulated / security-sensitive environments (gov/defense/enterprise).

  • Experience with simulation/HPC pipelines (CFD, meshing, batch workloads) is a plus but not required.

  • Experience in an FDE-style / delivery execution role (or similar “ship results fast” environments).

Cultural Fit

  • Technical Respect: Ability to earn respect through hands-on technical contribution

  • Intensity: Thrives in our unusually intense culture - willing to grind when needed

  • Customer Obsession: Passionate about solving real customer problems, not just publishing papers

  • Deep Work: Values long, uninterrupted periods of focused work over meetings

  • High Availability: Ready to be deeply involved whenever critical issues arise

  • Communication: Can translate complex model decisions to customers and team

  • Growth Mindset: Embraces the compounding returns of intelligence and continuous learning

  • Startup Mindset: Comfortable with ambiguity, rapid change, and wearing multiple hats

  • Work Ethic: Willing to put in the extra hours when needed to hit critical milestones

  • Team Player: Collaborative approach with low ego and high accountability

  • Bias for Action: Ships experiments fast, learns from failures, and iterates quickly

What We Offer

  • Opportunity to define the future of physics AI from the ground up

  • Work on cutting-edge problems at the intersection of deep learning and physics simulation

  • Direct collaboration with the founder & CEO and ability to influence company strategy

  • Competitive compensation with significant equity upside

  • In-person first culture - 5 days a week in office with a team that values face-to-face collaboration

  • Access to world-class investors and advisors in the AI space

Benefits

We provide great benefits, including:

  • Competitive compensation and equity.

  • Competitive health, dental, vision benefits paid by the company.

  • 401(k) plan offering.

  • Flexible vacation.

  • Team Building & Fun Activities.

  • Great scope, ownership and impact.

  • AI tools stipend.

  • Monthly commute stipend.

  • Monthly wellness / fitness stipend.

  • Daily office lunch & dinner covered by the company.

  • Immigration support.

How We’re Different

“The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood; who strives valiantly; who errs, who comes short again and again... who at the best knows in the end the triumph of high achievement, and who at the worst, if he fails, at least fails while daring greatly.” - Teddy Roosevelt

At our core, we believe in being “in the arena.” We are builders, problem solvers, and risk-takers who show up every day ready to put in the work: to sweat, to struggle, and to push past our limits. We know that real progress comes with missteps, iteration, and resilience. We embrace that journey fully, knowing that daring greatly is the only way to create something truly meaningful.

If you're ready to train the models that will revolutionize physics simulation, push the boundaries of what AI can learn, and deliver real impact, UniversalAGI is the place for you.
