HPC Engineer, Ifm Us

About the role

About MBZUAI
The Institute for Foundation Models (IFM) operates some of the world's largest AI supercomputing environments.

Position Summary
This role provides operational coverage during Abu Dhabi overnight hours and serves as a primary point of contact for infrastructure monitoring, incident triage, researcher support, and production operations.

Responsibilities

• Monitor health, performance, and availability of large-scale GPU clusters.
• Respond to incidents and perform first-level triage.
• Support researchers and troubleshoot job failures.
• Execute operational runbooks and recovery procedures.
• Validate cluster deployments, upgrades, and maintenance activities.
• Track infrastructure utilization and operational metrics.
• Develop automation and monitoring tools.
• Contribute to documentation and reporting.

Education

Bachelor's degree in Computer Science, Computer Engineering, Software Engineering, Information Technology, Electrical Engineering, Mathematics, Physics, or related disciplines.

Experience

• 2+ years in Linux systems administration, SRE, DevOps, cloud operations, HPC, or infrastructure operations.
• Strong Linux troubleshooting skills.
• Experience with scripting using Python or Bash.

Preferred Qualifications

• Slurm.
• GPU infrastructure.
• AWS, Azure, or GCP.
• Grafana, Prometheus, Datadog, or similar tools.
• Containers and Kubernetes.
• AI/ML infrastructure exposure.
• Research computing environments.

Benefits

• Comprehensive medical, dental, and vision benefits
• Bonus
• 401(k) plan
• Generous paid time off, sick leave, and holidays
• Paid parental leave
• Employee Assistance Program
• Life insurance and disability
 