All active PyTorch roles based in Toronto.
Stripe is a financial infrastructure platform for businesses. Millions of companies—from the world’s largest enterprises to the most ambitious startups—use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone’s reach while doing the most important work of your career.
The Support Experience engineering organization builds and improves Stripe’s user support from end to end: how users get help within our products, how they get in touch with us when they have questions, and how our teams use internal tools to answer those questions. We’re accountable for the quality and reliability of this support stack and we use data and firsthand user research to continuously improve it.
Providing great support to users of all sizes is culturally important to everyone at Stripe. We are a group of friendly, user-oriented engineers who partner closely with Stripe’s world-class design, product, and operational teams. Our scope includes the external-facing support interfaces (support.stripe.com), content, entry points, internal tooling, case routing, and helping product teams across the company reduce support volume by improving our products. We are also using the latest generative AI technologies to reimagine support experiences, and we are developing AI assistants both for Stripe’s users and for our internal operations teams to help them be more productive.
As a Machine Learning Engineer on the Support Experience team, you'll play a crucial role in enhancing our self-serve support experiences. You will be responsible for designing, building, training, evaluating, deploying, and owning ML models in production. For example, we apply LLMs to answer user questions with conversational agents and personalize product documentation, and are building automated systems to solve complex user problems. You will work closely with software engineers, machine learning engineers, product managers, and data scientists to operate Stripe’s ML-powered systems, features, and products. You will also have the opportunity to contribute to and influence ML architecture at Stripe and be a part of a larger ML community.
We are looking for ML Engineers who are passionate about building ML systems that touch the lives of millions. You have experience developing efficient feature pipelines, building advanced ML models, and deploying them to production. You are comfortable with ambiguity, love to take initiative, have a bias towards action, and thrive in a collaborative environment.
Ready to apply?
Apply to Stripe
The Supportability Evaluation team acts as stewards of the financial ecosystem. Our mission is to protect Stripe’s reputation with our global financial partners by architecting highly precise, automated supportability controls. We develop the AI/ML models and systems that detect and act on supportability violations in real time. We're responsible for building high-fidelity detection engines that ensure our merchants remain compliant across the globe, balancing the scale of millions of users with the surgical precision required by the world’s largest financial institutions.
As a Machine Learning Engineer in Supportability, you will be responsible for designing, building, training, evaluating, deploying, and owning AI/ML models in production. You will work closely with software engineers, machine learning engineers, product managers, and data scientists to operate Stripe’s ML-powered systems, features, and products. You will also have the opportunity to contribute to and influence AI/ML architecture at Stripe and be a part of a larger community.
We are looking for ML Engineers who are passionate about building AI/ML systems that touch the lives of millions. You have experience building and evaluating advanced AI/ML models and deploying them to production. You are comfortable with ambiguity, love to take initiative, have a bias towards action, and thrive in a collaborative environment.
Ready to apply?
Apply to Stripe
About Faire
Faire is a technology wholesale platform built on the belief that the future is local. Independent retailers around the globe collectively represent a multi-hundred-billion-dollar wholesale market that has historically been fragmented and offline. At Faire, we're using the power of tech, data, and machine learning to connect this thriving community of entrepreneurs across the globe. Picture your favorite boutique in town — we help them discover the best products from around the world to sell in their stores. With the right tools and insights, we believe that we can level the playing field so businesses can grow and local communities can thrive.
We’re looking for smart, resourceful and passionate people to join us as we power the shop local movement. If you believe in community, come join ours.
About this role
As a Staff Machine Learning Platform Engineer, you will help design, improve, and operate a scalable ML platform to accelerate model training, deployment, and governance. You are the technical bridge between data science and production engineering. You’ll be joining a small but deeply critical team that scales Faire’s ability to support tens of thousands of local businesses in a constantly narrowing retail landscape.
What You Will Do
What it takes
Tech Stack
Faire uses a modern, cloud-based tech stack. For this role, you’ll want to be proficient with the following:
| Category | Technologies |
| --- | --- |
| Languages | Python, SQL, Kotlin |
| ML Frameworks | PyTorch, MLflow |
| Big Data & Processing | Spark, Kafka, Databricks, Snowflake, Fivetran, Iceberg, Unity Catalog, Datadog, Airflow, CockroachDB, MySQL |
| Cloud & Infrastructure | AWS, S3, SageMaker, Kubernetes, Docker, GitHub Actions, Terraform |
| Generative AI | Claude Sonnet 4.5, ChatGPT 5.2 |
Salary Range
Canada: the pay range for this role is $216,000 to $297,000 per year.
This role will also be eligible for equity and benefits. Actual base pay will be determined based on permissible factors such as transferable skills, work experience, market demands, and primary work location. The base pay range provided is subject to change and may be modified in the future.
Faire uses Artificial Intelligence (AI) to screen and select applicants for this position.
This job posting is for an existing vacancy.
Hybrid Faire employees currently go into the office 3 days per week on Tuesdays, Thursdays, and a third flex day of their choosing (Monday, Wednesday, or Friday). Additionally, hybrid in-office roles will have the flexibility to work remotely up to 4 weeks per year. Specific Workplace and Information Technology positions may require onsite attendance 5 days per week as will be indicated in the job posting.
Why you’ll love working at Faire
Faire was founded in 2017 by a team of early product and engineering leads from Square. We’re backed by some of the top investors in retail and tech including: Y Combinator, Lightspeed Venture Partners, Forerunner Ventures, Khosla Ventures, Sequoia Capital, Founders Fund, and DST Global. We have headquarters in San Francisco and Kitchener-Waterloo, and a global employee presence across offices in Toronto, London, and New York. To learn more about Faire and our customers, you can read more on our blog.
Faire provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability, genetics, sexual orientation, gender identity or gender expression.
Faire is committed to providing access, equal opportunity, and reasonable accommodation for individuals with disabilities in employment, its services, programs, and activities. Accommodations are available throughout the recruitment process, and applicants with a disability may request to be accommodated at any stage. We will work with all applicants to accommodate their individual accessibility needs. To request reasonable accommodation, please fill out our Accommodation Request Form (https://bit.ly/faire-form).
Privacy
For information about the type of personal data Faire collects from applicants, as well as your choices regarding the data collected about you, please visit Faire’s Privacy Notice (https://www.faire.com/privacy)
Ready to apply?
Apply to Faire
Are you looking to thrive in a stimulating work environment?
Join Levio, a leader in digital transformation, and take your career to the next level. You will work alongside high-caliber professionals on ambitious, large-scale technology projects, directly embedded in our clients’ environments. At Levio, we value expertise, curiosity, and continuous improvement — and we give you the space to grow.
The salary range provided reflects a good faith estimate based on factors such as experience, technical expertise, location, and relevant certifications. Final compensation will be determined according to the specific circumstances of each candidate.
Estimated salary range: $100,000 to $140,000 per year.
This posting is a current hiring need.
Levio offers a comprehensive and flexible benefits package designed to support your professional growth and personal wellbeing, including:
Position Details
Notice on the Use of Artificial Intelligence in Recruitment
We use AI-enabled tools to help sort and review applications based on job-related criteria. Final decisions regarding candidate progression are always made by a human recruiter.
Employment Equity
Levio subscribes to the principle of employment equity and applies an equal access employment program for women, Indigenous peoples, visible minorities, ethnic minorities, and persons with disabilities.
We value diversity and inclusion and are committed to creating a healthy, accessible, and rewarding work environment that highlights the unique contributions of our employees. Accommodations are available upon request for candidates participating in all aspects of the selection process.
Ready to apply?
Apply to Levio
Join Levio, a leader in digital transformation, and take your career to the next level. You will work alongside high-caliber professionals on ambitious, large-scale technology projects, directly embedded in our clients’ environments. At Levio, we value expertise, curiosity, and continuous improvement — and we give you the space to grow.
The ML/AI Engineer designs, builds, deploys, and operates production-grade machine learning and generative AI systems. This role owns the end-to-end ML lifecycle, ensuring models and AI services are scalable, reliable, secure, and deliver measurable business value. The role will be remote.
The salary range provided reflects a good faith estimate based on factors such as experience, technical expertise, location, and relevant certifications. Final compensation will be determined according to the specific circumstances of each candidate.
Estimated salary range: $110,000 to $150,000 per year.
This posting is a current hiring need.
Levio offers a comprehensive and flexible benefits package designed to support your professional growth and personal wellbeing, including:
Position Details
Notice on the Use of Artificial Intelligence in Recruitment
We use AI-enabled tools to help sort and review applications based on job-related criteria. Final decisions regarding candidate progression are always made by a human recruiter.
Employment Equity
Levio subscribes to the principle of employment equity and applies an equal access employment program for women, Indigenous peoples, visible minorities, ethnic minorities, and persons with disabilities.
We value diversity and inclusion and are committed to creating a healthy, accessible, and rewarding work environment that highlights the unique contributions of our employees. Accommodations are available upon request for candidates participating in all aspects of the selection process.
Ready to apply?
Apply to Levio
We are seeking a Senior Machine Learning Engineer to join the Growth Tech Alliance. In this role, you will architect and deploy the robust infrastructure behind our intelligent marketing systems. You will be responsible for maturing algorithmic prototypes into high-performance production systems, ensuring our AI-driven marketing optimization is served reliably and autonomously at a global scale.
S'more about the team
We are hiring a Senior Machine Learning Engineer to take our AI tooling to the next level by architecting and deploying the robust infrastructure behind our intelligent marketing optimization systems. You will provide critical engineering execution for our AI initiatives, developing scalable microservices for predictive scoring and orchestrating complex LLM-based agents for creative intelligence. As the ML engineering expert for the team, you will drive the maturation of algorithmic prototypes into high-performance production systems with maximum Speed & Agility, shaping the future of how HelloFresh automates marketing at an unprecedented scale.
Lettuce share what this role will be responsible for
As a core member of the engineering team, you will focus on productionizing ML infrastructure across several domains:
Sound a-peeling? Here's what we're looking for
Let’s cut to the cheese, this is why you'll love it here
Flexible Hybrid Approach
At HelloFresh, we know that flexible work arrangements are essential in enabling you to do your best work, while balancing your personal and life needs. Offering remote work flexibility, along with the opportunity to interact and collaborate in the office are all a part of creating a great employee experience.
To meet these needs, we are pleased to provide Flexible Hybrid work. Flexible Hybrid is a people-first approach based on choice, trust, and personalization; it empowers teams to choose when and how often they work from the office and from home, in addition to team days and company days. This means a minimum of 2 days in office per week, with most teams in office 2-3 days a week.
#LI-HYBRID
#Engineering
HelloFresh Canada uses AI-integrated technology to help us process and evaluate applications more efficiently. This includes tools that screen and assess candidate qualifications based on the requirements for this role. While these tools assist our workflow, all final selection decisions are made by our hiring team.
This is a posting for an existing vacancy. We are actively seeking to fill this position.
Ready to apply?
Apply to HelloFresh
Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.
Join the team revolutionizing AI computing at Tenstorrent. You'll work on TT-Forge, our MLIR-based compiler that enables developers to run AI on all configurations of Tenstorrent hardware using an open-source, performant, and general-purpose compiler. You will be at the forefront of the AI hardware revolution, building compiler technologies that redefine what’s possible.
This role is hybrid and based out of Toronto, ON.
We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.
Who You Are
What We Need
What You Will Learn
Compensation for all engineers at Tenstorrent ranges from $100k to $500k, including base and variable compensation targets. Experience, skills, education, background, and location all impact the actual offer made.
Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.
This offer of employment is contingent upon the applicant being eligible to access U.S. export-controlled technology. Due to U.S. export laws, including those codified in the U.S. Export Administration Regulations (EAR), the Company is required to ensure compliance with these laws when transferring technology to nationals of certain countries (such as EAR Country Groups D:1, E1, and E2). These requirements apply to persons located in the U.S. and all countries outside the U.S. As the position offered will have direct and/or indirect access to information, systems, or technologies subject to these laws, the offer may be contingent upon your citizenship/permanent residency status or ability to obtain prior license approval from the U.S. Commerce Department or applicable federal agency. If employment is not possible due to U.S. export laws, any offer of employment will be rescinded.
Ready to apply?
Apply to Tenstorrent
Join the team revolutionizing AI computing at Tenstorrent. You'll work on TT-Forge, our MLIR-based compiler that enables developers to run AI on all configurations of Tenstorrent hardware using an open-source, performant, and general-purpose compiler. You will be at the forefront of the AI hardware revolution, building compiler technologies that redefine what’s possible.
This role is hybrid, and can be based out of Santa Clara, CA; Austin, TX; or Toronto, ON.
We welcome candidates at various experience levels for this role. During the interview process, candidates will be assessed for the appropriate level, and offers will align with that level, which may differ from the one in this posting.
Who You Are
What We Need
What You Will Learn
Compensation for all engineers at Tenstorrent ranges from $100k to $500k, including base and variable compensation targets. Experience, skills, education, background, and location all impact the actual offer made.
Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.
This offer of employment is contingent upon the applicant being eligible to access U.S. export-controlled technology. Due to U.S. export laws, including those codified in the U.S. Export Administration Regulations (EAR), the Company is required to ensure compliance with these laws when transferring technology to nationals of certain countries (such as EAR Country Groups D:1, E1, and E2). These requirements apply to persons located in the U.S. and all countries outside the U.S. As the position offered will have direct and/or indirect access to information, systems, or technologies subject to these laws, the offer may be contingent upon your citizenship/permanent residency status or ability to obtain prior license approval from the U.S. Commerce Department or applicable federal agency. If employment is not possible due to U.S. export laws, any offer of employment will be rescinded.
Ready to apply?
Apply to Tenstorrent
At Lyft, our purpose is to serve and connect. We aim to achieve this by cultivating a work environment where all team members belong and have the opportunity to thrive.
As a Data Scientist on the Mapping team, you will collaborate with our world class team of engineers, product managers, and designers to grow and improve the quality of recommended routes and accuracy of our travel time estimations. We're looking for a passionate, driven Data Scientist who is excited to dive into our spatial data and build a best-in-class mapping product that provides safe, efficient, and seamless navigation for our rideshare drivers.
Data Science is at the heart of Lyft’s products and decision-making. You will leverage data and rigorous, analytical thinking to shape our mapping products and make business decisions that put our customers first. This will involve identifying and scoping opportunities, shaping priorities, recommending technical solutions, designing experiments, and measuring the impact of new features. You will help us solve some of the most impactful problems in mapping, including:
Lyft is committed to creating an inclusive workforce that fosters belonging. Lyft believes that every person has a right to equal employment opportunities without discrimination because of race, ancestry, place of origin, colour, ethnic origin, citizenship, creed, sex, sexual orientation, gender identity, gender expression, age, marital status, family status, disability, pardoned record of offences, or any other basis protected by applicable law or by Company policy. Lyft also strives for a healthy and safe workplace and strictly prohibits harassment of any kind. Accommodation for persons with disabilities will be provided upon request in accordance with applicable law during the application and hiring process. Please contact your recruiter if you wish to make such a request.
Lyft highly values having employees working in-office to foster a collaborative work environment and company culture. This role will be in-office on a hybrid schedule — Team Members will be expected to work in the office at least 3 days per week, including on Mondays, Wednesdays, and Thursdays. Lyft considers working in the office at least 3 days per week to be an essential function of this hybrid role. Your recruiter can share more information about the various in-office perks Lyft offers. Additionally, hybrid roles have the flexibility to work from anywhere for up to 4 weeks per year. #Hybrid
The expected base pay range for this position in the Toronto area is $108,000 - $135,000, not inclusive of potential equity offering, bonus or benefits. Salary ranges are dependent on a variety of factors, including qualifications, experience and geographic location. Your recruiter can share more information about the salary range specific to your working location and other factors during the hiring process.
Lyft may use artificial intelligence to screen applicants, however, Lyft employees make the ultimate selection and hiring decisions.
This job fills an existing vacancy.
Ready to apply?
Apply to Lyft
Upwork Inc.’s (Nasdaq: UPWK) family of companies connects businesses with global, AI-enabled talent across every contingent work type including freelance, fractional, and payrolled. This portfolio includes the Upwork Marketplace, which connects businesses with on-demand access to highly skilled talent across the globe, and Lifted, which provides a purpose-built solution for enterprise organizations to source, contract, manage, and pay talent across the full spectrum of contingent work. From Fortune 100 enterprises to entrepreneurs, businesses rely on Upwork Inc. to find and hire expert talent, leverage AI-powered work solutions, and drive business transformation. With access to professionals spanning more than 10,000 skills across AI & machine learning, software development, sales & marketing, customer support, finance & accounting, and more, the Upwork family of companies enables businesses of all sizes to scale, innovate, and transform their workforces for the age of AI and beyond.
Since its founding, Upwork Inc. has facilitated more than $30 billion in total transactions and services as it fulfills its purpose to create opportunity in every era of work. Learn more about the Upwork Marketplace at Upwork.com and follow us on LinkedIn, Facebook, Instagram, TikTok, and X; and learn more about Lifted at Go-Lifted and follow on LinkedIn.
The AI Foundations team leads core research and development across the training, evaluation, and deployment of AI systems that power Uma, Upwork’s flagship AI model, and other customer-facing generative AI capabilities. As a Sr. Lead AI Research Scientist focused on AI Evaluation and Reliability, you will drive high-impact research initiatives that improve the trustworthiness, robustness, and real-world performance of AI systems operating at marketplace scale.
At the Sr. Lead level, this role combines deep technical expertise with cross-functional leadership. You will identify and lead research efforts that address systemic reliability challenges, partner closely with engineering and product teams to translate research into production outcomes, and help shape how Upwork evaluates AI performance in real work scenarios. Your work will support AI systems embedded in retrieval-based workflows, agentic architectures, and human plus AI collaboration patterns, while contributing to Upwork’s broader AI research strategy and external presence.
Lead applied research initiatives focused on AI evaluation, reliability, and robustness, defining success metrics tied to customer impact and production readiness.
Design and validate methods to measure and mitigate AI reliability risks, including uncertainty estimation, hallucination detection, and identification of model failure modes.
Partner cross-functionally with engineering, data science, and product teams to integrate research outcomes into customer-facing AI systems and workflows.
Own research projects end to end, from problem framing and hypothesis development through experimentation, prototyping, and synthesis of results.
Influence technical direction across teams by surfacing insights, proposing scalable solutions, and aligning stakeholders on priorities and tradeoffs.
Mentor researchers and engineers through technical guidance, feedback, and collaborative leadership on shared initiatives.
Contribute to Upwork’s external research footprint through publications, presentations, and engagement with the broader AI research community.
Proven experience leading applied AI research that balances scientific rigor with real-world deployment constraints and business impact.
A strong record of research contribution through publications, internal innovation, or demonstrable influence on production AI systems.
Deep proficiency with Python and modern deep learning frameworks such as PyTorch, with hands-on experience evaluating and improving large-scale models.
An adaptive approach to integrating AI tools into research and development workflows to accelerate experimentation, improve evaluation quality, and share best practices with others.
Come change how the world works.
Upwork is establishing an operational hub in Toronto, Canada. The new office is expected to be fully operational by Q4 2026. This role will require 3 days in office once the office opens.
This position will initially be employed through a partner to ensure a seamless hiring process while we establish the hub. Once the hub is established, there may be opportunities to transition to employment with Upwork, depending on business needs and other requirements. While employed by the partner, you’ll work as part of Upwork’s team, with access to our resources, culture, and growth opportunities.
Our partner will offer competitive benefits. When Upwork’s hub is established, we will be excited to offer employment and benefits directly as business needs require.
Upwork is committed to building a diverse, inclusive, and equitable workforce. Employment decisions are made without regard to race, color, religion, gender, sexual orientation, gender identity, national origin, disability, or any other status protected by applicable law.
We use BrightHire, an AI-enabled tool, to record interviews and summarize interview transcripts. The tool allows the interviewer to focus on the discussion and does not score or evaluate candidates or make recommendations. The interview transcripts are reviewed, and decisions are only made by humans. Candidates who prefer not to have their interview recorded through BrightHire can opt out when the interview is scheduled.
To learn more about how Upwork processes and protects your personal information as part of the application process, please review our Global Job Applicant Privacy Notice and the Applicant Privacy Addendum (Canada).
Ready to apply?
Apply to Upwork
We are a Canadian leader in digital automotive solutions. Our flagship brands — AutoTrader.ca, AutoSync, Dealertrack Canada and CMS — help Canadians buy, sell, and finance vehicles with confidence.
AutoTrader.ca is Canada’s largest automotive marketplace, with over 25 million monthly visits.
As part of AutoScout24 group, Europe’s largest online car marketplace, we’re shaping the future of automotive retail in Canada and beyond.
The base salary range for this position is CAD 180K – CAD 220K.
This range reflects the expected compensation at the time of posting. The final offer may vary and can be higher based on relevant skills, experience, location, and market conditions. Based on the role the total rewards package may also include benefits, bonus, and other employee offerings.
What's in it for you:
We understand that there is life at work and life outside of work. Here are a few of the benefits that support us in being our creative best.
For a career where you can drive our business and shape your future, apply now.
Use of Artificial Intelligence in Hiring: We use artificial intelligence (“AI”) in our hiring process, including to screen, assess, or select applicants for this position.
Vacancy Status: This job posting is for an existing vacancy.
Ready to apply?
Apply to AutoTrader.ca
About the Role:
The Machine Learning team at Tubi drives the innovation behind personalized user experiences. With the largest inventory in the industry and hundreds of millions of viewers, we tackle problems in the space of recommendations, search, content understanding and ads optimization that shape the future of streaming.
We are seeking a highly skilled Machine Learning Engineer to contribute to transformative projects in video personalization. In this role, you will design and implement advanced algorithms and systems to improve our personalization strategy. As a senior technical expert, you will tackle complex problems in machine learning at scale, collaborating closely with cross-functional teams to develop and optimize machine learning-driven solutions.
What You'll Do:
Your Background:
#LI-Hybrid #LI-SC1
Pursuant to state and local pay disclosure requirements, the annual pay range for this role is listed below, with the final offer amount dependent on education, skills, experience, and location. This role is also eligible for an annual discretionary bonus, long-term incentive plan, and various benefits including medical/dental/vision insurance, a 401(k) plan, paid time off, and other benefits in accordance with applicable plan documents.
High-cost labor markets, such as but not limited to Los Angeles, New York City, and San Francisco
Tubi is a division of Fox Corporation, and the FOX Employee Benefits summarized here cover the majority of US employee benefits. The following distinctions outline the differences between Tubi and FOX benefits:
Boldly built for every fandom, Tubi is a free streaming service that entertains over 100 million monthly active users. Tubi offers the world's largest collection of Hollywood movies and TV shows, thousands of creator-led stories and hundreds of Tubi Originals made for the most passionate fans. Headquartered in San Francisco and founded in 2014, Tubi is part of Tubi Media Group, a division of Fox Corporation.
We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, gender identity, disability, protected veteran status, or any other characteristic protected by law. We will consider for employment qualified applicants with criminal histories consistent with applicable law.
Apply to Tubi
We are seeking a Director of Machine Learning Engineering and Infrastructure to lead a hybrid team bridging advanced ML engineering with world-class infrastructure design. In this role, you will own the strategic direction and execution for scaling our machine learning capabilities while ensuring our distributed systems and infrastructure can support innovation at massive scale. You will combine technical depth with leadership excellence to guide teams that deliver both foundational ML systems and high-performance distributed services.
What You'll Do:
Your Background:
Apply to Tubi
MaintainX is the world’s leading mobile-first Asset and Work Intelligence platform for industrial and frontline environments. We’re a modern, IoT-enabled, cloud-based solution that powers maintenance, safety, and operations on physical equipment and facilities.
We help 12,000+ organizations—including Duracell, Univar Solutions, Titan America, McDonald’s, Brenntag, Cintas, Xylem, and Shell—achieve operational excellence and reliability at scale.
Following our $150 million Series D led by Bain Capital Ventures, Bessemer Ventures, August Capital, Amity Ventures, and Ridge Ventures, MaintainX has raised a total of $254 million, valuing the company at $2.5 billion.
As we enter our next phase of growth, we’re investing deeply in AI/ML, LLMs, and Industrial IoT to transform how frontline teams operate—predicting failures before they happen, automating workflows, and embedding intelligence into every asset and procedure.
What you’ll do:
About you:
Bonus skills:
What’s in it for you:
About us:
We exist to make the lives of frontline and maintenance teams easier by building software that meets their real-world needs. Our product transforms how 80% of the global workforce—those who don’t sit behind a desk—manage their operations, assets, and teams.
MaintainX is committed to creating a diverse environment. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status.
Apply to MaintainX
We are seeking a highly skilled and motivated Senior Applied Machine Learning Developer to guide the technical direction and architecture of our Predictive Maintenance and Asset Intelligence initiatives.
You’ll combine deep ML expertise with strong software development and leadership skills—mentoring developers, scaling systems, and driving the roadmap for AI-enabled maintenance intelligence across thousands of industrial sites.
This role sits at the intersection of ML architecture, IoT data systems, and product impact, shaping the foundation for MaintainX’s predictive and generative AI strategy.
What you’ll do:
About you:
Bonus skills:
What’s in it for you:
About us:
Apply to MaintainX
We are seeking a highly skilled and motivated Senior Applied Machine Learning Developer to guide the technical direction and architecture of our Predictive Maintenance and Asset Intelligence initiatives.
You will combine deep machine learning expertise with strong software engineering and leadership skills. You will mentor engineers, scale systems, and drive the roadmap for AI-powered maintenance intelligence across thousands of industrial sites.
This role sits at the intersection of ML architecture, IoT data systems, and product impact, and forms the foundation of MaintainX's predictive and generative AI strategy.
What you'll do:
About you:
Special consideration will be given to candidates with the following skills:
What's in it for you:
About us:
MaintainX is committed to creating a diverse environment. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender identity or expression, sexual orientation, national origin, genetics, disability, age, or veteran status.
Apply to MaintainX
What you'll do:
About you:
Bonus skills:
What's in it for you:
About us:
Apply to MaintainX
Cresta is on a mission to turn every customer conversation into a competitive advantage by unlocking the true potential of the contact center. Our platform combines the best of AI and human intelligence to help contact centers discover customer insights and behavioral best practices, automate conversations and inefficient processes, and empower every team member to work smarter and faster. Born from the prestigious Stanford AI Lab, Cresta's co-founder and chairman is Sebastian Thrun, the force behind Google X, Waymo, Udacity, and more. Our leadership also includes CEO Ping Wu, co-founder of Google Contact Center AI and the Vertex AI platform, and co-founder Tim Shi, an early member of OpenAI.
Join us on this thrilling journey to revolutionize the workforce with AI. The future of work is here, and it's at Cresta.
At Cresta, the Knowledge Assist (KA) team develops AI solutions for the contact center industry, focusing on improving agent productivity by providing access to the right knowledge at the right time.
Our current projects:
Our internships offer a dynamic, fast-paced environment where you’ll collaborate with top researchers and engineers in the field. We provide opportunities for interns to make significant contributions to AI research and apply novel techniques at scale.
This is a unique opportunity to shape the future of AI at Cresta by solving complex problems and bringing breakthrough AI advancements into production environments.
Responsibilities:
Perks & Benefits:
Compensation for this position includes a base salary, equity, and a variety of benefits. Actual base salaries will be based on candidate-specific factors, including experience, skillset, and location, and local minimum pay requirements as applicable. We are actively hiring for this role in the US and Canada. Your recruiter can provide further details.
This posting will be used to fill a newly-created role.
We have noticed a rise in recruiting impersonations across the industry, where scammers attempt to access candidates' personal and financial information through fake interviews and offers. All Cresta recruiting email communications will always come from the @cresta.ai domain. Any outreach claiming to be from Cresta via other sources should be ignored. If you are uncertain whether you have been contacted by an official Cresta employee, reach out to recruiting@cresta.ai.
Apply to Cresta
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.
Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of scale, transforming key workloads with ultra-high-speed inference.
Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order of magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.
About the Role
We are seeking a versatile and experienced engineer to join our SOTA Training Platform team. This team is responsible for rapidly bringing up state-of-the-art open-source models (such as LLaMA and Qwen) or customer-provided proprietary models on our Cerebras CSX systems. Success in this role requires a system-minded generalist who thrives in fast-paced bring-up environments and is comfortable working across the entire Cerebras software stack.
Your work will play a critical role in achieving unprecedented levels of performance, efficiency, and scalability for AI applications.
People who are serious about software make their own hardware. At Cerebras we have built a breakthrough architecture that is unlocking new opportunities for the AI industry. With dozens of model releases and rapid growth, we’ve reached an inflection point in our business. Members of our team tell us there are five main reasons they joined Cerebras:
Read our blog: Five Reasons to Join Cerebras in 2026.
Cerebras Systems is committed to creating an equal and diverse environment and is proud to be an equal opportunity employer. We celebrate different backgrounds, perspectives, and skills. We believe inclusive teams build better products and companies. We try every day to build a work environment that empowers people to do their best work through continuous learning, growth and support of those around them.
This website or its third-party tools process personal data. For more details, click here to review our CCPA disclosure notice.
Apply to Cerebras Systems
About The Role
The Inference ML Engineering team at Cerebras Systems is dedicated to enabling our fast generative inference solution through simple APIs powered by a distributed runtime that runs on large clusters of our own hardware. Our mission is to empower enterprises, developers, and researchers to unlock the full potential of our platform, leveraging its performance, scalability, and flexibility. The team works closely with cross-functional groups, including compiler developers, cluster orchestrators, ML scientists, cloud architects, and product teams, to deliver high-impact solutions that redefine the boundaries of ML performance and usability.
As a Senior Software Engineer on the Inference ML Engineering team, you will play a key role in designing and implementing APIs, ML features, and tools that enable running state-of-the-art generative AI models on our custom hardware. You will architect solutions that enable seamless model translation and execution, ensuring high throughput and low latency, while maintaining ease of use. Your responsibilities will include leading technical initiatives, collaborating with other engineering teams to enhance the developer experience, enabling key ML features at scale, maintaining our speed advantage, achieving high throughput, and supporting a wide range of ML workloads. This role offers an opportunity to shape the evolution of our ML ecosystem while tackling complex technical challenges at the intersection of machine learning, software, and hardware.
Responsibilities
Skills and Qualifications
Apply to Cerebras Systems
We are building the next generation of large-scale AI systems that power training and inference workloads at unprecedented scale and efficiency.
You will design and develop high-performance distributed software that orchestrates massive compute and data pipelines across heterogeneous clusters. Your work will push the limits of concurrency, throughput, and scalability—enabling efficient execution of models at massive scale. This role sits at the intersection of systems engineering and machine learning performance, demanding both architectural depth and low-level implementation skills. You will help shape how models are executed and optimized end-to-end, from data ingestion to distributed execution, across cutting-edge hardware platforms.
We’re hiring for runtime roles across both Training and Inference.
Apply to Cerebras Systems
Cerebras builds wafer-scale AI processors—single chips delivering tens of PB/s of memory bandwidth and a dataflow architecture that accelerates at a granularity no multi-device system can match. The Advanced Technology Group (ATG) is Cerebras’ pathfinding organization. We work ahead of product to explore new architectures, demonstrate breakthrough performance on scientific and AI workloads, and shape the technical roadmap for future Cerebras hardware and software. Our work regularly appears at top-tier venues (Supercomputing, SIAM, IEEE, and NeurIPS) and directly influences the design of next-generation wafer-scale systems.
Most AI research today is shaped by the constraints of existing hardware. This role starts from the other direction: what would you build if the architecture let you rethink the fundamentals? You will design and develop AI models and training methodologies on wafer-scale hardware, working at the level of optimization theory, model architecture, and statistical foundations rather than assembling existing components.
The ATG sits at the intersection of AI, computational science, and computer architecture, and your work will draw on all three. You will collaborate closely with Cerebras’ ASIC, compiler, kernel, and AI teams as well as external partners at universities and national laboratories.
We are hiring for multiple positions across experience levels. If this work resonates, we encourage you to apply.
Apply to Cerebras Systems
As an Applied Machine Learning Research Scientist at Cerebras, you will play a key role in turning modern machine learning techniques into scalable, high-performance systems. This role sits at the intersection of modeling and systems, focused not on publishing new algorithms but on understanding how they work and making them run effectively at scale. Your work will directly impact how large language models (LLMs) are trained, optimized, and deployed on one of the most advanced AI platforms in the world.
You will work closely with researchers and senior engineers to implement and improve workflows for LLM pretraining, fine-tuning, and reinforcement learning-based post-training. This includes building training pipelines, debugging complex system behaviors, improving model quality, and iterating on data and evaluation strategies. Your contributions will help translate cutting-edge ML ideas into reliable, production-ready systems that solve real-world problems.
This role is ideal for candidates who enjoy hands-on engineering, want to build deep intuition for ML systems, and are excited about working on LLMs and reinforcement learning in practice, not just in theory.
Responsibilities
Skills & Qualifications
Preferred Skills & Qualifications
Ready to apply?
Apply to Cerebras Systems
Cerebras Systems builds the world's largest AI chip, 56 times larger than GPUs. Our novel wafer-scale architecture provides the AI compute power of dozens of GPUs on a single chip, with the programming simplicity of a single device. This approach allows Cerebras to deliver industry-leading training and inference speeds and empowers machine learning users to effortlessly run large-scale ML applications, without the hassle of managing hundreds of GPUs or TPUs.
Cerebras' current customers include top model labs, global enterprises, and cutting-edge AI-native startups. OpenAI recently announced a multi-year partnership with Cerebras to deploy 750 megawatts of compute, transforming key workloads with ultra high-speed inference.
Thanks to the groundbreaking wafer-scale architecture, Cerebras Inference offers the fastest Generative AI inference solution in the world, over 10 times faster than GPU-based hyperscale cloud inference services. This order of magnitude increase in speed is transforming the user experience of AI applications, unlocking real-time iteration and increasing intelligence via additional agentic computation.
The Inference ML Engineering team at Cerebras builds the runtime, APIs, and systems that power the fastest generative AI inference platform in the world.
As an Engineering Manager, Inference ML Runtime, you will lead a team responsible for designing and scaling the systems that enable seamless execution of state-of-the-art AI models on Cerebras hardware. You will operate at the intersection of machine learning, distributed systems, and high-performance runtime engineering, translating cutting-edge research into production-ready infrastructure to serve a variety of text-only and multimodal models.
This role combines technical leadership, people management, and execution ownership, with direct impact on Cerebras’ core inference platform.
Technical Leadership
Team Leadership
Execution & Delivery
Platform & Performance Ownership
Cross-Functional Collaboration
Required
Preferred
Why This Role Matters
This team is central to Cerebras’ mission of delivering the fastest AI inference in the world. Your work will directly enable real-time AI applications and unlock new capabilities across enterprise and frontier AI use cases.
Ready to apply?
Apply to Cerebras Systems
About the Role
We are seeking a versatile and experienced engineer to join our Inference Core Model Bringup team. This team is responsible for rapidly bringing up state-of-the-art open-source models (e.g., LLaMA, Qwen) and customer-provided proprietary models on our Cerebras CSX systems. Success in this role requires a system-minded generalist who thrives in fast-paced bringup environments and is comfortable working across the entire Cerebras software stack.
Your work will play a critical role in achieving unprecedented levels of performance, efficiency, and scalability for AI applications.
Ready to apply?
Apply to Cerebras Systems
As a Kernel Engineer on our team, you will work at the intersection of hardware and software, developing high-performance software for cutting-edge AI and HPC workloads. Your focus will be on implementing, optimizing, and scaling deep learning operations to fully leverage our custom, massively parallel processor architecture.
You will be part of a world-class team responsible for the design, performance tuning, and validation of foundational ML and HPC kernels. This includes building a library of parallel and distributed algorithms that maximize compute utilization and push the boundaries of training efficiency for state-of-the-art AI models. Your work will be critical to unlocking the full potential of our hardware and accelerating the pace of AI innovation.
Responsibilities
Skills And Qualifications
Preferred Skills And Qualifications
Ready to apply?
Apply to Cerebras Systems
About Faire
Faire is a technology wholesale platform built on the belief that the future is local. Independent retailers around the globe collectively represent a multi-hundred-billion-dollar wholesale market that has historically been fragmented and offline. At Faire, we're using the power of tech, data, and machine learning to connect this thriving community of entrepreneurs across the globe. Picture your favorite boutique in town — we help them discover the best products from around the world to sell in their stores. With the right tools and insights, we believe that we can level the playing field so businesses can grow and local communities can thrive.
We’re looking for smart, resourceful and passionate people to join us as we power the shop local movement. If you believe in community, come join ours.
About this Role
The Ads Data team is building the next generation of advertising products for the wholesale industry. As a key member of this team, you’ll help drive the ML algorithm strategy and system design behind one of the most critical levers for customer value and company growth—Search Ads. You’ll lead the advancement of real-time systems that decide which ads to show for a query, where to place them, and how to optimize for relevance, marketplace health, and advertiser outcomes. This role mirrors many of the technical expectations of Faire’s organic Search roles (modern NLP/LLMs, query understanding, real-time ranking), while operating in an ads environment with auctions, budgets, and pacing constraints.
You’ll operate at the forefront of algorithms—combining large language models, natural language processing, query understanding, deep learning, and structured behavioral data to deliver highly relevant sponsored results for any given query.
What You'll Do
Qualifications
Great to Haves
Salary Range
San Francisco: the pay range for this role is $196,000 to $269,500 per year.
This role will also be eligible for equity and benefits. Actual base pay will be determined based on permissible factors such as transferable skills, work experience, market demands, and primary work location. The base pay range provided is subject to change and may be modified in the future.
Hybrid Faire employees currently go into the office 3 days per week on Tuesdays, Thursdays, and a third flex day of their choosing (Monday, Wednesday, or Friday). Additionally, hybrid in-office roles will have the flexibility to work remotely up to 4 weeks per year. Specific Workplace and Information Technology positions may require onsite attendance 5 days per week, as indicated in the job posting.
Why you’ll love working at Faire
Faire was founded in 2017 by a team of early product and engineering leads from Square. We’re backed by some of the top investors in retail and tech including: Y Combinator, Lightspeed Venture Partners, Forerunner Ventures, Khosla Ventures, Sequoia Capital, Founders Fund, and DST Global. We have headquarters in San Francisco and Kitchener-Waterloo, and a global employee presence across offices in Toronto, London, and New York. To learn more about Faire and our customers, you can read more on our blog.
Faire provides equal employment opportunities (EEO) to all employees and applicants for employment without regard to race, color, religion, sex, national origin, age, disability, genetics, sexual orientation, gender identity or gender expression.
Faire is committed to providing access, equal opportunity and reasonable accommodation for individuals with disabilities in employment, its services, programs, and activities. Accommodations are available throughout the recruitment process and applicants with a disability may request to be accommodated throughout the recruitment process. We will work with all applicants to accommodate their individual accessibility needs. To request reasonable accommodation, please fill out our Accommodation Request Form (https://bit.ly/faire-form)
Privacy
For information about the type of personal data Faire collects from applicants, as well as your choices regarding the data collected about you, please visit Faire’s Privacy Notice (https://www.faire.com/privacy)
Ready to apply?
Apply to Faire
About this Role
The Ads Data team is building the next generation of advertising products for the wholesale industry. As a key member of this team, you’ll help drive the ML algorithm strategy and system design behind one of the most critical levers for customer value and company growth—Search Ads. You’ll lead the advancement of real-time systems that decide which ads to show for a query, where to place them, and how to optimize for relevance, marketplace health, and advertiser outcomes. This role mirrors many of the technical expectations of Faire’s organic Search roles (modern NLP/LLMs, query understanding, real-time ranking), while operating in an ads environment with auctions, budgets, and pacing constraints.
You’ll operate at the forefront of algorithms—combining large language models, natural language processing, query understanding, deep learning, and structured behavioral data to deliver highly relevant sponsored results for any given query.
What You'll Do
Qualifications
Great to Haves
Salary Range
San Francisco: the pay range for this role is $165,500 to $227,500 per year.
This role will also be eligible for equity and benefits. Actual base pay will be determined based on permissible factors such as transferable skills, work experience, market demands, and primary work location. The base pay range provided is subject to change and may be modified in the future.
Ready to apply?
Apply to Faire
Affinity stitches together billions of data points from massive datasets to create a powerful, accurate representation of the world's professional relationship graph. Based on this data, we offer our users the insights and visibility they need to nurture and tap into the opportunities in their team's network.
This role is part of the AI Platform team, which owns the AI services that power Affinity's industry-leading relationship intelligence platform. We extract and retrieve information from billions of structured and unstructured data points to deliver actionable insights to customers.
As a Senior Machine Learning Engineer, you will collaborate with data engineers, software engineers, and product managers to shape the future of private capital's leading CRM platform. You will design and build AI systems that efficiently uncover insights from compelling business interaction data – an exciting and unique opportunity within the industry.
This is an applied machine learning position with a strong emphasis on engineering, rather than research. You will play a key role in advancing our ML Engineering capabilities, particularly in information retrieval and eventually recommendation systems.
What you’ll be doing:
Qualifications
Don’t meet every single requirement? Studies have shown that women and people of color are less likely to apply to jobs unless they meet every qualification. At Affinity, we are dedicated to building a diverse, inclusive, and authentic workplace, so if you’re excited about this role but your past experience doesn’t perfectly align with the qualifications above, we encourage you to apply anyway. You may be just the right candidate for this or other roles.
Required:
Nice to Have:
Tech stack: Our ML pipeline comprises multiple Python services that support various AI features, including extracting information from unstructured data with OCR, serving embedding models that vectorize document chunks, and ranking lists of recommendations by relevance and user preference.
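To make that extract-embed-rank flow concrete, here is a minimal, self-contained Python sketch of the three stages. Every name below is illustrative only and is not Affinity's actual code or API; the character-frequency "embedding" is a stand-in for a real learned model.

```python
# Hypothetical sketch of the pipeline stages described above:
# extract text -> embed chunks -> rank candidates by similarity.
from dataclasses import dataclass
from math import sqrt

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def embed(text: str) -> list[float]:
    """Stand-in embedding: a unit-normalized character-frequency vector.
    A production service would call a learned embedding model instead."""
    counts = [text.lower().count(ch) for ch in ALPHABET]
    norm = sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

@dataclass
class Chunk:
    """One extracted text chunk (e.g., OCR output) plus its vector."""
    text: str
    vector: list[float]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already unit-normalized, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

def rank(query: str, chunks: list[Chunk]) -> list[Chunk]:
    """Order chunks by similarity to the query, most relevant first."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, c.vector), reverse=True)

# Usage: chunk some "extracted" documents, vectorize, then rank a query.
docs = ["quarterly revenue update", "intro meeting with acme", "revenue forecast model"]
index = [Chunk(t, embed(t)) for t in docs]
best = rank("revenue", index)[0].text
```

In a real retrieval stack the sorting step would be replaced by an approximate nearest-neighbor index, but the interface shape (query in, ranked chunks out) is the same.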
How we work:
Our culture is a key part of how we operate, as well as our hiring process:
If you’d want to learn more about our values click here.
What you'll enjoy at Affinity:
Please note that the role compensation details below reflect the base salary only and do not include any equity or benefits. This represents the salary range that Affinity believes, in good faith, at the time of this posting, that it will pay for the posted job.
A reasonable estimate of the current range is $160,000 to $220,000 CAD. Within the range, individual pay depends on various factors, including geographical location and review of the applicant's experience, knowledge, skills, and abilities.
At Affinity, we may use artificial intelligence (AI) tools as part of our recruitment process to help screen and evaluate candidate applications. While AI assists our hiring team in managing applications efficiently, it never replaces decisions made by real people. We are committed to fair and transparent hiring practices, and our AI tools are regularly monitored to ensure they support—not replace—human judgment.
Additional Information:
About Affinity
With more than 3,000 customers worldwide and backed by some of Silicon Valley's best firms, Affinity has raised $120M to empower dealmakers to find, manage, and close more deals. How? Our Relationship Intelligence platform uses the wealth of data exhaust from trillions of interactions between Investment Bankers, Venture Capitalists, Consultants, and other strategic dealmakers to deliver automated relationship insights that drive over 450,000 deals every month. We are proud to have received Inc. and Fortune Best Workplaces awards as well as to be Great Places to Work certified for the last 5 years running. Join us on our mission to make it possible for anyone to cultivate and fully harness their network to succeed.
We use E-Verify
Our company uses E-Verify to confirm the employment eligibility of all newly hired employees. To learn more about E-Verify, including your rights and responsibilities, please visit www.dhs.gov/E-Verify.
Ready to apply?
Apply to Affinity.co
About this role
As a Senior Applied AI/ML Scientist on the Search ranking team, you will help shape the technical vision, machine-learning algorithm strategy, and system design behind one of our most important growth levers: Search (the primary tool used by customers on any e-commerce site). You will advance real-time search and recommendation systems that power next-generation shopping experiences.
You’ll work at the frontier of algorithms, combining query understanding, deep learning, transformer-based sequential modeling, graph neural networks, and structured behavioral data to return hyper-relevant, personalized products and brands for every user query.
This is a rare chance to influence the end-to-end personalized discovery experience at Faire within a high-scale, deeply multi-modal environment, while collaborating closely with a talented team of scientists and engineers.
What you'll do
You're a great fit if you have...
Bonus points for...
Salary Range
California: the pay range for this role is $192,000 to $264,000 per year.
This role will also be eligible for equity and benefits. Actual base pay will be determined based on permissible factors such as transferable skills, work experience, market demands, and primary work location. The base pay range provided is subject to change and may be modified in the future.
Ready to apply?
Apply to Faire
We are an innovative AI startup focused on transforming professional services through cutting-edge Generative AI and deep domain expertise. Our agent-driven solutions automate complex workflows, engaging humans only when needed to maximize efficiency and accuracy. Join us at the forefront of AI innovation, where your expertise will directly shape the future of professional services.
Ready to apply?
Apply to Evolver
At HeyGen, our mission is to make visual storytelling accessible to all. Over the last decade, visual content has become the preferred method of information creation, consumption, and retention. But the ability to create such content, in particular videos, continues to be costly and challenging to scale. Our ambition is to build technology that equips more people with the power to reach, captivate, and inspire audiences.
Learn more at www.heygen.com. Visit our Mission and Culture doc here.
We are seeking a seasoned Technical Leader to build and scale the foundational compute infrastructure that powers our state-of-the-art AI models—from multimodal training data pipelines to high-throughput, low-latency video generation.
You will be the core engineer responsible for building the robust, efficient, and scalable platform that enables our research and production teams to rapidly iterate on HeyGen's generative video models. Your contributions will directly impact model performance, developer productivity, and the final quality of every AI-generated video.
Optimize GPU Utilization: Design and implement mechanisms to aggressively optimize GPU and cluster utilization across thousands of devices for inference, training, data processing, and large-scale deployment of our state-of-the-art video generation models.
Develop Large-Scale AI Job Framework: Build highly scalable, reliable frameworks for launching and managing massive, heterogeneous compute jobs, including multi-modal high-volume data ingestion/processing, distributed model training, and continuous evaluation/benchmarking.
Enhance Observability: Develop world-class observability, tracing, and visualization tools for our compute cluster to ensure reliability, diagnose performance bottlenecks (e.g., memory, bandwidth, communication).
Accelerate Pipelines: Collaborate closely with AI researchers and AI engineers to integrate innovative acceleration techniques (e.g., custom CUDA kernels, distributed training libraries) into production-ready, scalable training and inference pipelines.
Infrastructure Management: Champion the adoption and optimization of modern cloud and container technologies (Kubernetes, Ray) for elastic, cost-efficient scaling of our distributed systems.
We are looking for a highly motivated engineer with deep experience operating and optimizing AI infrastructure at scale.
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
5+ years of full-time industry experience in large-scale MLOps, AI infrastructure, or HPC systems.
Experience with data frameworks and standards such as Ray, Apache Spark, and LanceDB.
Strong proficiency in Python and a high-performance language such as C++ for developing core infrastructure components.
Deep understanding and hands-on experience with modern orchestration and distributed computing frameworks such as Kubernetes and Ray.
Experience with core ML frameworks such as PyTorch, TensorFlow, or JAX.
Master's or PhD in Computer Science or a related technical field.
Demonstrated Tech Lead experience, driving projects from conceptual design through to production deployment across cross-functional teams.
Prior experience building infrastructure specifically for Generative AI models (e.g., diffusion models, GANs, or large language models) where cost and latency are critical.
Proven background in building and operating large-scale data infrastructure (e.g., Ray, Apache Spark) to manage petabytes of multi-modal data (video, audio, text).
HeyGen is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
Ready to apply?
Apply to HeyGen
We are seeking a seasoned Software Engineer to build and scale the foundational compute infrastructure that powers our state-of-the-art AI models—from multimodal training data pipelines to high-throughput, low-latency video generation.
You will be the core engineer responsible for building the robust, efficient, and scalable platform that enables our research and production teams to rapidly iterate on HeyGen's generative video models. Your contributions will directly impact model performance, developer productivity, and the final quality of every AI-generated video.
Optimize GPU Utilization: Design and implement mechanisms to aggressively optimize GPU and cluster utilization across thousands of devices for inference, training, data processing, and large-scale deployment of our state-of-the-art video generation models.
Develop Large-Scale AI Job Framework: Build highly scalable, reliable frameworks for launching and managing massive, heterogeneous compute jobs, including multi-modal high-volume data ingestion/processing, distributed model training, and continuous evaluation/benchmarking.
Enhance Observability: Develop world-class observability, tracing, and visualization tools for our compute cluster to ensure reliability, diagnose performance bottlenecks (e.g., memory, bandwidth, communication).
Accelerate Pipelines: Collaborate closely with AI researchers and AI engineers to integrate innovative acceleration techniques (e.g., custom CUDA kernels, distributed training libraries) into production-ready, scalable training and inference pipelines.
Infrastructure Management: Champion the adoption and optimization of modern cloud and container technologies (Kubernetes, Ray) for elastic, cost-efficient scaling of our distributed systems.
We are looking for a highly motivated engineer with deep experience operating and optimizing AI infrastructure at scale.
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical experience.
5+ years of full-time industry experience in large-scale MLOps, AI infrastructure, or HPC systems.
Experience with data frameworks and standards such as Ray, Apache Spark, and LanceDB.
Strong proficiency in Python and a high-performance language such as C++ for developing core infrastructure components.
Deep understanding and hands-on experience with modern orchestration and distributed computing frameworks such as Kubernetes and Ray.
Experience with core ML frameworks such as PyTorch, TensorFlow, or JAX.
Preferred Qualifications:
Master's or PhD in Computer Science or a related technical field.
Demonstrated Tech Lead experience, driving projects from conceptual design through to production deployment across cross-functional teams.
Prior experience building infrastructure specifically for Generative AI models (e.g., diffusion models, GANs, or large language models) where cost and latency are critical.
Proven background in building and operating large-scale data infrastructure (e.g., Ray, Apache Spark) to manage petabytes of multi-modal data (video, audio, text).
HeyGen is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
Ready to apply?
Apply to HeyGen
Position Summary:
At HeyGen, we are at the forefront of developing applications powered by our cutting-edge AI research. As a Data Infrastructure Engineer, you will lead the development of fundamental data systems and infrastructure. These systems are essential for powering our innovative applications, including Avatar IV, Photo Avatar, Instant Avatar, Interactive Avatar, and Video Translation. Your role will be crucial in enhancing the efficiency and scalability of these systems, which are vital to HeyGen's success.
Key Responsibilities:
Qualifications:
Preferred Qualifications:
What HeyGen Offers:
HeyGen is an Equal Opportunity Employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
Driving Product Innovation with Advanced Computer Vision Technology
HeyGen is a dynamic startup at the forefront of revolutionizing video content with state-of-the-art artificial intelligence technology. With a robust track record of growth and a proven business model, we are rapidly carving out a leadership position in the AI-powered video creation industry. We are in search of passionate Research Scientists with a specialization in Computer Vision to propel our mission forward. If you are eager to deploy your advanced skills to disrupt and redefine the video creation and interaction experience, your next big opportunity awaits with us.
Key Responsibilities:
Applied Innovation and Product Development: Harness your knowledge of generative models, including GANs, diffusion models, and other multi-modal models, to propel our video creation tools forward. Your role involves translating state-of-the-art research into innovative features that directly enhance our user experience and set our products apart.
Real-World Solutions: Channel your research capabilities to develop scalable solutions that have a significant impact on our platform, making the video creation process more intuitive, powerful, and accessible to our users.
Collaborative Engineering and Growth: Engage with a diverse team where collaborative efforts lead to extraordinary outcomes. Embrace an environment that promotes learning, mentorship, and teamwork, furthering both personal development and the advancement of our technology.
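For readers unfamiliar with the diffusion models named above: their forward process has a simple closed form, where a clean sample is progressively scaled down and Gaussian noise scaled up. A minimal, hypothetical sketch of the standard DDPM noising schedule (illustrative only, not HeyGen's models):

```python
import math
import random

def make_alpha_bar(T: int = 1000, beta_start: float = 1e-4, beta_end: float = 0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule."""
    alpha_bar, prod = [], 1.0
    for t in range(T):
        beta = beta_start + (beta_end - beta_start) * t / (T - 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar

def q_sample(x0, t, alpha_bar, rng=random):
    """Sample x_t ~ q(x_t | x_0): shrink the clean signal, add Gaussian noise."""
    a = alpha_bar[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for x in x0]

alpha_bar = make_alpha_bar()
# alpha_bar shrinks toward 0, so the signal is progressively destroyed;
# a trained model learns to reverse this process step by step.
print(alpha_bar[0], alpha_bar[-1])
```

Training a diffusion model then amounts to predicting the injected noise at randomly sampled timesteps `t`; generation runs the learned reverse process from pure noise.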
Qualifications:
What HeyGen Offers: