AI Compiler Engineer

EnCharge AI · U.S., Canada, Germany, Norway, India

Software · Canada · Germany · Norway · Remote - US · Posted Apr 17, 2026

EnCharge AI is a leader in advanced AI hardware and software systems for edge-to-cloud computing. EnCharge’s robust and scalable next-generation in-memory computing technology provides orders-of-magnitude higher compute efficiency and density compared to today’s best-in-class solutions. The high-performance architecture is coupled with seamless software integration and will enable the immense potential of AI to be accessible in power, energy, and space constrained applications. EnCharge AI launched in 2022 and is led by veteran technologists with backgrounds in semiconductor design and AI systems.

About the Role

EnCharge AI is seeking a highly skilled and experienced AI Compiler Engineer to lead the development and optimization of graph compilers tailored to cutting-edge AI and ML workloads. You will collaborate with hardware architects and AI researchers to enhance performance, optimize computation graphs, and enable efficient model deployment on EnCharge’s inference accelerators.

Responsibilities

Architect, design, and implement optimizations for AI model execution on graph compilers to improve performance, reduce latency, and maximize hardware utilization.
Work closely with ML researchers, hardware engineers, and software developers to design and deploy AI models, understanding and addressing hardware-specific challenges.
Work on performance optimizations for neural network models, such as layer fusion, operator fusion, and graph-level transformations.
Develop compiler optimizations and passes that convert high-level AI models (e.g., from TensorFlow, PyTorch) into intermediate representations (IR).
Implement parsing, semantic analysis, and IR generation for deep learning frameworks.
Research and integrate the latest advancements in compiler design, ML model optimizations, and hardware acceleration into graph compilers.
Provide leadership, mentorship, and technical guidance to a team of engineers focused on graph compiler optimizations.

Qualifications

Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or related field (Ph.D. preferred).
3+ years in compiler development, with a strong focus on AI or ML graph compilers.
Proficiency in AI graph compiler frameworks (e.g., MLIR, Torch-FX).
Solid background in hardware architectures (e.g., GPUs, TPUs, ASICs) and optimization techniques such as fusion, quantization, and tiling.
Familiarity with neural network operators and code generation.
Strong understanding of intermediate representations, code parsing, and semantic analysis in compiler design.
Proficiency in C++, Python, or other programming languages commonly used in compiler development.
Open-source contributions to AI software frameworks and libraries are a plus.
Demonstrated experience leading and mentoring engineering teams with successful project delivery.

EnCharge AI is an equal employment opportunity employer in the United States.

AI Software Engineer

EnCharge AI · Bengaluru, Karnataka (or Remote-friendly with travel)

Software · India · Posted Mar 24, 2026

About EnCharge AI:

EnCharge AI is building the next-generation AI platform. Our novel in-memory-computing architecture delivers a 10x step-function improvement in compute energy efficiency and performance for AI inference workloads. As the demands of artificial intelligence move beyond today's models, we believe the fundamental underlying infrastructure must evolve. We are an experienced team of AI researchers, silicon & systems engineers, and architects backed by leading investors, poised to become the essential platform for the next wave of AI innovation.

The Opportunity:

Video generation represents one of the most compute-intensive frontiers in AI—and one of the most promising applications for our hardware's energy efficiency advantages. We're building a vertically-integrated video generation stack that will showcase the transformative potential of our silicon while delivering real value to customers today.

We are seeking a Software Engineer to build the infrastructure and applications that bring our video generation capabilities to market. You'll build the serving stack and customer-facing APIs, and develop agentic systems that demonstrate what's possible when video generation meets energy-efficient hardware.

This is a foundational role. You won't be inheriting a mature codebase—you'll be architecting production systems from scratch, making critical technical decisions, and building software that directly enables our go-to-market motion.

Key Responsibilities:

Inference & Serving Infrastructure: Design and build scalable serving infrastructure for video generation models (Wan, LTX-Video, Flux, and beyond). Own latency, throughput, reliability, and cost optimization.

API & Platform Development: Build robust APIs and SDKs that enable customers and partners to integrate video generation into their products. Design abstractions that balance flexibility with ease-of-use.

Video Applications & Demos: Develop compelling demo applications that showcase our platform's capabilities—interactive experiences, batch processing workflows, and vertical-specific solutions that support the GTM team.

Agentic Workflows: Build agentic systems that leverage video generation as a core capability—autonomous video editing pipelines, multi-step generation workflows, and tool-use patterns that extend what's possible with video AI.

Production Operations: Establish monitoring, observability, and deployment practices for video generation workloads. Ensure systems are reliable, debuggable, and ready for customer-facing use.

Qualifications:

5+ years of software engineering experience, with a focus on backend systems or ML infrastructure

Strong fundamentals in Python and at least one systems language (Go, Rust, C++)

Experience building and operating production APIs and serving systems

Familiarity with ML inference pipelines (model serving, GPU workloads, batching strategies)

Comfort working in fast-moving, ambiguous environments where you define the roadmap

Strong product instincts—you think about how software will be used, not just how it works

Nice to Have:

Experience with video processing pipelines or media infrastructure

Background building agentic systems or multi-step AI workflows

Familiarity with diffusion models or generative AI systems

Experience with Kubernetes, Ray, or other orchestration frameworks

Track record shipping developer tools or APIs

AI Research Engineer

EnCharge AI · Canada, Germany, Norway, United States

Software · Canada · Germany · Norway · Remote - US · Posted Mar 18, 2026

About the Role

EnCharge AI is looking for an experienced AI Research Engineer to optimize deep learning models for deployment on edge AI platforms. You will work on model compression, quantization strategies, and efficient inference techniques to improve the performance of AI workloads.

Responsibilities

Research and develop quantization-aware training (QAT) and post-training quantization (PTQ) techniques for deep learning models.
Implement low-bit precision optimizations (e.g., INT8, BF16).
Design and optimize efficient inference algorithms for AI workloads, focusing on latency, memory footprint, and power efficiency.
Work with frameworks such as PyTorch, ONNX Runtime, and TVM to deploy optimized models.
Analyze accuracy trade-offs and develop calibration techniques to mitigate precision loss in quantized models.
Collaborate with hardware engineers to optimize model execution for edge devices and NPUs.
Contribute to research on knowledge distillation, sparsity, pruning, and model compression techniques.
Benchmark performance across different hardware and software stacks.
Stay updated with the latest advancements in AI efficiency, model compression, and hardware acceleration.

Qualifications

Master’s or Ph.D. in Computer Science, Electrical Engineering, or a related field.
Strong expertise in deep learning, model optimization, and numerical precision analysis.
Hands-on experience with model quantization techniques (QAT, PTQ, mixed precision).
Proficiency in Python, C++, CUDA, or OpenCL for performance optimization.
Experience with AI frameworks: PyTorch, TensorFlow, ONNX Runtime, TVM, TensorRT, or OpenVINO.
Understanding of low-level hardware acceleration (e.g., SIMD, AVX, Tensor Cores, VNNI).
Familiarity with compiler optimizations for ML workloads (e.g., XLA, MLIR, LLVM).

EnCharge AI is an equal employment opportunity employer in the United States.

Device Driver Engineer

EnCharge AI · U.S., Canada, Germany, Norway

Software · Canada · Germany · Norway · Remote - US · Posted Jan 27, 2026

About the Role

EnCharge AI is seeking a highly skilled Device Driver Engineer to design and implement a high-performance driver stack for our cutting-edge AI accelerator hardware. In this role, you will work closely with hardware, firmware, and AI software teams to develop low-latency, high-bandwidth communication between the host system and the AI accelerator.

Responsibilities

Develop, optimize, and maintain Linux/Windows PCIe device drivers for AI accelerators.
Implement low-level hardware interactions, DMA, memory management, and interrupt handling.
Work on driver optimizations to reduce latency and improve throughput for AI workloads.
Debug and troubleshoot PCIe protocol, kernel panics, crashes, and performance bottlenecks.
Collaborate with hardware, firmware, and AI software teams to define driver interfaces.
Ensure compliance with PCIe standards (Gen4/Gen5), SR-IOV, BAR memory mapping, and IOMMU.
Support virtualization (VFIO, SR-IOV, DPUs) and containerized environments (Kubernetes, Docker, etc.).
Develop tools for profiling, debugging, and monitoring driver performance.
Contribute to open-source kernel modules if applicable.

Qualifications

Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or a related field.
3+ years of experience in device driver development for Linux and/or Windows.
Strong experience with PCIe-based hardware, including BAR regions, DMA, interrupts, and MMIO.
Proficiency in C/C++ and kernel-mode programming (Linux Kernel, Windows WDDM/WDF/MCDM).
Experience with AI-specific accelerators (e.g., GPUs, NPUs, TPUs) is a plus.
Knowledge of low-level debugging tools (gdb, perf, ftrace, dmesg, PCIe analyzers).
Understanding of multi-threading, synchronization, and memory management in kernel space.
Familiarity with high-performance AI/ML workloads is a plus.
Experience in hypervisor interactions, VFIO, and passthrough solutions.
Knowledge of secure boot, firmware updates, and trusted execution environments (TEE).

EnCharge AI is an equal employment opportunity employer in the United States.

AI Runtime Engineer

EnCharge AI · U.S., Canada, Germany, Norway

Software · Canada · Germany · Norway · Remote - US · Posted Jan 27, 2026

About the Role

EnCharge AI is seeking an AI Runtime Engineer to develop and optimize the execution stack for our next-generation AI accelerator. In this role, you will work on low-latency, high-performance runtime software that enables efficient execution of deep learning models on specialized hardware. You will collaborate with hardware, compiler, and AI framework teams to deliver optimized AI inference and training performance across cloud and edge environments.

Responsibilities

Develop and optimize the AI runtime software stack for executing deep learning workloads on AI accelerators.
Implement task scheduling, memory management, and kernel execution strategies for efficient computation.
Optimize data movement between host and device using PCIe, DMA, and shared memory.
Design and implement high-performance APIs for AI inference frameworks such as OpenVINO, ONNX Runtime, and vLLM.
Work on graph execution optimizations, including kernel fusion, pipelining, tensor tiling, and caching.
Integrate runtime components with AI compilers (LLVM, MLIR, XLA, TVM) for optimized execution.
Ensure scalability and reliability of the AI runtime for cloud-based and edge AI deployments.

Qualifications

Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or a related field.
3+ years of experience in developing low-level runtime software for AI accelerators, GPUs, or HPC systems.
Strong proficiency in C/C++ and low-level systems programming.
Deep understanding of task scheduling, concurrency, and memory hierarchy.
Experience with hardware-aware optimizations and dataflow architectures.
Familiarity with deep learning execution frameworks (ONNX Runtime, TensorRT, TVM, OpenVINO).
Experience with low-latency, high-throughput workload execution for AI models.
Strong debugging and profiling skills for optimizing AI execution performance.
Exposure to AI model deployment pipelines (Triton, TensorFlow Serving).

EnCharge AI is an equal employment opportunity employer in the United States.

LLM Inference Deployment Engineer

EnCharge AI · U.S., Canada, Germany, Norway

Software · Canada · Germany · Norway · Remote - US · Posted Jan 27, 2026

About the Role

EnCharge AI is seeking an LLM Inference Deployment Engineer to optimize, deploy, and scale large language models (LLMs) for high-performance inference on its energy-efficient AI accelerators. You will work at the intersection of AI frameworks, model optimization, and runtime execution to ensure efficient model execution and low-latency AI inference.

Responsibilities

Deploy and optimize trained LLMs (GPT, LLaMA, Mistral, Falcon, etc.) sourced from libraries such as Hugging Face.
Utilize inference runtimes such as ONNX Runtime and vLLM for efficient execution.
Optimize batching, caching, and tensor parallelism to improve LLM scalability in real-time applications.
Develop and maintain high-performance inference pipelines using Docker, Kubernetes, and inference servers.

Qualifications

Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or related field.
Experience in LLM inference deployment, model optimization, and runtime engineering.
Strong expertise in LLM inference frameworks (PyTorch, ONNX Runtime, vLLM, TensorRT-LLM, DeepSpeed).
In-depth knowledge of the Python programming language for model integration and performance tuning.
Strong understanding of high-level model representations and experience implementing framework-level optimizations for generative AI use cases.
Experience with containerized AI deployments (Docker, Kubernetes, Triton Inference Server, TensorFlow Serving, TorchServe).
Strong knowledge of LLM memory optimization strategies for long-context applications.
Experience with real-time LLM applications (chatbots, code generation, retrieval-augmented generation).

EnCharge AI is an equal employment opportunity employer in the United States.

Embedded SW Engineer

EnCharge AI · U.S., Canada, Germany, Norway

Software · Canada · Germany · Norway · Remote - US · Posted Jan 27, 2026

About the Role

EnCharge AI is looking for an Embedded SW Engineer to develop the firmware for our Edge AI processors. The candidate must possess an excellent understanding of computer architecture and operating system concepts including, but not limited to, memory management, virtualization, and PCIe address space. The role includes designing and developing the core firmware for various parts of the SoC. The candidate must possess strong communication skills to interface with the Runtime, Architecture, and H/W teams.

Responsibilities

Develop the critical pieces of EAI Firmware used to deploy inference jobs on EAI processors
Validate different IP blocks on the SoC
Evaluate and integrate third-party device drivers to interface with EnCharge’s SW stack
Work closely with the Runtime, Hardware, and Architecture teams to define the driver architecture

Qualifications

Bachelor’s degree in EE/CS
Advanced programming skills in C/C++ for operating system kernel & systems development
Understanding of RISC-V architecture is a plus
Exposure to PCIe BAR and IOMMU architecture
Exposure to virtualization and hypervisor technologies
Deep understanding of operating systems concepts, data structures, x86-64 and accelerator architectures
Experience with low-level debug tools as well as emulators and simulators
Experience running, analyzing, and tuning system performance benchmarks
Excellent verbal and written communication skills

EnCharge AI is an equal employment opportunity employer in the United States.
