
Staff Software System Design Engineer - AI Compiler

Advanced Micro Devices, Inc.
$134,400.00/Yr.-$201,600.00/Yr.
United States, Texas, Austin
7171 Southwest Parkway
Oct 29, 2025


WHAT YOU DO AT AMD CHANGES EVERYTHING

At AMD, our mission is to build great products that accelerate next-generation computing experiences, from AI and data centers to PCs, gaming, and embedded systems. Grounded in a culture of innovation and collaboration, we believe real progress comes from bold ideas, human ingenuity, and a shared passion to create something extraordinary. When you join AMD, you'll discover the real differentiator is our culture. We push the limits of innovation to solve the world's most important challenges, striving for execution excellence while being direct, humble, collaborative, and inclusive of diverse perspectives. Join us as we shape the future of AI and beyond. Together, we advance your career.

THE ROLE:

We are looking for a dynamic, energetic Lead Compiler Engineer to join our growing team in the AI group. In this role, you will be responsible for designing, developing, and optimizing the frontend compiler for the latest neural networks on AMD's XDNA Neural Processing Units, which power cutting-edge generative AI models such as Stable Diffusion, SDXL-Turbo, and Llama2. Your work will directly impact the efficiency, scalability, and reliability of our ML applications. If you thrive in a fast-paced environment and love working on cutting-edge machine learning inference applications, this role is for you.

KEY RESPONSIBILITIES:

  • Design and implement an NPU compiler framework for neural networks.
  • Develop hardware-aware graph optimizations for high-level ML frameworks like ONNX.
  • Research new operator-scheduling algorithms for efficient inference of the latest NN models.
  • Interface with the ONNX / PyTorch runtime and the lower-level hardware implementation.
  • Contribute to high-performance inference for GenAI workloads such as Llama2-7B, Stable Diffusion, and SDXL-Turbo.
  • Work closely with kernel developers, performance architects, and AI researchers.
  • Manage CPU and memory resources effectively during model execution.
  • Handle resource allocation for ML deployments across different tenants.
  • Research heterogeneous mapping of ML operators for maximum efficiency.
  • Build tools to track resource utilization, bottlenecks, and anomalies.
  • Enable detailed profiling and debugging tools for analyzing ML workload latency.
  • Implement rigorous code review practices for superior code quality assurance.
  • Adopt incremental development methodologies for tackling complex projects effectively.
  • Foster cross-functional collaboration to address intricate challenges and drive innovation.

QUALIFICATIONS:

  • Strong programming skills in C++ and Python.
  • Experience with proprietary or open-source compiler stacks such as TVM and MLIR.
  • Experience with ML frameworks (e.g., ONNX, PyTorch) is required.
  • Experience with ML models such as CNNs, LSTMs, LLMs, and diffusion models is a must.
  • Experience with ONNX or PyTorch runtime integration is a bonus.
  • Excellent problem-solving abilities and a passion for performance optimization.

ACADEMIC CREDENTIALS:

Master's or PhD degree in Computer Science, Electrical Engineering, or a related field.

#LI-TC1

#LI-HYBRID

Benefits offered are described in AMD benefits at a glance.

AMD does not accept unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services. AMD and its subsidiaries are equal opportunity, inclusive employers and will consider all applicants without regard to age, ancestry, color, marital status, medical condition, mental or physical disability, national origin, race, religion, political and/or third-party affiliation, sex, pregnancy, sexual orientation, gender identity, military or veteran status, or any other characteristic protected by law. We encourage applications from all qualified candidates and will accommodate applicants' needs under the respective laws throughout all stages of the recruitment and selection process.
