AI Inference Engineer Intern - Model Pruning

Quadric, Inc · Software Engineering
📍 Burlingame, California, United States · Temporary · 💰 USD 45+

About this role

Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today, which can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.

Note: Our preference is for this internship to be based out of our Burlingame, California office. Candidates should be based in the Bay Area or able to relocate for the internship period and available to work on site.

Responsibilities:

  • Model pruning: prune the model to speed up inference, re-training it to maintain accuracy.

Qualifications:

  • MS student in CS or a related field.
  • Proficiency in Python.
  • Experience with model pruning and training in PyTorch.
  • Experience with quantization and vision model accuracy metrics.

At Quadric, we value Integrity, Humility, and Happiness. What we expect from one another is simple and clear: Initiative, Collaboration, and Completion. We are a collaborative team focused on building something extraordinary in the edge computing space. 

The hourly rate for this temporary internship position is $45.00/hour to $60.00/hour. The actual rate offered will depend on a number of factors, including the specific level of the role, years and depth of relevant experience and education, technical skills and competencies, and work location. 

Quadric interns receive hands-on experience working alongside industry experts in AI and semiconductor technology, with access to mentorship and meaningful project ownership from day one.

Founded in 2016 and based in downtown Burlingame, California, Quadric is building the world’s first supercomputer designed for the real-time needs of edge devices. Quadric aims to empower developers in every industry with superpowers to create tomorrow’s technology, today. The company was co-founded by technologists from MIT and Carnegie Mellon, who were previously the technical co-founders of the Bitcoin computing company 21.

Quadric is proud to be an equal opportunity employer. We are committed to creating an inclusive environment where people from all backgrounds can do their best work. We consider all qualified applicants without regard to race, color, religion, sex, gender identity or expression, sexual orientation, national origin, age, disability, veteran status, or any other protected characteristic under applicable law.

If this role resonates with you, we encourage you to apply even if your experience does not perfectly match every qualification. We value potential, curiosity, and a willingness to learn just as much as direct experience. Skills and growth come in many forms, and we would love to hear your story.

By submitting an application, you acknowledge that Quadric will collect and process your personal information as part of the hiring process. Please review our Privacy Policy to understand how we handle your data.

Frequently Asked Questions

What is the salary for the AI Inference Engineer Intern - Model Pruning role at Quadric, Inc?
The listed pay for this AI Inference Engineer Intern - Model Pruning position at Quadric, Inc is USD 45+ per hour; the posting gives an hourly range of $45.00 to $60.00. This is a temporary role.
Where is the AI Inference Engineer Intern - Model Pruning position at Quadric, Inc located?
This AI Inference Engineer Intern - Model Pruning role at Quadric, Inc is based in Burlingame, California, United States. The position is listed as on-site or hybrid. Check the full job description or apply directly to confirm the work arrangement.
Is the AI Inference Engineer Intern - Model Pruning role at Quadric, Inc full-time or part-time?
This is listed as a temporary position. It is posted as an AI Inference Engineer Intern - Model Pruning role in the Software Engineering department at Quadric, Inc.
Which team or department does the AI Inference Engineer Intern - Model Pruning at Quadric, Inc belong to?
This AI Inference Engineer Intern - Model Pruning position is part of the Software Engineering department at Quadric, Inc. See the full job description for more information about the team structure and responsibilities.
How do I apply for the AI Inference Engineer Intern - Model Pruning position at Quadric, Inc?
Click the "Apply Now" button on this page. You will be redirected to Quadric, Inc's official application portal, hosted on Workable, where you can submit your application directly.
When was the AI Inference Engineer Intern - Model Pruning job at Quadric, Inc posted?
This AI Inference Engineer Intern - Model Pruning position at Quadric, Inc was posted on May 22, 2026. Apply as soon as possible — early applications are often reviewed first.