This internship offers hands-on experience in model optimization for a unique general-purpose neural processing unit (NPU) architecture, with a focus on quantization and numerical accuracy testing.
Responsibilities include contributing to quantization workflows for vision and language models, building calibration datasets, and developing debugging tools.
Candidates should be current or recent students in CS, EE, or a related field, with strong Python skills, familiarity with machine learning models, and curiosity about low-level performance topics.
Prior experience with quantization, model compression, or embedded systems is preferred.
Benefits include mentorship, project ownership, and competitive pay. The role emphasizes collaboration, initiative, and innovation in edge computing technology.