Quadric has created an innovative general-purpose neural processing unit (GPNPU) architecture. Quadric's co-optimized software and hardware are targeted to run neural network (NN) inference workloads in a wide variety of edge and endpoint devices, ranging from battery-operated smart-sensor systems to high-performance automotive or autonomous vehicle systems. Unlike other NPUs or neural network accelerators in the industry today that can only accelerate a portion of a machine learning graph, the Quadric GPNPU executes both NN graph code and conventional C++ DSP and control code.
As a Senior Performance Architect, you will be the critical link between software and hardware, responsible for understanding how code executes on Quadric's architecture and identifying opportunities for optimization. You will analyze workloads from high-level C++ and Python down through generated assembly to pinpoint performance bottlenecks. This is a hands-on role: beyond analysis, you will prototype solutions yourself, whether that means writing optimized code, modifying compiler passes, or building proof-of-concept implementations to validate proposed fixes before handing them off to the appropriate team for productization.
Responsibilities
Analyze application performance across the full stack: C++/Python source, compiler output, assembly, and hardware execution
Identify and localize performance bottlenecks to specific code regions, assembly sequences, or architectural limitations
Implement proof-of-concept fixes and optimizations to validate proposed solutions before broader rollout
Develop and maintain profiling infrastructure, benchmarks, and performance regression tests
Collaborate with compiler engineers to improve code generation and optimization passes
Work with hardware architects to identify microarchitectural improvements and validate performance models
Create performance models that predict workload behavior and guide optimization priorities
Document findings and communicate performance insights to both technical and non-technical stakeholders
Support customer engagements by analyzing their workloads and recommending optimizations
Work Requirements
This role requires regular work from the Quadric office in Burlingame, CA, at least 2-3 days per week, with some weeks requiring more onsite days based on business needs. Candidates must be able to commute to the office.