Software Engineer – RL Environments — AfterQuery
Location: San Francisco, CA (Onsite)
Compensation: $180,000 – $220,000 base | ~$500,000 total cash + equity
About AfterQuery
AfterQuery is an AI infrastructure company building training data and evaluation systems for frontier AI labs. The company works directly with leading labs to improve model performance through datasets and experimentation. It has raised $30M at a ~$300M valuation. The founding team comes from Jane Street, Citadel, Google, Goldman Sachs, and Stanford AI Lab.
About the Role
This is a high-impact engineering role focused on building the datasets, evaluation systems, and reward frameworks that directly influence how frontier AI models are trained. You will operate at the intersection of software engineering, data pipelines, and reinforcement learning environments. Your output will directly impact model capability, alignment, and performance across real-world domains.
What You'll Own
- Design data slices that expose meaningful model failure modes across domains including finance, code, and enterprise workflows
- Build and refine evaluation rubrics and reward signals for RLHF and RLVR pipelines
- Run experiments to analyze model behavior and improve capabilities
- Develop frameworks to measure dataset quality, diversity, and downstream impact
- Build and manage real-world and synthetic data pipelines
- Work directly with research teams at leading AI labs — translating training objectives into concrete data and evaluation systems
Requirements
- 1–4 years of software engineering experience with strong technical depth
- Strong interest in how data structure and quality influence model behavior
- Ability to design experiments and extract insights from imperfect data
- Experience building and shipping production systems
- Comfort working across domains including finance, engineering, and policy
Nice to Have
- Experience with RL environment companies, AI safety, or benchmarking organizations
- Experience building data pipelines and working with ML infrastructure
- Familiarity with RLHF or RLVR training pipelines
- Experience at a startup or as an early engineer
This Role Is NOT For
- Pure research profiles without engineering output
- Those who prefer traditional product engineering work
- Candidates unable to operate in ambiguous environments
Logistics
- Role is fully onsite in San Francisco — please only apply if you can commit to this
- Multiple openings; actively hiring
Shortlisted candidates will be contacted by David Joseph & Co., the recruiting partner managing this search on behalf of AfterQuery.