Research Engineer - Agent Training Infrastructure (Seed Infra)

Beijing ByteDance Technology Co Ltd

Seattle, WA

JOB DETAILS

SKILLS

Artificial Intelligence (AI), Benchmarking, Computer Science, Cross-Functional, Debugging Skills, Distributed Computing, Environmental Management, GPU (Graphics Processing Unit), Machine Learning, Mentoring, Open Source, Performance Management, Performance Tuning/Optimization, Prototyping, Publications, Python Programming/Scripting Language, Reinforcement Learning

LOCATION

Seattle, WA

POSTED

30+ days ago

About the Team

The Seed Infrastructures team oversees the distributed training, reinforcement learning framework, high-performance inference, and heterogeneous hardware compilation technologies for AI foundation models.

Responsibilities

Design, implement, and maintain agent execution environments and runtime frameworks for multi-agent training at scale
Build and optimize infrastructure for RLHF pipelines, reward modeling, and distributed RL training
Manage and orchestrate many-agent parallel execution, including environment simulations and environment managers
Collaborate closely with research teams to support the LLM training pipeline: training → SFT → RLHF → evaluation → serving
Ensure high-performance, scalable, and fault-tolerant distributed systems for agent frameworks
Develop tools and libraries to monitor, debug, and benchmark agent training and inference
Translate research prototypes into production-ready infrastructure that can support large-scale AI experiments

Minimum Qualifications

M.S. or Ph.D. in Computer Science, Machine Learning, or a related field
Strong experience with Python and distributed systems frameworks (e.g., Ray)
Hands-on experience building agent infrastructure: execution environment, runtime, or environment manager
Experience managing parallel multi-agent execution, including simulations and environment orchestration
Familiarity with the LLM pipeline (training → SFT → RLHF → evaluation → serving)
Proven ability to design and maintain high-performance, scalable, and robust distributed AI systems

Preferred Qualifications

Experience building or contributing to RLHF pipelines, reward modeling infrastructure, or RL training infrastructure
Strong understanding of multi-agent reinforcement learning and agent orchestration at scale
Familiarity with GPU clusters, distributed training strategies, and performance optimization
Publications or open-source contributions in agent systems, distributed RL, or AI infrastructure
Experience mentoring engineers and collaborating in cross-functional research and engineering teams

About the Company

Beijing ByteDance Technology Co Ltd
