Multimodal Model Training and Inference Optimization Engineer

Beijing ByteDance Technology Co Ltd

San Jose, CA

JOB DETAILS
SKILLS
Artificial Intelligence (AI), Benchmarking, C++ Programming Language, CUDA (Compute Unified Device Architecture), Communication Skills, Computer Science, Conferences, Content Development, Cross-Functional, Data Modeling, Deep Learning, Electrical Engineering, Engineering, Image Manipulation, Performance Modeling, Performance Tuning/Optimization, Problem Solving Skills, Publications, Python Programming/Scripting Language, Software Engineering, Team Player, Video Editing
POSTED
30+ days ago

About the team

The Vision-Applied Research team focuses on applied research in Generative AI and CV/Multimodal Understanding, and delivers intelligent solutions to ByteDance products such as TikTok, CapCut, and Lemon8, enabling users to make and share creative content more easily. The team has research groups dedicated to generative models for content creation, image generation, video synthesis, intelligent image/video editing, and virtual humans.

We are seeking an experienced Multimodal Model Training and Inference Optimization Engineer with expertise in optimizing AI model training and inference, including distributed training/inference and acceleration. The ideal candidate will work at the cutting edge of AI efficiency, enhancing the performance, scalability, and deployment of large-scale generative AI models.

Responsibilities

  • Optimize large model training pipelines to improve efficiency, speed, and scalability.
  • Develop and improve distributed training strategies, such as data parallelism, model parallelism, pipeline parallelism, and communication optimization, to accelerate model training.
  • Benchmark and profile deep learning models to identify performance bottlenecks and optimize computational resources.

Minimum Qualifications

  • M.S. or Ph.D. in Computer Science, Electrical Engineering, Artificial Intelligence, or a related field.
  • Experience in AI model training optimization.
  • Strong software engineering skills, including proficiency in Python, C++, and CUDA.
  • Strong proficiency in deep learning frameworks such as PyTorch, Megatron, and DeepSpeed.
  • Experience with distributed training techniques such as data parallelism, model parallelism, and pipeline parallelism.
  • Knowledge of transformers and diffusion models.

Preferred Qualifications

  • Publications at conferences such as MLSys, NeurIPS, ICLR, or ICML.
  • Strong communication and teamwork skills.
  • Self-motivated, with strong problem-solving skills.
  • Ability to work collaboratively in multi-functional teams.
  • Experience implementing and optimizing complex, performance-critical systems.

About the Company

Beijing ByteDance Technology Co Ltd