Senior Software Engineer / Researcher, AI-Native Database Systems

Beijing ByteDance Technology Co Ltd

San Jose, CA

JOB DETAILS
SKILLS
Amazon Web Services (AWS), Artificial Intelligence (AI), Artificial Intelligence (AI) Agents, C++ Programming Language, Computer Programming, Computer Science, Cost Estimates, Database Design, Database Optimization, Database Technology, Distributed Computing, Google Cloud Platform (GCP), Go Programming Language (Golang), Large-Scale Systems, Machine Learning, Memory Hardware, Microsoft Azure, Modeling Languages, Open Source, Product Support, Production Systems, Relational Databases (RDBMS), Research & Development (R&D), Research Skills, Rust Programming Language, Scalable System Development, Search Agent, Search Engines, Semantic Reasoner, Semantic Search, Software Engineering, Storage Architecture, System Integration (SI), Team Player, Unstructured Data
LOCATION
San Jose, CA
POSTED
30+ days ago

About the Team

Join ByteDance's database R&D team, where you'll build and own cutting-edge database products supporting ByteDance's global infrastructure. Our diverse portfolio includes relational databases, distributed caches, key-value stores, document databases, graph databases, wide-column stores, search engines, and multi-model databases. In this role, you'll have the opportunity to enhance these services in a cloud-native environment, embracing a culture of intellectual curiosity, self-direction, and problem-solving.

We are building next-generation AI-native database systems: intelligent, multimodal, and designed for the era of large models. Our systems are not just data stores; they're reasoning engines, retrieval platforms, and real-time memory for AI agents. As a Senior Software Engineer or Researcher, you will be at the forefront of rethinking how databases work when built from the ground up for AI workloads. You'll help create infrastructure that powers intelligent systems across TikTok, CapCut, and future applications that haven't been imagined yet.

Responsibilities

  • Architect and implement AI-native databases that seamlessly integrate structured, unstructured, and vectorized data.
  • Design storage engines optimized for embedding ingestion, multimodal retrieval, and real-time AI interaction.
  • Build scalable and distributed vector search systems with low-latency guarantees.
  • Develop AI-augmented query processors that leverage large language models (LLMs) for semantic parsing, intent understanding, and cost estimation.
  • Collaborate on developing retrieval-augmented generation (RAG) infrastructure and LLM agent memory backends.
  • Drive innovations in learned index structures, self-optimizing databases, and AI-integrated transaction systems.
  • Publish research and contribute to the broader research and open-source communities.

Minimum Qualifications:

  • Bachelor's, Master's, or Ph.D. in Computer Science or related fields with strong systems or AI research experience.
  • 2+ years of experience in core database systems, large-scale distributed infrastructure, or machine learning systems.
  • Strong coding and system-level design skills in C++, Rust, or Go.
  • Deep expertise in one or more of the following areas: storage engine architecture (LSM-trees, column stores, HTAP systems); vector retrieval systems, similarity search, and ANN indexing; AI infra or model-serving infrastructure (especially for embeddings, RAG, and LLMs); semantic search, agent systems, or AI-native memory frameworks.
  • Ability to collaborate across research, engineering, and product teams to translate ideas into production systems.

Preferred Qualifications:

  • Experience with open-source systems such as Faiss, Milvus, DuckDB, ClickHouse, TiKV, RocksDB.
  • Publications at top-tier conferences (e.g., SIGMOD, VLDB, NeurIPS, MLSys, ICDE).
  • Familiarity with the database and AI integration strategies of GCP, AWS, or Azure.
  • Prior contributions to RAG, memory-augmented models, or self-tuning database components.

About the Company

Beijing ByteDance Technology Co Ltd