Senior Software Engineer – Data Platform
San Francisco, CA (Hybrid 2 days on-site)
This is a hybrid position; candidates must be located in the San Francisco, CA area.
Job Overview
Our client is a mission-driven, global technology platform focused on connecting individuals, communities, and organizations at scale. The platform enables real-time interactions, trusted transactions, and data-driven decision-making across a rapidly growing ecosystem.
We are seeking a Senior Software Engineer – Data Platform to join a high-impact Data Engineering team responsible for building and scaling a next-generation, cloud-native data platform. This role will focus on developing event-driven, distributed systems that power analytics, operational workflows, and AI-driven use cases across the organization.
You will play a key role in designing foundational systems such as Master Data Management (MDM) and evolving a modern event-driven architecture, enabling scalable, reliable, and high-performance data solutions.
This is a hands-on, high-impact engineering role with strong influence on system design, architecture, and platform evolution.
The Role
- Build and scale core services for an event-driven, source-of-truth data platform on AWS
- Design and develop Kafka-based streaming pipelines, event producers, and consumers
- Architect systems where event streams serve as the system of record, enabling replay, recovery, and state reconstruction
- Define and implement patterns for idempotency, ordering, retries, and dead-letter queues (DLQs)
- Design for auditability, replayability, and fault recovery using scalable storage solutions (e.g., S3 with Iceberg-style architectures)
- Develop APIs and read-optimized data models (projections) to support downstream applications and services
- Build and evolve highly scalable, reliable distributed systems, balancing consistency, latency, and cost
- Establish and promote best practices for event-driven architecture and data modeling
- Develop and orchestrate streaming and ELT pipelines across databases, APIs, and event streams
- Contribute to data warehouse integrations (e.g., Snowflake or similar) and downstream data activation use cases
- Integrate AI/LLM capabilities into data workflows and internal data products
- Monitor, optimize, and improve system performance, cost efficiency, and reliability
- Collaborate cross-functionally with engineering, data, and product teams to deliver scalable solutions
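To give a feel for the patterns named above (idempotency, retries, and dead-letter queues), here is a minimal in-memory sketch; the event shape, retry limit, and balance-accumulation logic are illustrative assumptions, not the client's actual system, and a production consumer would sit behind Kafka rather than a Python list.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Event:
    event_id: str   # unique id, used to deduplicate redeliveries
    payload: dict

@dataclass
class Consumer:
    """Toy consumer: dedupes by event_id, retries on failure, then dead-letters."""
    max_retries: int = 3
    processed: set = field(default_factory=set)  # event ids already applied
    dlq: list = field(default_factory=list)      # events that exhausted retries
    state: dict = field(default_factory=dict)    # accumulated per-account balances

    def handle(self, event: Event) -> None:
        if event.event_id in self.processed:     # idempotency: duplicates are no-ops
            return
        for attempt in range(1, self.max_retries + 1):
            try:
                acct = event.payload["account"]  # malformed events raise KeyError
                self.state[acct] = self.state.get(acct, 0) + event.payload["amount"]
                self.processed.add(event.event_id)
                return
            except KeyError:
                if attempt == self.max_retries:  # retries exhausted: park on the DLQ
                    self.dlq.append(event)

c = Consumer()
c.handle(Event("e1", {"account": "a", "amount": 10}))
c.handle(Event("e1", {"account": "a", "amount": 10}))  # duplicate delivery, ignored
c.handle(Event("e2", {"bad": True}))                   # malformed, dead-lettered
```

The same dedupe-retry-DLQ shape carries over to a real Kafka consumer, where the processed-id set would live in durable storage so idempotency survives restarts.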
Required Qualifications
- 7+ years of experience in backend engineering or data engineering
- Strong programming skills in Java, Kotlin, Python, or Go
- Solid understanding of distributed systems and system design principles
- Hands-on experience with Kafka or similar event streaming platforms
- Deep understanding of:
  - Event-driven architecture and system design
  - Topic design, partitioning, and consumer group management
  - Idempotency, ordering guarantees, and delivery semantics
  - Replay, backfills, and failure recovery strategies
- Experience designing systems where event streams drive state and downstream projections
- Strong experience with AWS (e.g., MSK, S3, or equivalent services)
- Working knowledge of modern data warehouses (e.g., Snowflake or similar)
- Strong SQL skills and experience building ELT and streaming data pipelines
- Experience with relational databases such as PostgreSQL or MySQL
- Strong data modeling and performance optimization skills
- Ability to design systems from scratch and evolve them over time
- Familiarity with AI/LLM concepts (e.g., embeddings, RAG, prompt design)
- Strong ownership mindset with the ability to deliver end-to-end solutions
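One qualification above asks for experience with systems where event streams drive state and downstream projections. A minimal sketch of that idea, using an in-memory list as a stand-in for the durable log (event types and fields are hypothetical):

```python
# Append-only log as the system of record; projections are derived, disposable views.
log: list[tuple[str, str, int]] = []  # (event_type, user, value)

def append(event: tuple[str, str, int]) -> None:
    log.append(event)

def project_totals(events) -> dict:
    """Read-optimized projection: per-user running totals, rebuilt from the log."""
    totals: dict = {}
    for etype, user, value in events:
        if etype == "credit":
            totals[user] = totals.get(user, 0) + value
        elif etype == "debit":
            totals[user] = totals.get(user, 0) - value
    return totals

append(("credit", "u1", 100))
append(("debit", "u1", 30))
append(("credit", "u2", 50))

totals = project_totals(log)   # build the read model
rebuilt = project_totals(log)  # replaying the log reconstructs identical state
```

Because the projection is a pure function of the log, recovery and backfills reduce to replay; in practice the log would live in Kafka or S3/Iceberg rather than process memory.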
Preferred Qualifications
- Experience with Apache Iceberg or similar lakehouse technologies
- Familiarity with stream processing frameworks (e.g., Flink)
- Experience with Change Data Capture (CDC) tools (e.g., Debezium)
- Exposure to modern data stack tools (e.g., dbt, Reverse ETL platforms)
- Experience with infrastructure as code (e.g., Terraform) and container orchestration (e.g., Kubernetes/EKS)
- Experience with Master Data Management (MDM), identity systems, or data governance frameworks
- Familiarity with vector databases or AI-powered data systems
Why You'll Love This Opportunity
- High Impact: Build foundational systems that power analytics, operations, and AI at scale
- Cutting-Edge Technology: Work with modern event-driven architectures, streaming systems, and AI-enabled platforms
- Collaborative Culture: Partner with talented teams across engineering, data, and product
- Growth Opportunities: Expand your expertise in distributed systems, data platforms, and emerging technologies
- Competitive Compensation & Benefits: Attractive salary and comprehensive benefits package
- Flexible Work Model: Hybrid environment supporting work-life balance
What Sets This Role Apart
This is a unique opportunity to help build a modern, event-driven data platform from the ground up, shaping how data flows, scales, and powers critical business and AI-driven capabilities. You will work at the forefront of distributed systems, real-time data processing, and next-generation data architecture, with the ability to influence both technical direction and long-term platform strategy.