Required Qualifications
- Bachelor’s degree in computer science, business, or a related analytical field, or equivalent experience
- 5+ years of experience in analysis, design, and coding
- 7+ years building scalable, production-grade data platforms and pipelines
- 5+ years Apache Spark / PySpark (batch processing, performance tuning, optimization)
- 5+ years ETL/ELT development, data modeling, and transformation patterns/frameworks
- 3+ years Azure Databricks (jobs/workflows, clusters, environment management)
- 2+ years data governance/catalog concepts in Databricks (Unity Catalog permissions/RBAC, auditing/lineage concepts)
- Strong Delta Lake / Lakehouse experience (Bronze/Silver/Gold, MERGE, schema evolution, OPTIMIZE/Z-ORDER basics)
- Strong SQL (complex queries, tuning for large datasets, reconciliation)
- Azure fundamentals for data engineering (ADLS Gen2, identity/service principals/managed identity, secrets/Key Vault)
- Hands-on experience building/operating Data Quality Engineering (DQE): validation rules, reconciliations, and automated quality gates in pipelines
Preferred Qualifications
- Experience with warehousing processes and systems
- Experience with infrastructure and systems concepts
- Experience performing data analysis by creating and executing queries, interpreting results, and communicating findings to technical and business audiences
- Experience working as a Scrum Master and applying the Agile Scrum framework to product development and delivery
- Experience utilizing Agile project management and code management tools (e.g., Azure DevOps or Jira)
- Experience managing third-party vendor relationships
- Experience with Spark declarative pipelines (Databricks declarative pipeline patterns; formerly DLT/Delta Live Tables), including expectations-style rules and incremental processing
- Experience implementing DQE patterns: completeness/accuracy checks, anomaly detection, and data observability/monitoring
- Experience with Auto Loader / incremental ingestion patterns and schema drift handling
- Streaming exposure (Structured Streaming), including reading from Kafka / Event Hubs in DLT pipelines
Publix Technology