| Role Description: | Druid Data Modeling Schema DesignoDesign and implement efficient data schemas, dimensions, and metrics within Apache Druid for various analytical use cases (e.g., clickstream, IoT, application monitoring).oDetermine optimal partitioning, indexing (bitmap indexes), and rollup strategies to ensure sub-second query performance and efficient storage.Data Ingestion Pipeline DevelopmentoDevelop and manage real-time data ingestion pipelines into Druid from streaming sources like Apache Kafka, Amazon Kinesis, or other message queues.oImplement batch data ingestion processes from data lakes (e.g., HDFS, Amazon S3, Azure Blob, Google Cloud Storage) or other databases.oEnsure data quality, consistency, and exactly-once processing during ingestion.Query Optimization Performance TuningoWrite and optimize complex SQL queries (Druid SQL) for high-performance analytical workloads, including aggregations, filters, and time-series analysis.oAnalyze query plans and identify performance bottlenecks, implementing solutions such as segment optimization, query rewriting, or cluster configuration adjustments. |
|---|