This role will provide expertise to support the development of a Big Data / Data Lake system architecture that supports enterprise data operations for the District of Columbia government, including the Internet of Things (IoT) / Smart City projects, enterprise data warehouse, the open data portal, and data science applications. This architecture includes an Databricks, Microsoft Azure platform tools (including Data Lake, Synapse), Apache platform tools (including Hadoop, Hive, Impala, Spark, Sedona, Airflow) and data pipeline/ETL development tools (including Streamsets, Apache NiFi, Azure Data Factory).