Technical Architect-Datawarehousing

Tata Consultancy Services Ltd

Marlborough, MA

JOB DETAILS
SALARY
$120,000–$150,000 Per Year
SKILLS
Access Control, Amazon Simple Storage Service (S3), Amazon Web Services (AWS), Apache Spark, Best Practices, Caching, Cloud Computing, Code Reviews, Continuous Deployment/Delivery, Continuous Integration, Data Analysis, Data Management, Data Modeling, Data Quality, Data Science, Data Sets, Data Storage, Data Warehousing, Database Extract Transform and Load (ETL), Debugging Skills, Design Patterns Programming Methodologies, DevOps, Dimensional Modeling, Ecosystems, Engineering, GitHub, Google Cloud Platform (GCP), Information/Data Security (InfoSec), Maintain Compliance, Marketing, Microsoft Azure, Performance Tuning/Optimization, Python Programming/Scripting Language, SQL (Structured Query Language), Sales, Security Compliance, Team Player, Unity Catalog, eCommerce
LOCATION
Marlborough, MA
POSTED
30+ days ago

Databricks Architect

Must Have Technical/Functional Skills

Experience: 5+ years of hands-on data engineering experience, with at least 3 years focused on the Databricks/Spark ecosystem.

Databricks Expertise: Deep, hands-on expertise with the Databricks Lakehouse Platform, including Delta Lake, Structured Streaming, Delta Live Tables, and cluster configuration/optimization.

Programming Mastery: Expert-level proficiency in Python and PySpark. Advanced SQL skills are essential.

Data Warehousing Concepts: Strong understanding of data modeling principles, including dimensional modeling (Kimball), data warehousing concepts, and ETL/ELT design patterns.

Cloud Proficiency: Proven experience working with a major cloud provider (Azure, AWS, or GCP), particularly with data storage (such as S3) and related services.

Software Engineering Mindset: Experience with software engineering best practices, including version control (Git), code reviews, testing, and CI/CD.

Roles and Responsibilities

Data Pipeline Development: Design, code, and deploy robust and scalable batch and streaming data pipelines using PySpark, Spark SQL, and Delta Live Tables to ingest data from sources such as Point-of-Sale (POS), e-commerce platforms, loyalty systems, and marketing clouds.

Data Modeling and Transformation: Implement complex data transformations and business logic within the Medallion architecture (Bronze, Silver, Gold layers). Build and optimize the final "Gold" customer-dimension tables that will serve as the single source of truth.

Data Quality: Implement data quality frameworks and cleansing routines to ensure the accuracy and trustworthiness of the Customer 360 data.

Performance Optimization: Proactively monitor, debug, and tune Databricks jobs and Spark clusters for performance and cost-efficiency. Implement best practices for partitioning, caching, and data layout in Delta Lake.

Infrastructure as Code (IaC) & CI/CD: Work with DevOps teams to manage Databricks environments, clusters, and job deployments using tools like Terraform and AWS DevOps/GitHub Actions. Champion and implement CI/CD best practices for data pipelines.

Data Governance and Security: Implement data governance features within Databricks Unity Catalog, including data lineage tracking, access controls, and data masking to ensure compliance and security.

Collaboration: Partner closely with Functional Consultants, Data Scientists, and Analytics Engineers to understand their data requirements and deliver well-structured, consumption-ready datasets.

Education

Bachelor's degree

Salary Range: $120,000 - $150,000 a year

About the Company

Tata Consultancy Services Ltd