Google Cloud Data Architect & IAM Data Modernization

Retail Industry

Dallas, Texas

JOB DETAILS

SKILLS

Access Control, Apache, Apache Avro, Apache Hadoop, Apache Hive, Apache Pig, Apache Sqoop, Application Programming Interface (API), Automation, Big Data, Business Intelligence, Business Services, Business Transformation, Change Data Capture (CDC), Cloud Architecture, Cloud Computing, Cloud Storage, Communication Skills, Computer Programming, Computer Science, Consulting, Continuous Deployment/Delivery, Continuous Integration, Cost Control, Data Analysis, Data Formats, Data Lake, Data Management, Data Migration, Data Modeling, Data Processing, Data Quality, Data Sets, Data Warehousing, DataArchitect Data Modeling Tool, Database Design, Database Extract Transform and Load (ETL), DevOps, Distributed Computing, Ecosystems, Enterprise Application Integration (EAI), Environmental Management, Error Handling, File Systems, Google Cloud Platform (GCP), Git, HDFS (Hadoop Distributed File System), Identify Issues, Identity Data Management, Information Technology & Information Systems, Information Technology Consulting, Information/Data Security (InfoSec), Management Strategy, MapReduce, Metadata, Metrics, Performance Tuning/Optimization, Python Programming/Scripting Language, Query Optimization, Reporting Dashboards, SQL (Structured Query Language), Scalable System Development, Service Level Agreement (SLA), Software Administration, Software Development, Software Engineering, Source Code/Configuration Management (SCM), Sprint Planning, System Integration (SI), Testing, Trend Analysis, United States Citizen, Use Cases, Validation Testing, eBusiness

LOCATION

Dallas, Texas

POSTED

22 days ago

Role: Google Cloud Data Architect – IAM Data Modernization

Location: Dallas, TX / Charlotte, NC / Iselin, NJ / Chandler, AZ / Ohio, Delaware (Hybrid)

*Must be a US Citizen or Green Card (GC) holder

About Position:

Identity & Access Management (IAM) Data Modernization – migration of an on‑premises SQL data warehouse to a target‑state Data Lake on Google Cloud (GCP), enabling metrics and reporting, advanced analytics, and GenAI use cases (natural language querying, accelerated summarization, cross‑domain trend analysis). The program leverages PySpark‑based processing, cloud‑native DevOps CI/CD pipelines, and containerized deployments on OpenShift (OCP) to deliver scalable, secure, and high‑performance data solutions.

What You'll Do:

DevOps / CI‑CD

Experience implementing CI/CD pipelines for data and analytics workloads
Familiarity with Git‑based source control, build automation, and deployment strategies

Containers & Platform

Experience with OpenShift Container Platform (OCP) for deploying data workloads and services
Understanding of containerized architecture, scaling, and environment management
Proven ability to build CI/CD pipelines for data and infrastructure workloads
Experience managing secrets securely using GCP Secret Manager
Ownership of observability, SLOs, dashboards, alerts, and runbooks
Proficiency in logging, monitoring, and alerting for data pipelines and platform reliability

Big Data & Processing

Hands‑on experience with PySpark for ETL/ELT, data transformation, and performance optimization
Solid understanding of distributed data processing concepts

Data & Cloud Architecture

Strong experience designing data platforms on Google Cloud Platform (GCP)
Experience with Data Lakes, data warehousing, and large‑scale migration programs

Data Lake Architecture & Storage

Proven experience designing and implementing data lake architectures (e.g., Bronze/Silver/Gold or layered models)
Strong knowledge of Cloud Storage (GCS) design, including bucket layout, naming conventions, lifecycle policies, and access controls
Experience with Hadoop/HDFS architecture, distributed file systems, and data locality principles
Hands-on experience with data file formats (Parquet, ORC, Avro) and compression techniques
Expertise in partitioning strategies, backfills, and large-scale data organization
Ability to design data models optimized for analytics and BI consumption

Data Ingestion & Orchestration

· Experience building batch and streaming ingestion pipelines using GCP-native services

· Knowledge of Pub/Sub-based streaming architectures, event schema design, and versioning

· Strong understanding of incremental ingestion and CDC patterns, including idempotency and deduplication

· Hands-on experience with workflow orchestration tools (Cloud Composer / Airflow)

· Ability to design robust error handling, replay, and backfill mechanisms

Data Processing & Transformation

· Experience developing scalable batch and streaming pipelines using Dataflow (Apache Beam) and/or Spark (Dataproc)

· Strong proficiency in BigQuery SQL, including query optimization, partitioning, clustering, and cost control

· Hands-on experience with Hadoop MapReduce and ecosystem tools (Hive, Pig, Sqoop)

· Advanced Python programming skills for data engineering, including testing and maintainable code design

· Experience managing schema evolution while minimizing downstream impact

Analytics & Data Serving

· Expertise in BigQuery performance optimization and data serving patterns

· Experience building semantic layers and governed metrics for consistent analytics

· Familiarity with BI integration, access controls, and dashboard standards

· Understanding of data exposure patterns via views, APIs, or curated datasets

Data Governance, Quality & Metadata

· Experience implementing data catalogs, metadata management, and ownership models

· Understanding of data lineage for auditability and troubleshooting

· Strong focus on data quality frameworks, including validation, freshness checks, and alerting

· Experience defining and enforcing data contracts, schemas, and SLAs

Good to have

Security, Privacy & Compliance

· Hands-on experience implementing fine-grained access controls for BigQuery and GCS

· Experience with sprint planning and providing technical guidance to the team

· Strong stakeholder communication and solution‑architecture skills

Expertise You'll Bring:

Experience: 10–14+ years in DevOps and Data Architecture, 5+ years designing on PySpark/GCP/OCP at scale; prior on‑prem to cloud migration experience is a must.
Education: Bachelor’s/Master’s in Computer Science, Information Systems, or equivalent experience.
Certifications: Google Cloud Professional Cloud Architect/DevOps/OCP (required, or to be obtained within 3 months). Plus: Professional Data Engineer, Security Engineer.

Flexible work from home options available.

We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, or any other characteristic protected by law.

Vytwo Technologies is a global leader in enterprise application integration, delivering end-to-end IT consulting and business services for mid to large-scale organizations.

We offer a comprehensive suite of solutions including business and technology consulting, cloud, e-business, and digital transformation services, system integration, custom application development, re-engineering, and long-term application support.

About the Company

Retail Industry
