Data Architect OCP (OpenShift) | IAM Data Modernization

PeopleNTech LLC

Alexandria, VA


JOB DETAILS

SALARY

$85–$90 Per Hour

SKILLS

Access Control, Agile Programming Methodologies, Apache Avro, Architectural Services, Automation, Best Practices, Big Data, CA Workload Automation AE (AutoSys Edition), Cataloguing, Change Data Capture (CDC), Cloud Architecture, Cloud Computing, Computer Science, Continuous Deployment/Delivery, Continuous Integration, Cost Control, Data Analysis, Data Formats, Data Lake, Data Management, Data Migration, Data Modeling, Data Partitioning, Data Processing, Data Quality, Data Warehousing, DataArchitect Data Modeling Tool, Database Extract Transform and Load (ETL), Design Patterns Programming Methodologies, DevOps, Ecosystems, Enterprise Architecture, Error Handling, GCP (Google Cloud Platform), Identity Data Management, Information Technology & Information Systems, Information/Data Security (InfoSec), Internet Security, Maintain Compliance, Mentoring, Metadata, Metrics, Migration Strategy, People Management, Performance Analysis, Performance Tuning/Optimization, Quality Assurance Methodology, Reporting Dashboards, Requirements Management, SQL (Structured Query Language), Security Auditing, Security Compliance, Standards Development, System Architecture, Team Lead/Manager, Use Cases

LOCATION

Alexandria, VA

POSTED

30+ days ago

Indent: PSL216927_1-26-1
Role: Data Architect – OCP (OpenShift) | IAM Data Modernization
Location: Dallas, TX / Charlotte, NC (Hybrid – 3 days office)
Rate: $85/hr to $90/hr
Project/Program
Identity & Access Management (IAM) Data Modernization
Migration of an on-premises SQL data warehouse to a modern enterprise Data Lake platform, enabling analytics and GenAI use cases. The platform leverages PySpark-based processing, CI/CD pipelines, and containerized deployments on OpenShift (OCP), with GCP as the preferred cloud platform, to deliver scalable, secure, and high-performance data solutions.
About Program/Project
The IAM Data Modernization program focuses on transforming legacy data platforms into a scalable and cloud-compatible architecture.
Key Highlights:

Integration Scope: 30+ source systems with multiple downstream integrations
Capabilities: Metrics, reporting, advanced analytics, and GenAI use cases (natural-language querying, summarisation, cross-domain insights)
Benefits:
- Scalable and resilient data platform
- High-performance semantic and analytics layer
- Single source of truth for enterprise-wide reporting and analytics

Role Summary
We are looking for a Data Architect with strong expertise in OpenShift (OCP), PySpark, and CI/CD pipelines to design and govern scalable data platforms.
The role requires defining end-to-end data architecture, containerised deployment patterns, orchestration strategies (Airflow/Autosys), and platform standards, along with hands-on involvement in implementation.
Key Responsibilities
Data Architecture & Platform Design

Define enterprise data architecture for IAM data lake and analytics platform
Design scalable, modular, and containerised data pipeline architectures on OCP
Establish data models, schema governance, and data lifecycle strategies
Define best practices for data partitioning, performance optimisation, and cost efficiency

OpenShift (OCP) & Platform Engineering

Architect and govern containerised data workloads on OpenShift (OCP)
Define standards for deployment, scaling, and workload isolation
Collaborate with DevOps teams for platform engineering and infrastructure alignment

Big Data & Processing (PySpark Focus)

Define architecture for PySpark-based batch and near real-time processing pipelines
Provide guidance on distributed processing design, optimisation, and performance tuning
Establish reusable frameworks for ETL/ELT processing

Data Ingestion & Orchestration

Architect data ingestion frameworks (batch, streaming, CDC)
Define orchestration strategies using Airflow / Autosys
Implement standards for retries, backfills, dependency management, and error handling

DevOps / CI-CD

Define and oversee CI/CD strategy for data and platform deployments
Enable automation of build, test, and deployment processes
Ensure integration of CI/CD pipelines with OCP-based environments

Cloud & Data Platforms (Preferred)

Provide architecture guidance for GCP-based data platforms (preferred, not mandatory)
Define integration patterns for hybrid environments spanning cloud-native and on-premises systems
Guide teams on cloud migration strategies and modern data platform adoption

Data Governance, Quality & Observability

Define frameworks for:
- Data quality, validation, and lineage
- Metadata management and cataloguing
Establish monitoring, logging, alerting, and SLOs for platform reliability
Ensure compliance with data security and audit requirements

Stakeholder Collaboration

Work closely with client architects, IAM teams, and business stakeholders
Translate business requirements into scalable technical architecture
Provide architectural guidance and mentorship to engineering teams

Required Skills
Core Skills (Must Have)

Strong experience in:
- OpenShift (OCP) / Kubernetes-based platforms
- PySpark / Spark ecosystem
- CI/CD implementation for data platforms
- Airflow / Autosys orchestration tools
Solid understanding of:
- Data lake architectures (layered models)
- ETL/ELT design patterns
- Distributed data processing concepts

Data Engineering & Storage

Expertise in:
- Data formats: Parquet, ORC, Avro
- Partitioning and performance tuning
- Large-scale data modelling for analytics

Cloud (Preferred – Not Mandatory)

Experience with Google Cloud Platform (GCP) (preferred)
Exposure to services such as BigQuery, Dataproc, Dataflow, and GCS is a plus

Observability & Reliability

Experience defining:
- Monitoring, logging, alerting frameworks
- Dashboards, SLOs, and operational runbooks

Good to Have

Experience with IAM domain / cybersecurity data
Understanding of data security and access control frameworks
Exposure to GenAI-enabled data platforms
Experience in Agile delivery and team leadership

Qualifications

Experience:
- 10–14+ years in Data Architecture / Data Engineering
- Strong experience in OCP, PySpark, CI/CD, and orchestration frameworks
- Prior experience in data modernisation / migration programs
Education:
- Bachelor's/Master's in Computer Science, Information Systems, or equivalent
Certifications (Preferred):
- OpenShift / Kubernetes certifications
- GCP certifications (preferred, not mandatory)


