Hadoop Data Engineer

Qode

Pennsylvania, PA

JOB DETAILS
SKILLS
Agile Programming Methodologies, Amazon Web Services (AWS), Analysis Skills, Apache HBase, Apache Hadoop, Apache Hive, Apache Spark, Apache Sqoop, Banking Operations, Banking Regulations, Banking Services, Big Data, Cloud Computing, Communication Skills, Computer Science, Data Analysis, Data Lake, Data Management, Data Processing, Data Quality, Data Science, Data Warehousing, Database Extract Transform and Load (ETL), Ecosystems, Electronic Medical Records, Financial Regulations, Financial Services, HDFS (Hadoop Distributed File System), Identify Issues, Informatica, Information Technology & Information Systems, Information/Data Security (InfoSec), Java, Large-Scale Systems, Linux Operating System, MapReduce, Microsoft Windows Azure, NoSQL, PCI, Performance Tuning/Optimization, Problem Solving Skills, Python Programming/Scripting Language, Risk, Risk Management, SQL Databases, Sarbanes-Oxley Act (SOX), Scala Programming Language, Scalable System Development, Scrum Project Management and Software Development, Structured Data, Team Player, Unix Operating Systems, Unstructured Data
POSTED
5 days ago

The Hadoop Data Engineer designs, develops, and maintains large-scale data processing systems within a distributed Hadoop ecosystem. The role focuses on enabling data-driven decision-making across banking operations, risk management, compliance, and customer analytics.

Key Responsibilities

  • Design, develop, and maintain scalable data pipelines using Hadoop ecosystem tools (HDFS, Hive, Spark, Sqoop, Kafka).
  • Build and optimize ETL/ELT processes to support data ingestion from multiple banking systems.
  • Develop and manage big data solutions for structured and unstructured data.
  • Collaborate with data analysts, data scientists, and business stakeholders to deliver data solutions.
  • Ensure data quality, integrity, and governance aligned with banking and regulatory standards.
  • Tune and optimize the performance of Hadoop/Spark jobs.
  • Implement data security controls to comply with financial regulations (e.g., PCI, SOX).
  • Support real-time and batch data processing frameworks.
  • Troubleshoot production issues and provide continuous support for data platforms.
  • Work with cloud platforms (e.g., AWS, Azure) for modern data solutions.

Required Skills & Qualifications

Technical Skills

  • Strong experience with:
      • Hadoop ecosystem (HDFS, MapReduce, Hive, HBase)
      • Apache Spark (Scala/Python)
      • SQL & NoSQL databases
      • ETL tools (Informatica, Talend, or similar)
      • Kafka or other streaming tools
  • Proficiency in programming: Python / Java / Scala
  • Experience with:
      • Data warehousing concepts
      • Workflow orchestration tools (Airflow, Oozie)
      • Unix/Linux environments
  • Knowledge of cloud data platforms (AWS EMR, Azure Data Lake) is a plus

Domain Knowledge

  • Understanding of banking and financial services data
  • Exposure to risk, compliance, or fraud analytics is preferred

Soft Skills

  • Strong problem-solving and analytical abilities
  • Excellent communication and collaboration skills
  • Ability to work in Agile/Scrum environments

Education & Experience

  • Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field
  • Typically 5–10 years of experience in data engineering or big data development


About the Company

Qode