Senior Data Engineer - INDIA

Vytwo

Dallas, TX

JOB DETAILS

SKILLS

Amazon Web Services (AWS), Analysis Skills, Apache, Application Integration, Architectural Analysis, Best Practices, Cataloguing, Cloud Computing, Code Reviews, Continuous Deployment/Delivery, Continuous Integration, Cross-Functional, Data Analysis, Data Lake, Data Management, Data Modeling, Data Processing, Data Quality, Data Science, Data Sets, Data Warehousing, Database Extract Transform and Load (ETL), Design Patterns Programming Methodologies, DevOps, Distributed Computing, Docker, Ecosystems, Environment Management, GCP (Google Cloud Platform), HL7 (Health Level 7), Healthcare, Identify Issues, Machine Learning, Metadata, Microsoft Product Family, Microsoft Azure, OLAP (OnLine Analytical Processing), Performance Tuning/Optimization, Problem Solving Skills, Production Control, Programming Languages, Python Programming/Scripting Language, SQL (Structured Query Language), Scalable System Development, Software Engineering, Source Code/Configuration Management (SCM), Stored Procedures, Technical Leadership, Technical/Engineering Design, Test Automation, Transaction Processing/Management, Work From Home

LOCATION

Dallas, TX

POSTED

30+ days ago

Role: Senior Data Engineer
Remote Work: INDIA
Location: Hyderabad / Noida, INDIA
*Only consultants local to India are eligible.
*No visa sponsorship.

Primary Responsibilities:

Design, develop, and maintain scalable data pipelines using Python, PySpark, and other modern programming languages to support both batch and streaming workloads
Build and optimize data processing frameworks on cloud platforms such as Databricks or Snowflake, ensuring performance, reliability, and cost efficiency
Design and implement robust data models, including transactional (OLTP) and dimensional (OLAP) schemas, to support analytics, reporting, and application integration
Develop high-quality SQL code, including complex queries, stored procedures, and views, with a focus on performance tuning and efficient data access patterns
Create and manage workflow orchestration using Apache Airflow or similar tools, ensuring reliable scheduling, dependency management, and monitoring
Implement and enforce data governance and metadata standards through tools such as Microsoft Purview, including data lineage, classification, cataloging, and security policies
Build automated data quality and validation frameworks to ensure accuracy, completeness, and reliability of production datasets
Collaborate with cross-functional teams, including data architects, analysts, scientists, and business stakeholders, to understand requirements and deliver scalable, well-designed data solutions
Lead technical design sessions and code reviews, promoting engineering best practices, reusability, and maintainability
Support cloud infrastructure and DevOps practices, including CI/CD pipelines, version control, test automation, and environment management
Monitor and troubleshoot production data pipelines, proactively addressing issues, performance bottlenecks, and system failures
Contribute to the evolution of the enterprise data platform, recommending tools, frameworks, and architectures to improve scalability and efficiency

Required Qualifications:

5+ years of experience in data engineering, software engineering, or similar disciplines
Hands-on experience with Databricks or Snowflake
Experience with orchestration tools such as Apache Airflow
Experience working with cloud ecosystems (Azure preferred; AWS/GCP acceptable)
Advanced SQL skills and experience with OLTP and OLAP data modeling
Solid understanding of modern data warehousing, data lake, and ELT/ETL design patterns
Familiarity with data governance tools, especially Microsoft Purview
Solid programming expertise in Python, PySpark, or similar languages

Preferred Qualifications:

Healthcare industry experience, including claims, clinical, FHIR, HL7, or provider data
Experience with containerization (Docker, Kubernetes) for data workloads
Experience supporting machine learning workflows or analytical data science pipelines
Knowledge of distributed computing concepts and performance tuning

About the Company

Vytwo
