Senior Data Engineer - INDIA

Vytwo

Dallas, TX

JOB DETAILS
SKILLS
Amazon Web Services (AWS), Analysis Skills, Apache, Application Integration, Architectural Analysis, Best Practices, Cataloguing, Cloud Computing, Code Reviews, Continuous Deployment/Delivery, Continuous Integration, Cross-Functional, Data Analysis, Data Lake, Data Management, Data Modeling, Data Processing, Data Quality, Data Science, Data Sets, Data Warehousing, Database Extract Transform and Load (ETL), Design Patterns Programming Methodologies, DevOps, Distributed Computing, Docker, Ecosystems, Environment Management, GCP (Google Cloud Platform), HL7 (Health Level 7), Healthcare, Identify Issues, Machine Learning, Metadata, Microsoft Product Family, Microsoft Azure, OLAP (OnLine Analytical Processing), Performance Tuning/Optimization, Problem Solving Skills, Production Control, Programming Languages, Python Programming/Scripting Language, SQL (Structured Query Language), Scalable System Development, Software Engineering, Source Code/Configuration Management (SCM), Stored Procedures, Technical Leadership, Technical/Engineering Design, Test Automation, Transaction Processing/Management, Work From Home
POSTED
30+ days ago
Role: Senior Data Engineer
Remote Work: INDIA
Location: Hyderabad / Noida, INDIA
*Only consultants local to INDIA are eligible.

*No visa sponsorship.


Primary Responsibilities:


  • Design, develop, and maintain scalable data pipelines using Python, PySpark, and other modern programming languages to support both batch and streaming workloads
  • Build and optimize data processing frameworks on cloud platforms such as Databricks or Snowflake, ensuring performance, reliability, and cost efficiency
  • Design and implement robust data models, including transactional (OLTP) and dimensional (OLAP) schemas, to support analytics, reporting, and application integration
  • Develop high-quality SQL code, including complex queries, stored procedures, and views, with a focus on performance tuning and efficient data access patterns
  • Create and manage workflow orchestration using Apache Airflow or similar tools, ensuring reliable scheduling, dependency management, and monitoring
  • Implement and enforce data governance and metadata standards through tools such as Microsoft Purview, including data lineage, classification, cataloging, and security policies
  • Build automated data quality and validation frameworks to ensure accuracy, completeness, and reliability of production datasets
  • Collaborate with cross-functional teams, including data architects, analysts, scientists, and business stakeholders, to understand requirements and deliver scalable, well-designed data solutions
  • Lead technical design sessions and code reviews, promoting engineering best practices, reusability, and maintainability
  • Support cloud infrastructure and DevOps practices, including CI/CD pipelines, version control, test automation, and environment management
  • Monitor and troubleshoot production data pipelines, proactively addressing issues, performance bottlenecks, and system failures
  • Contribute to the evolution of the enterprise data platform, recommending tools, frameworks, and architectures to improve scalability and efficiency

Required Qualifications:


  • 5+ years of experience in data engineering, software engineering, or similar disciplines
  • Hands-on experience with Databricks or Snowflake
  • Experience with orchestration tools such as Apache Airflow
  • Experience working with cloud ecosystems (Azure preferred; AWS/GCP acceptable)
  • Advanced SQL skills and experience with OLTP and OLAP data modeling
  • Solid understanding of modern data warehousing, data lake, and ELT/ETL design patterns
  • Familiarity with data governance tools, especially Microsoft Purview
  • Solid programming expertise in Python, PySpark, or similar languages

Preferred Qualifications:


  • Healthcare industry experience, including claims, clinical, FHIR, HL7, or provider data
  • Experience with containerization (Docker, Kubernetes) for data workloads
  • Experience supporting machine learning workflows or analytical data science pipelines
  • Knowledge of distributed computing concepts and performance tuning
