DevOps & Site Reliability Lead-Integration ETL

Q1 Technologies, Inc

Deerfield, IL

JOB DETAILS
SKILLS
Apache Hadoop, Apache Spark, Automation, Big Data, Capacity and Performance Management, Change Management, Communication Skills, Continuous Improvement, Cross-Functional, Customer Relations, Data Management, Data Processing, Database Extract Transform and Load (ETL), DevOps, Documentation, Ecosystems, High Availability, ITIL (IT Infrastructure Library), Identify Issues, Incident Management, Incident Response, Offshoring, On Call, Operational Support, Organizational Skills, Performance Management, Performance Tuning/Optimization, Problem Solving Skills, Production Support, Reliability Engineering, Root Cause Analysis, Time Management
LOCATION
Deerfield, IL
POSTED
8 days ago
Job Title: DevOps & Site Reliability Lead-Integration ETL
Duration: Full-time, Permanent
Location: Deerfield, IL 60015 (Onsite from Day 1)
Job Description:
Must Have Technical/Functional Skills
  • We are seeking a Site Reliability Engineer (SRE) with strong expertise in Talend and Big Data platforms to support and operate large-scale data processing environments.
  • The role requires close collaboration with customers, application teams, and offshore delivery teams to ensure platform reliability, incident management, and operational excellence. Experience with Databricks is a strong plus.
Key Responsibilities
  • Act as an SRE for Big Data and ETL platforms, ensuring high availability, performance, and reliability of data pipelines and applications.
  • Provide operational support and Major Incident Management (MIM), including triage, root cause analysis, and resolution of production issues.
  • Serve as a primary point of contact for customers, providing timely updates, issue resolution, and operational insights.
  • Collaborate closely with application teams to support ETL jobs, data processing workflows, and platform enhancements.
  • Coordinate with offshore teams for day-to-day operations, incident resolution, and continuous improvement initiatives.
  • Monitor, troubleshoot, and optimize Talend, Hadoop, Spark, and Big Data ecosystems.
  • Implement and support monitoring, alerting, runbooks, and automation to improve platform stability and reduce manual effort.
  • Participate in problem management, change management, and post-incident reviews to drive preventive measures.
  • Support capacity planning, performance tuning, and reliability improvements across the data landscape.
Required Skills & Qualifications
  • Strong hands-on experience with Talend (development, support, and troubleshooting).
  • Solid understanding of Big Data technologies, including:
      o Hadoop ecosystem
      o Apache Spark
  • Proven experience handling Major Incident Management (MIM) and production support in a 24x7 or on-call environment.
  • Experience working directly with customers, business stakeholders, and cross-functional teams.
  • Strong coordination skills to manage and guide offshore teams.
  • Knowledge of ITIL processes, especially Incident, Problem, and Change Management.
  • Excellent communication, documentation, and stakeholder management skills.

About the Company

Q1 Technologies, Inc

Q1 is a team of experienced, recognized experts who respond to market demand by providing professional services to clients, including enterprise software implementations, application integration, and technical and functional support.

Q1 has steadily grown into a quality IT services and solutions organization whose team members average over 10 years of experience. We have consistently met or exceeded client expectations by delivering professional services and project implementations on time and under budget, helping clients realize a return on investment.

COMPANY SIZE
500 to 999 employees
INDUSTRY
Computer/IT Services
FOUNDED
1990
WEBSITE
http://q1tech.com/