Sr. Engineer

TekWissen LLC

Frisco, TX

JOB DETAILS
SALARY
$67.32
SKILLS
Agile Programming Methodologies, Amazon Web Services (AWS), Analysis Skills, Ansible, Apache Kafka, Application Programming Interface (API), Artificial Intelligence (AI), Artificial Intelligence (AI) Agents, Automation, Business Support, Cloud Computing, Computer Science, Computer Security, Continuous Deployment/Delivery, Continuous Improvement, Continuous Integration, Cost Control, Data Analysis, Data Management, DevOps, Disaster Recovery, Diversity, Docker, Documentation, GCP (Google Cloud Platform), High Availability, IDE (Integrated Development Environment), Identify Issues, Incident Management, Incident Response, Information Technology & Information Systems, Java, Jenkins, Machine Tool, Maintain Compliance, Mentoring, Metrics, Microservices, Microsoft Windows Azure, Microsoft Windows Server, Middleware, MySQL, Operating Systems, Operational Improvement, Oracle Data Integrator, Oracle Database, Oracle WebLogic Server, Privacy Regulations, Private Cloud, Production Systems, Public Cloud, Python Programming/Scripting Language, REST (Representational State Transfer), RabbitMQ, Reliability Engineering, Reporting Dashboards, Risk Management, Root Cause Analysis, SQL (Structured Query Language), Scripting (Scripting Languages), Snowflake Schema, Software Engineering, Software Patches, Splunk, Systems Engineering, Systems Maintenance, Systems Reliability, Systems Scalability, Technical Support, Unix Operating Systems, Unix Shell Programming, Use Cases, Windows Workflow Foundation (WF), Workforce Management, Writing Skills
LOCATION
Frisco, TX
POSTED
9 days ago
Overview:
TekWissen is a global workforce management provider headquartered in Ann Arbor, Michigan, that offers strategic talent solutions to our clients worldwide. Our client is a provider of digital technology and transformation, information technology, and services.
Position: Sr. Engineer
Location: Frisco, TX
Duration: 7 Months
Job Type: Temporary Assignment
Work Type: Onsite
Job Description:
Sr. Engineer, Systems Reliability - Privacy
About the Role:
  • The Senior Engineer, Systems Reliability (SRE) - Privacy ensures the stability, performance, and reliability of IT services and infrastructure.
  • This role combines software engineering and operations expertise to build and maintain highly available, scalable systems.
  • As a leader in DevOps and cloud reliability practices, the engineer supports continuous improvement of automation, deployment pipelines, observability, and incident management, while mentoring junior engineers and optimizing production workflows.
  • The position plays a critical part in enabling software to be delivered faster, better, and more reliably to support business and customer needs.
What You'll Do:
  • Build and maintain CI/CD pipelines for data engineering deployments using GitLab and Azure DevOps.
  • Design and maintain CI/CD pipelines and DevOps automation solutions for REST APIs and microservices.
  • Implement robust monitoring, alerting, and logging for data pipelines, Snowflake and Azure services.
  • Respond to production incidents, troubleshoot failures and restore services quickly.
  • Perform root cause analysis and implement preventive measures.
  • Ensure high availability and disaster recovery planning for critical data systems.
  • Tune SQL queries, Snowflake features and Databricks clusters for optimal performance and cost efficiency.
  • Automate operational tasks to improve deployment reliability and reduce manual intervention.
  • Manage secrets and credentials using Azure Key Vault and CyberArk.
  • Hands-on experience with Terraform, Helm, or Ansible for infrastructure provisioning.
  • Working knowledge of containerization (Docker) and Kubernetes orchestration.
  • Hands-on experience with cloud platforms (Azure; AWS or GCP).
  • Understanding of deployment strategies (blue/green, rolling, canary), GitOps, and artifact management.
  • Ensure compliance with data governance, privacy regulations and organizational security standards.
  • Work closely with data engineers, analysts and cloud teams to ensure smooth operations.
  • Maintain detailed runbooks, operational documentation and incident reports.
  • Perform regular OS patching on Unix and Windows servers to address security vulnerabilities and maintain system stability.
  • Apply critical and cumulative updates for middleware components such as Oracle Data Integrator (ODI), WebLogic and related software to mitigate risks and enhance performance.
  • Coordinate patching schedules with application and infrastructure teams to minimize downtime and ensure business continuity.
  • Use AI productivity tools daily (Claude and Cursor or similar IDE) across the SRE lifecycle, including pipeline development, scripting, runbook authoring, log analysis, and incident response.
  • Design, build, and operate AI agents to automate SRE tasks such as incident triage, root cause analysis, alert correlation, runbook execution, and patching workflows.
  • Apply foundation models, prompt engineering, and RAG patterns to operational use cases including, but not limited to, querying runbooks, summarizing incidents, and surfacing remediation guidance.
  • Implement audit logging, observability, and human-in-the-loop controls for AI agents and AI-assisted workflows operating in Tier-0 production environments.
  • Build and host AI agents, identify gaps and convert them into AI agent use cases, and implement solutions to further modernize the SRE platform.
What You'll Bring:
  • Bachelor's degree in Computer Science, Engineering, or equivalent practical experience.
  • 5-7 years of experience in systems reliability, software engineering, DevOps, or related technical roles.
  • Experience working in Agile and DevOps delivery environments.
  • Demonstrated ability to mentor engineers and influence technical outcomes.
  • Strong problem-solving skills with a systems-level perspective.
  • Strong automation and agentic AI skills.
  • Familiarity with foundation models, prompt engineering, retrieval-augmented generation (RAG), and AI agent development applied to SRE and operational use cases.
Must Have Skills:
  • CI/CD tooling and automation experience (GitLab, Azure DevOps, Jenkins).
  • Experience working in public or private cloud environments.
  • Proficiency in one or more programming or scripting languages (Python, Java, Shell, etc.).
  • Experience with monitoring, logging, and APM tools such as AppDynamics, Splunk, or equivalents
  • Strong understanding of system reliability concepts including scalability, performance, availability, and resilience
  • Strong experience in writing SQL queries, analyzing logs, and troubleshooting issues.
  • Databases: SQL (Oracle, MySQL, Snowflake).
  • Messaging: Kafka, RabbitMQ.
  • Hands-on experience with AI productivity tools (Claude and Cursor or similar IDE) and working knowledge of foundation models, prompt engineering, RAG, and AI agent development.
  • Experience with containerization and orchestration technologies such as Docker and Kubernetes
Nice to Have:
  • Experience migrating systems to cloud-native architectures
  • Familiarity with reliability metrics, service monitoring, or operational dashboards
  • Exposure to platform engineering or shared services environments
TekWissen Group is an equal opportunity employer supporting workforce diversity.

About the Company

TekWissen LLC

WE THE TEKWISSEN PEOPLE

TekWissen offers you a broader portfolio of services, industry-leading solutions, and the meaningful innovations that give you greater flexibility and speed to respond to market dynamics, reduced costs and risk to improve enterprise performance, and increased productivity to enable growth.

To keep pace with global market demands, TekWissen keeps its finger on the pulse of change. We take an organized approach to guiding a project from inception to closure. Managing projects is becoming more and more important as we enter the digital era. To cope with the pace this transition demands, a method is required to manage projects so that they yield quality work while making efficient use of time and resources.

Quality planning involves identifying which quality standards are relevant to the project and determining how to satisfy them.

Quality planning should be performed during the Planning Process, alongside the other project planning processes, because changes in quality will likely require changes in the other planning processes, and the desired product quality may require a detailed risk analysis of an identified problem. It is important to remember that quality should be planned, designed, and then built in, not added on after the fact.

Capabilities and accomplishments in one TekWissen business enhance the opportunity for success in the others. Put simply, TekWissen's unique combination of attributes promotes success.



COMPANY SIZE
100 to 499 employees
INDUSTRY
Computer/IT Services
FOUNDED
2009
WEBSITE
http://www.tekwissen.com/