Sr. SRE / DevOps Engineer - Sunnyvale, CA (Local candidates only)

Donato Technologies Inc

Sunnyvale, CA

JOB DETAILS
SKILLS
Amazon Web Services (AWS), Analysis Skills, Ansible, Apache, Architectural Services, Automation, Budgeting, Communication Skills, Continuous Deployment/Delivery, Continuous Improvement, Continuous Integration, Copy Editing, Cross-Functional, DevOps, Docker, Emergency Response, Git, Identify Issues, Java, Linux Operating System, Mail Processing, Problem Solving Skills, Python Programming/Scripting Language, Reliability Engineering, Risk, Risk Analysis, Risk Management, Scripting (Scripting Languages), Snowflake Schema, Splunk, Strategic Planning, Systems Administration/Management, Systems Reliability, Team Player, Technical Recruiting
LOCATION
Sunnyvale, CA
POSTED
30+ days ago
Greetings from Donato Technologies Inc.
We have an immediate opening with my client. If you are looking for a new project, please send me a copy of your updated resume.

Title: Sr. SRE / DevOps Engineer
Location: Sunnyvale, CA (Local candidates only)
Client Interview – In-Person


Job Summary –


We are looking for a Sr. SRE / DevOps Engineer based in Sunnyvale, California.


As a Site Reliability Engineer, the individual will work closely with cross-functional teams to automate operations, optimize infrastructure, implement security measures, and solve issues in an exciting, fast-paced environment. The individual will play a vital role in ensuring that the systems are reliable, scalable, and high-performing.


Responsibilities –


•              Ensure system reliability and availability – monitor systems, create strategies to detect and address issues, design automated troubleshooting systems, and write and review post-mortems.


•              Mitigate operational risks – collaborate with development teams and other stakeholders to identify potential risks, perform risk assessments, implement risk mitigation strategies, and continuously monitor and review the effectiveness of those strategies.


•              Monitor system health.


•              Minimize emergency response time (MTTR).


•              Maintain CI/CD pipelines.


•              Drive continuous improvement by collaborating with various teams.


•              Automate processes.


Required experience and skills:


•              8+ years of experience in DevOps and Site Reliability Engineering.


•              Hands-on experience with containerization and orchestration: Docker, Kubernetes/EKS.


•              Proficiency in infrastructure as code tools: Terraform, Ansible, or CloudFormation.


•              Experience setting up and managing services running on Kubernetes.


•              In-depth understanding of SRE principles including monitoring, alerting, error budgets, fault analysis, and automation.


•              In-depth knowledge of monitoring and observability tools: Splunk.


•              Knowledge of Linux operating system principles, networking fundamentals, and systems management.


•              Demonstrable fluency in at least one of the following languages: Java or Python.


•              Ability to identify and communicate technical and architectural problems while working with partners and their teams to find solutions iteratively.


•              Experience building and managing CI/CD pipelines: gatekeeping production deployments; developing and implementing Git branching strategies, branch protection rules, and network policies; and scaling workloads up and down on AWS.


•              Strong problem-solving and analytical skills.


•              Ability to solve performance and scalability issues in the system.


Technical Skills:


•              DevOps and SRE


•              AWS, Kubernetes/EKS, Docker


•              Terraform, Ansible, or CloudFormation


•              Splunk, Apache Flink


•              Programming/Scripting using Java or Python


•              CI/CD


•              Databases – Vertica, Snowflake


Behavioral Skills:


•              Excellent communication and collaboration skills


•              Ability to propose and implement improvements in the system


•              Ability to work with cross-functional stakeholders


•              Adaptability and a willingness to learn new technologies and techniques.


•              Proactive approach to issues, ability to provide prompt resolution/work




Jennifer Sampson


Technical Recruiter
.......................................................
DONATO TECHNOLOGIES, INC
12100 Ford Rd, #306, Dallas, TX 75234
Direct : (469)-342-0401
Email: jenny@donatotech.com
Web:  www.donatotech.net
