Jersey City, NJ
30+ days ago
• Demonstrates expertise in site reliability principles and an understanding of the fine balance between features, efficiency, and stability
• Effectively negotiates with peers and executive partners to ensure optimal outcomes for all
• Drives the adoption of site reliability practices throughout the organization
• Ensures your teams follow site reliability best practices and can demonstrate this empirically through stability and reliability metrics
• Drives a culture of continual improvement and solicits real-time feedback to improve the customer's experience
• Ensures your team collaborates with other teams within your group's specialization and avoids duplication of work where possible
• Follows blameless, data-driven post-mortem strategies and conducts regular team debriefs to enable learning from both successes and mistakes
• Provides personalized coaching for entry-level to mid-level team members
• Ensures your team documents and shares their knowledge and innovations via internal forums, communities of practice, guilds, and conferences

• Formal training or certification on software engineering concepts and 5+ years applied experience
• Advanced proficiency in site reliability culture and principles, with the ability to demonstrate how to implement site reliability across application and platform teams while avoiding common pitfalls
• Experience leading technologists to manage and solve complex technological issues at a firmwide level
• Ability to influence the team's culture by championing innovation and change for success
• Experience hiring, developing, and recognizing talent
• Proficiency in at least one programming language (e.g., Python, Java Spring Boot, .NET, etc.)
• Demonstrated proficiency in software applications and technical processes within a technical discipline (e.g., cloud, artificial intelligence, machine learning, mobile, etc.)

• Proficiency in continuous integration and continuous delivery tools (e.g., Jenkins, GitLab, Terraform, etc.)
• Experience with containers and container orchestration (e.g., ECS, Kubernetes, Docker, etc.)
• Experience with troubleshooting common networking technologies and issues