Site Reliability Engineer Manager, Cloud Operations
Ping Identity | Infrastructure Operations (2103) | Denver, CO
At Ping Identity, we're changing the way people think about enterprise security technology. With our innovative Identity Defined Security platform, we're helping to build a borderless world where people have total freedom to work wherever and however they want. Without friction. Without fear.
We're headquartered in Denver, Colorado, and we have offices and employees around the globe. And we serve the largest, most demanding enterprises worldwide, including over half of the Fortune 100. Because even in the most complex enterprise environments, security shouldn't be a source of anxiety. It should be one of your greatest competitive advantages.
We call this digital freedom. And it's not just something we provide our customers. It's something that drives our company. People don't come here to join a culture that's built on digital freedom. They come to cultivate it.
As a Ping Identity SRE Manager, you will lead a team of SREs involved in every facet of our On-Demand SaaS services. SREs are expected to provide input into the product's design, development, deployment, and operations. Everything needs to just work, all the time... and you’re the one who’s responsible.
This is a largely independent management position requiring initiative, individual responsibility, and a hands-on approach. You will be building a world-class 7x24 systems and operations center.
- Participate as a technical expert providing solutions to operations problems.
- Contribute to new product implementation and provide upkeep and maintenance to existing products and services.
- Implement techniques for keeping systems available by designing simple, repeatable, and reliable solutions.
- Linux systems administration, configuration, troubleshooting and automation.
- Running and maintaining our production infrastructure hosted on AWS.
- Ownership of our 7x24 systems and security operations centers.
- Analysis of complex system behavior, performance and application issues.
- Development of monitoring solutions and analysis across multiple datacenters.
- Capacity analysis and planning, traffic routing, and security policies for Ping’s market-leading Single Sign-On SaaS applications.
- Ownership of the end-to-end service line at all levels.
- This is an on-call position with a rotation schedule (for escalations and incidents).
- 5+ years in Operations and/or DevOps
- Proven track record of sustaining and growing high-performance teams.
- 5+ years of experience with Linux/UNIX systems administration.
- 2+ years of experience with Amazon Web Services (AWS).
- Solid experience with configuration management tools.
- Experience with monitoring tools (New Relic, Datadog, Zabbix).
- Experience with Apache, Tomcat, Cassandra, Kafka, and MySQL.
- IP networking, including familiarity with the functionality, operation, and failure modes of networks.
- Proven technical troubleshooting and performance tuning experience.
- Experience in a high-volume or critical production service environment.