Sr Engineer - Platform Monitoring
Cvent | Information Technology | Gurgaon, Haryana
Cvent, Inc. (www.cvent.com) is the world’s leading provider of cloud-based software for meetings and event management. Our platform of products includes software to manage and facilitate online event registration, meeting site selection, event management, e-mail marketing and web surveys. We also develop mobile apps for both corporate and consumer events. Founded in 1999, we currently have 3300+ talented and dedicated employees and are headquartered just outside of Washington, D.C., in McLean, Virginia, with additional U.S. offices in Portland, Oregon; Austin, Texas; and Los Angeles, California. Internationally, we have offices in Gurgaon, India, and London, England. Cvent has received a number of awards and honors recognizing our strong company culture, innovative products, stellar customer service and support, visionary leadership and investment in our employees. We currently have job openings across all departments and locations and are looking to add valuable team members to further strengthen the company’s DNA.
Our Team and How You Fit:
The NOC team is responsible for identifying and remediating or escalating availability-impacting events related to Cvent’s SaaS product and supporting infrastructure. We are the first line of defense against any service disruption. The Monitoring Platform Engineer supports this team by designing and implementing technical solutions for both specific monitoring use cases and broad platform architecture and administration. As we continuously move our monitoring posture away from symptom-based, reactive alerting and towards better visibility into business metrics, the Monitoring Platform Engineer will have a direct impact on Cvent’s reputation and bottom line.
Principal Duties and Responsibilities:
- Continuously improve monitoring of our SaaS application, supporting technologies, and infrastructure
- Support service ownership of shared SaaS monitoring platforms through activities such as cost management, server maintenance, administration, and vendor relationship management
- Analyze false-positive trends and improve the signal-to-noise ratio of alerts
- Document monitoring best practices and train cross-departmental engineering teams
- Support the shift to a monitoring-as-code process by moving monitoring configuration into version control
- Build product visibility dashboards to improve awareness of important top-level business metrics
Skills and Attributes:
This challenging role requires a varied skill set. The ideal candidate will possess many of the skills below and be confident in quickly developing the rest.
- Expertise with at least one SaaS monitoring platform (New Relic, Datadog, LogicMonitor)
- Solid understanding of application and enterprise infrastructure architectures and systems, including application servers, datastores, networking and storage layers
- Comfortable with monitoring platform design, implementation and administration
- Strong competency in Windows and Linux operating systems
- Experience with AWS technologies and cloud monitoring solutions
- Experience with microservice architectures
- Familiar with agile development and the software development lifecycle
- Solid scripting/development skills – Shell, Ruby, Python preferred
- Configuration management (Puppet, Chef, CFEngine, Ansible) - Chef preferred
- Ability to work independently as well as in a team environment
- Ability to communicate clearly, concisely and professionally
- Strong structural documentation skills
- Project execution and delivery skills
- On-call 24/7 support of owned services
- 4-10 years of relevant experience in NOC, operations, SRE, or DevOps roles supporting applications and infrastructure systems
- Monday to Friday schedule with fixed weekly days off
- No night shifts