Lead Site Reliability Engineer (AWS)
Cvent | Technology | Atlanta, GA
We are the world's leading provider of cloud-based software for meetings and event management. Companies use our SaaS platform to handle and facilitate online event registration, venue selection, budgeting and event management, website design, email marketing, day-of-event activities, social media integration, and much more. We build beautiful software that helps event planners take the event experience to their attendees via our responsive mobile web, HTML5, and native mobile apps, backed by a robust microservices architecture.
We are responsible for ensuring that our platform is stable and balanced. We break down barriers by cultivating developer ownership and empowering developers. We support them by building creative and robust solutions to operations problems. We use our background as generalists to work closely with product development teams from the early stages of design all the way through identifying and resolving production issues. We see the big picture. We help create and implement standards while facilitating an agile and learning culture. We use SRE principles such as blameless postmortems and operational load caps to ensure we’re constantly improving our knowledge and maintaining a good quality of life. Overall, we’re passionate about automation, learning, and participating in dynamic day-to-day work.
Site Reliability is about combining development and operations knowledge and skills to help make the organization better. Whether you have a development background and are interested in learning more about operations, or are a DevOps/Systems Engineer who is interested in developing internal tools, Cvent SRE can benefit from your skill set. Ultimately, we are looking for passionate people who love learning and technology.
We use a wide variety of technologies and avoid getting locked into a single path. If we find something that works better than what we have, we are always open to trying it out. Here is a taste of the technologies you’ll get to work with:
- AWS (EC2 / ECS / Lambda / RDS / S3 / Route53 / DynamoDB)
- Java, .NET, Ruby
- Linux, Windows
- PostgreSQL, SQL Server
- Kafka / Couchbase / Couchbase Mobile
- Chef, Puppet
- Terraform, CloudFormation
- Native iOS and Android
What You Will Be Doing
As a Site Reliability Engineer, you'll use your advanced development and operations knowledge to identify and prioritize issues, find universal solutions to common problems, and mentor and support junior staff. Additionally, you will:
- Enlighten, enable, and empower a fast-growing set of multi-disciplinary teams across multiple applications and locations.
- Pursue complex development, automation and business process problems.
- Champion Cvent standards and best practices.
- Ensure the scalability, performance, and resilience of our suite of products.
- Work with the development and product team of a new application to establish the right monitoring and alerting strategy.
- Work with a new acquisition's DevOps team to cross-pollinate standard methodologies, educate and close gaps in Cvent standards.
- Develop build, test, and deployment automation that seamlessly targets multiple on-premises environments and AWS regions.
- Help a dev team working on a legacy code base to realize zero-downtime deployments.
- Give back by working on and contributing to Open Source projects: https://github.com/cvent
- Automate all the things!
What You Need for this Position
We believe that passion and a willingness to learn outweigh any list of skills. However, experience in some of the areas below would help you get going quickly and show that you can be successful as an SRE at Cvent.
- Object-oriented software development in Java, Scala, etc.
- CI server administration and support (Jenkins).
- Configuration automation using Chef or Puppet.
- Building tools and scripting frameworks from scratch.
- Solid Windows and Linux administration skills.
- Working with APM, monitoring, and logging tools (New Relic, DataDog, Splunk).
- Project management tools such as Jira or Trello.
- NoSQL (e.g., Couchbase, Cassandra).
- SQL databases (MSSQL, PostgreSQL, etc.).
- Message Queues (RabbitMQ).
- Scripting languages like Ruby, Groovy, Bash, PowerShell or Python.
- Bachelor's or Master's degree in a technical field required.