Cloud Engineer (Monitoring, Automation & AIOPS)

Location

Leeds

Contract type

Full-time

Hours

37.5

Job description

About us

 

Our people are at the heart of everything that we do, and we offer a fast-paced environment where we have fun, celebrate success and give you all the tools that you will need to be successful in your role. It's not just our colleagues that we look after - we've got a responsibility to our customers too, so we work hard towards our ambition that no one is harmed by gambling.

 

What you’ll be doing

 

The role of Cloud Engineer in the Monitoring, Automation & AIOPS team is an exciting one within the organisation, and one that doesn't arise very often. This is a great opportunity to be part of a small and dedicated team with exposure to all existing and emerging technologies.

 

You will be one of the focal points for innovating and developing a class-leading monitoring solution, using event data enriched with data sources from across technology so that we can develop machine learning and predictive algorithms as part of our AIOPS strategy. The work will be varied and will require knowledge of enterprise and open-source tooling, along with a deep understanding of APIs and integration methods.

 

  • Provide a centre of excellence around monitoring and tooling
  • Build the new: be involved in creating exciting, new and cutting-edge platforms in the cloud
  • Be creative, provide innovative solutions to problems and take others with you
  • Develop integrations to drive operational and business efficiencies and insights
  • Design, implement and maintain modern, developer-focused logging and monitoring pipelines for a variety of applications deployed to both public and private cloud environments
  • Provide design guidance and operational support to development teams for the implementation of logging and monitoring within distributed cloud applications
  • Automate the deployment and maintenance of monitoring
  • Manage and support various logging and monitoring solutions, acting as the focal point for them
  • Develop best practice documentation to provide standardised logging and monitoring solutions and integrations
  • Engage in and direct any ‘Proof of Concept’ or prototyping of solutions to continually move us forward
  • Apply a solid understanding of Site Reliability and Chaos Engineering principles when collaborating with wider teams
  • Ensure that security, operational and supportability guidelines are factored into the construction of any solution. This will include liaison with the relevant teams to ensure that they understand the implications of any proposed designs.

 

What we would like to see from you

 

  • Strong customer focus working to establish relationships with key stakeholders
  • Excellent knowledge of Amazon Web Services (AWS)
  • Excellent background in Infrastructure as Code, including management and testing of infrastructure deployments
  • Experience in creating reusable cloud architecture and composable Terraform modules
  • Experience with Docker containers and Kubernetes running production workloads
  • Experience with Ansible Playbooks is desirable
  • Experience architecting end-to-end cloud services for high-performance API-based applications
  • Hands-on programming skills with Infrastructure as Code in Terraform, and with languages such as Python
  • Solid monitoring and logging experience with Prometheus, Grafana and other monitoring tools such as New Relic and Splunk
  • Experience of working in an Agile development environment
  • Loves to share and collaborate with other team members
  • A T-shaped thinker (broad and deep). We want to see what you are capable of and what you can bring to our team.
  • Experience in administration and configuration of enterprise and open-source tooling
  • An understanding of managing version control and CI/CD tooling, specifically GitLab and Jenkins
  • Ideally holds a degree in an IT discipline or equivalent professional certification

 

Would you like to work on the evolution of our next-generation, cloud-based real-time logging and analytics platform, which delivers operational and business insights? Our current platform handles over 8 TB of data per day, spanning more than 2 billion events. All events are processed in real time and leverage machine learning across a resilient, evolvable and scalable architecture. The next iteration of our real-time logging platform will draw on leading-edge concepts in machine learning (models as code), automation, introspective test frameworks and AIOPS, all hooked into our custom CI/CD pipelines to provide autonomy and enable agility at scale.

 

We are looking for someone with a solid coding background and experience of Splunk and AWS. If you are passionate about cloud and emerging technologies, monitoring, automation, machine learning and coding, and are interested in creating custom platforms and integrations, then this role is for you.

  

William Hill in Leeds

 

Our city centre office is easy to reach and surrounded by great places to shop, eat and socialise. Leeds itself is a busy, lively place to work and a hub for digital and gaming companies, with five universities right on our doorstep. Away from work, we're close to the Yorkshire Dales, and there's a competitive sports scene too, with Leeds United, plus Yorkshire cricket and rugby just up the road in Headingley.