Senior Site Reliability Engineer

McGraw Hill


Overview

Build the Future

Are you looking for a career that makes a positive difference in your life and in the lives of learners and educators across the globe? McGraw Hill believes in inspiring educators and unlocking the potential in every student.

As a Senior Site Reliability Engineer, you will build and support highly available, fast, and cost-efficient systems in support of our mission to reimagine learning for students and teachers worldwide. The systems you help support will be used across K-12, Higher Ed and Professional segments. Millions of students rely on these apps and services each day to achieve their educational goals.

What will you do? 

  • Participate in initiatives involving system design and provisioning, reliability, observability and monitoring, self-service tool development, cost optimization, incident response, chaos engineering, and build and release
  • Hands-on design, analysis, development and troubleshooting of large-scale distributed systems
  • Build, develop, and manage DevSecOps tools and automation to eliminate repetitive tasks, minimize downtime, achieve human-free operations, and provide self-service solutions to product development teams
  • Work to improve the observability and monitoring of the systems we support
  • Proactively monitor capacity, performance, and cost metrics to ensure quality and identify opportunities for cost savings and/or improvement
  • Share an on-call rotation with your team where you will respond to incidents, lead triage efforts, and conduct blameless postmortems
  • Partner with engineering, security, and product teams to keep our services reliable, available, fast, and cost-efficient
  • Be a champion of the customer’s voice and ensure our solutions are built with customer empathy at the forefront
  • Promote SRE best practices within your team to ensure quality, stability, performance, resiliency, and maintainability of your solutions
  • Explore new technologies and solutions to push our capabilities forward

What can you bring to the role?

  • 5+ years’ combined experience as a Software Engineer, Site Reliability Engineer or DevOps Engineer
  • Proven technical abilities in the areas of reliability, monitoring, self-service tool development, incident response, and build and release
  • Experience in one of these languages: Python, Go or Java. Prior software development experience preferred
  • Strong experience with Linux environments
  • Demonstrated expertise designing, building, and triaging highly scaled production infrastructure in AWS
  • Experience with infrastructure automation technologies like Terraform
  • Experience in container/container-fleet-orchestration technologies like AWS ECS or EKS
  • Experience participating in a team’s 24×7 incident response efforts
  • Experience building CI/CD pipelines that are fast and informative, drive quality, and achieve zero-downtime releases
  • Ability to work across functional and domain boundaries to improve system reliability and deliver solutions on time and with quality

Common technologies in our ecosystem include:

  • Java, Go, Node, PHP
  • Linux, Windows
  • Apache Web, Nginx, IIS, Apache Tomcat, Jetty
  • Docker, AWS ECS and AWS EKS
  • ELB, CloudFront, S3, EC2, RDS, IAM, SQS, SES, SNS, Lambda, API Gateway, Kinesis, ElastiCache, Elasticsearch, SSM, Control Tower, and much more
  • MySQL, Oracle, PostgreSQL, SQL Server
  • Artifactory, GitHub Enterprise, GitHub Actions, CircleCI, Jenkins, SonarQube, JFrog Xray
  • Terraform (preferred), CloudFormation
  • Packer, Puppet, Ansible
  • New Relic, CloudWatch, Datadog, PagerDuty

Bonus Points / Preferred:

  • Approach the job with an automation and software engineering mindset
  • Have a passion for uptime, observability, and full stack monitoring
  • Enjoy designing, deploying and managing automation tools
  • Be comfortable collaborating with Networking, Security and other DevOps teams
  • Be able to create and manage abstract IaC for effective reuse across the company

Why work for us?

At McGraw Hill, you will be empowered to make a real impact on a global scale. Every day your individual efforts contribute to the lives of millions. As an education innovation company, we’re proud to play a part by inspiring learners around the world. If you bring your curiosity, we’ll help you grow in a collaborative environment where everyone shares a passion for success.

Are you ready for a new challenge? Apply for a career at McGraw Hill and together, we’ll impact the world.

45294

To help us track our recruitment effort, please indicate in your cover/motivation letter where (vacanciesincanada.ca) you saw this job posting.