Staff Site Reliability Engineer

Hybrid
Senior
🇮🇳 India
Site Reliability Engineer
Technology

What you get to do in this role:

  • For Early Engagement – Thoroughly assess new services and products, ensuring a seamless transition to cloud production while upholding the appropriate reliability standards to meet our Service Level Objectives (SLOs) and Service Level Agreements (SLAs).
  • For Early Engagement – Contribute to building systems that provide real-time insight into their own operation while they are running.
  • For RCA/PRB – Work with engineering stakeholders on Root Cause Analysis and critical problems.
  • For Cloud Improvements – Use knowledge and experience in software development, application support, systems engineering, and networking to drive Cloud improvements.

Requirements

To be successful in this role you have:

  • Knowledge of Linux systems.
  • Comfort with designing, authoring, testing, and debugging code in a team setting in at least one of the following languages: Python, Go, JavaScript, or Ruby.
  • Experience working with relational databases such as MySQL, MariaDB, or PostgreSQL.
  • Proficiency in managing large-scale systems, with a strong focus on automating processes, enhancing observability, ensuring high availability, and optimising performance for critical services.
  • Expertise in Observability and Monitoring of applications, services, and networks.
  • Experience with DevOps automation, agile methodologies, and CI/CD pipelines such as GitLab CI/CD.
  • Experience working with Cloud technologies such as Azure and AWS.
  • Experience in configuration management of infrastructure using Ansible.
  • Experience with Kubernetes to orchestrate the deployment, scaling, and management of containers.
  • Knowledge of core AI/ML techniques and algorithms.
  • Familiarity with implementing chaos engineering principles.
  • Experience with incident response processes, post-mortem practices, or service best-practice standards.
  • Ability to review PCRs and suggest (or develop) additional preventive measures.
  • Self-motivation to work out how things are architected and how they work, with little or no prior knowledge, through code and configuration deep dives.
  • Ability to build/develop ad-hoc tools/scripts to work around issues while waiting for upstream fixes.
  • ServiceNow platform knowledge is an added advantage.

 

ServiceNow

At ServiceNow, our technology makes the world work for everyone, and our people make it possible.

Artificial Intelligence
Software

🏭software development
