Monitoring & Observability Lead

 
Mid-level
🇮🇳 India
Technology

We are seeking a skilled and proactive DevOps Observability and Automation Lead to join our dynamic team. In this role, you will be responsible for enhancing our DevOps automation framework and improving the observability and reliability of our applications. You will collaborate closely with engineering, operations, and development teams to ensure our systems are highly available, scalable, and efficient.

Key Responsibilities:

  • Design and implement observability strategies, including monitoring, logging, and alerting solutions, to ensure high availability and performance of systems.
  • Lead efforts to automate infrastructure deployment, configuration management, and continuous integration/delivery pipelines.
  • Develop and maintain tools for deployment, monitoring, and operations, ensuring operational best practices are followed.
  • Extensive experience in shell scripting and programming, and with systems automation tools (Ansible preferred; also Salt, Puppet, Chef, Kickstart, Terraform).
  • Strong documentation skills and the ability to work at a fast pace amid rapid change.
  • Excellent analytical, problem-solving, and troubleshooting skills to manage complex process and technology issues.
  • Expertise in custom workflow design, automation, and product license management.
  • Collaborate with software development teams to integrate observability and automation into the software development lifecycle.
  • Analyze system performance and reliability metrics to identify and address bottlenecks and optimize performance.
  • Implement security best practices in observability and automation solutions.
  • Mentor and coach team members on best practices for observability tools, automation frameworks, and DevOps methodologies.

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
  • Proven experience as a DevOps Engineer, Site Reliability Engineer, or similar role with a focus on observability and automation.
  • Hands-on experience with observability tools such as Prometheus, Grafana, Icinga, and the ELK stack.
  • Proficiency in scripting and automation using Python, Shell scripting, or similar languages.
  • Strong understanding of containerization technologies (e.g., Docker, Kubernetes) and cloud platforms (e.g., AWS, Azure, GCP).
  • Experience with infrastructure-as-code tools such as Terraform, Ansible, or Chef.
  • Excellent troubleshooting and problem-solving skills.
  • Ability to work effectively in a fast-paced, dynamic environment.
  • Experience with CI/CD pipelines and related tools (e.g., Jenkins, GitLab CI).
  • Knowledge of agile methodologies and the software development lifecycle.

Additional information

All your information will be kept confidential according to EEO guidelines.

 

Western Digital

Western Digital is a company that provides data-centric solutions, including storage devices and platforms for business and consumers.

Data Analytics
Hardware
Technology
