Site Reliability Engineer - Observability

Hybrid
Mid-level
🇮🇳 India
Site Reliability Engineer
Technology

We’re looking for problem solvers, innovators, and dreamers who are searching for anything but business as usual. Like us, you’re a high performer who’s an expert at your craft, constantly challenging the status quo. You value inclusivity and want to join a culture that empowers you to show up as your authentic self. You know that success hinges on commitment, that our differences make us stronger, and that the finish line is always sweeter when the whole team crosses together.

Site Reliability Engineer - Observability
Site Reliability Engineering is a new discipline at Alteryx where the team deploys, maintains, and operates Alteryx’s Cloud SaaS products. The team works with the Product Engineering, Infrastructure Engineering, SRE, and Customer Service teams to ensure SaaS services are available and performant. This team will originate customer software/service fixes, contribute to various code bases, and ensure product availability, scalability, and resiliency. In addition, team members will automate responses to alerts and program alert remediation.

What you’ll do:

  • Deploy, provision, and maintain Alteryx’s Cloud SaaS products.
  • Promote our customer-centered approach and be a part of our customers’ solutions.
  • Work with product teams on non-functional requirements for SaaS products, including resiliency, security, and availability.
  • Improve and configure product observability/alerts and automate remediation through programming.
  • Serve as an escalation point for customer SaaS instances for availability, performance, and application behavior issues.
  • Demonstrate critical thinking and a growth mindset, learning new technologies quickly and applying that knowledge to address customer problems.
  • Author code fixes for software/service issues and work with Product Engineering teams to merge the code.
  • Participate in an on-call rotation for off-hours support on a periodic basis if required.

About you:

  • 2-4 years of experience with modern observability stacks like Datadog, Prometheus, Grafana, Kibana, Thanos, Loki, or the TICK stack.
  • 2-4 years of experience designing, programming, and/or operating distributed systems software.
  • 2-4 years of experience programming in Python, Go, Node.js, Java, .NET, or another modern programming language.
  • 1-2 years of experience with Kubernetes, OpenShift, k3s or another container orchestration technology.
  • Troubleshooting and problem-solving experience with containers or distributed systems.
  • Experience with CI/CD technologies like ArgoCD, Jenkins, or another CD tool.
  • Experience with AWS, GCP, or Azure is a plus.
  • Experience debugging software issues and performing RCAs.
  • Proficiency in Helm, Docker, and Jenkins.
  • Ability to break down and discuss technical issues and solutions with non-technical team members.

Find yourself checking a lot of these boxes but doubting whether you should apply? At Alteryx, we support a growth mindset for our associates through all stages of their careers. If you meet some of the requirements and you share our values, we encourage you to apply. As part of our ongoing commitment to a diverse, equitable, and inclusive workplace, we’re invested in building teams with a wide variety of backgrounds, identities, and experiences.

 

Trifacta Software India LLP

A company that values inclusivity and high performance, and empowers associates to show up as their authentic selves.
