Site Reliability Engineer

Mid-level
🇻🇳 Vietnam
Technology

Come join the team that literally serves Grab: Grab’s Sentry team. The systems we oversee process billions of network messages for Grab's consumers every day, and our service orchestration and discovery platforms serve hundreds of Grab services without fail. We work closely with the infrastructure, security, and product teams, and are seeking talented engineers to join our platform team!

Your Role

You specialize in building, operating, and maintaining leading-edge solutions that Just Work™. Built on world-class technology stacks, the software you develop brings our unique on-demand services experience to Southeast Asia every day, be it transport, payments, food, or "the awesome things to come". Millions of people depend on the stability and efficiency your solutions provide, which is demanding in terms of design and quality but also incredibly rewarding.

You will

  • Build and own Grab's gateway systems, connecting millions of consumer devices with hundreds of backend services via Grab's service mesh.
  • Work at Grab's Viet Nam headquarters, closely integrated with software engineering and product teams to help build rock-solid and secure solutions with the right interface, technology and practices.
  • Be at home in a multi-cloud environment and build scalable, zero-downtime network traffic routing and analytics solutions for Grab's "public edge", serving billions of requests from millions of Grab customers and partners every day.

Your Daily Routine

  • Independently drive projects across teams end to end, from inception to rollout
  • Find and troubleshoot issues in Grab's entire infrastructure and code base
  • Develop, maintain, and operate control- and data-plane components, and resolve production incidents
  • Implement quality solutions using Go and Lua, and maintain a high bar for code reviews and deployment processes
  • Mentor peers and promote best practices for development and operational excellence while delivering an excellent user experience

Requirements

Your Experience

You are a habitual problem solver, and naturally assume ownership of your team’s systems and software components. You know how to be responsible for mission-critical systems, and you:

  • Have a very good understanding of TCP/IP, HTTP, network routing, and the internet
  • Have experience with, or certification in, AWS
  • Have solid experience with automation and provisioning tools (e.g., Jenkins/GitLab CI, Ansible/Chef/Puppet)
  • Have strong experience troubleshooting systems in a Linux environment
  • Have strong experience with service monitoring, logging, and alerting environments and tools
  • Know how to build highly available distributed systems
  • Are fluent in English
  • Are fluent in Bash, Python, Terraform, or Go, and understand common patterns and algorithms well enough to confidently navigate third-party code bases for debugging and troubleshooting

Your Advantage

  • HTTP/2, QUIC, and gRPC expertise
  • Dealing with massive concurrency and designing resilient algorithms
  • Experience with building monitoring and alerting systems
  • Hands-on experience with Terraform and large-scale Docker/Kubernetes deployments
  • Experience with Consul and/or Envoy (envoyproxy.io), their code base, and their community
  • Experience in a startup

 

Grab

Southeast Asia's leading super-app providing everyday services such as deliveries, mobility, financial services, enterprise services and others to millions of users across the region.

E-commerce
Logistics
Technology
