Site Reliability Engineer

Mid-level
Ho Chi Minh City, 🇻🇳 Vietnam
Technology

Overview:

Momos is a rapidly growing company headquartered in Singapore and the United States. As part of our company's growth strategy, we are actively expanding our operations in the APAC region. Our main mission is to help our brands create happier customers at every location with AI.

Momos is the Customer Experience Management Platform for multi-location brands. We work with groups such as Shake Shack and Baskin Robbins to power the entire lifecycle and automate everything with AI. Today, we are proud to be trusted by over 5,000 businesses globally. If you love to hustle and want to work for a mission-driven company, we would be thrilled to have you join our team.

About the role:

We are seeking an experienced Site Reliability Engineer (SRE) with over 5 years of relevant experience to maintain the health, monitoring, automation, and scalability of the systems behind our dynamic startup. The ideal candidate will have expertise in AWS, ECS, Redis, Terraform, Git, CodePipeline, Lambda, and Batch Jobs. Proficiency in Infrastructure as Code using Terraform, security tools like AWS Security Hub and Inspector, and CI/CD pipelines with AWS CodePipeline and GitHub workflows is essential. Strong hands-on experience with Docker, monitoring using CloudWatch and Datadog, and programming skills in Python and Bash are required. The candidate should excel in performance testing and troubleshooting and have a solid understanding of networking concepts and protocols. The ideal candidate will be passionate about an operations role that involves deep knowledge of both the application and the product, and will believe that automation is a key component of operating large-scale systems.

Responsibilities:

  • Own the complete infrastructure, comprising containers, Lambdas, and various serverless and stateful components hosted on the public cloud
  • Own all operational aspects of the various data management systems, including automation, monitoring, alerting, reliability and performance
  • Proactively identify sources of instability in the system and analyse how complex systems fail from a reliability and resilience perspective
  • Improve availability, reliability, and observability of our services and reduce the burden of human toil with tooling and automation
  • Build automation tools wherever required to avoid repetitive manual efforts
  • Provide architecture governance for setting up the CI/CD pipeline
  • Maintain and help evolve our cloud-based infrastructure
  • Deploy, manage, and administer containerised applications and serverless frameworks in a public cloud environment (AWS/GCP)
  • Manage incidents and drive operational excellence to prevent major outages
  • Operationalize learnings by collaborating with multiple engineering teams
  • Engage with product teams to diagnose and correct operational surprises
  • Have a direct impact on Momos' business by identifying innovative solutions to operational challenges

Requirements

  • 5+ years of relevant experience in SRE.
  • Cloud and related stack: AWS, ECS, Redis, Terraform, Git, CodePipeline, config management, Lambda, Batch jobs, Step Functions, incident management, API Gateway, networking, S3.
  • Infrastructure as Code: Terraform.
  • CI/CD: AWS CodePipeline, GitHub workflows, Serverless Framework for Lambda.
  • Good hands-on experience with Docker and related concepts.
  • Monitoring: CloudWatch Logs, CloudWatch Metrics, APM, Datadog, and dashboard creation in both CloudWatch and Datadog.
  • Version Control: GitHub.
  • Experience in conducting performance testing to evaluate the speed, scalability, and stability of an application or system under different conditions. Familiarity with open-source performance testing tools such as JMeter.
  • Troubleshooting: Experience in investigating performance-related incidents, triaging performance issues, and performing root cause analysis to identify underlying causes.

Nice to have:

  • Programming and scripting skills: Python and/or Bash.
  • Security: AWS Security Hub and Inspector.
  • Familiarity with distributed and scalable logging, metrics, and alerting solutions.
  • Ability to ensure smooth software deployment by writing script updates and running diagnostics.

Location: Hybrid in Ho Chi Minh City, Vietnam

Benefits

  • Competitive salary and bonus scheme
  • Private medical insurance
  • Paid time off and a flexible working culture
  • Opportunities for rapid career advancement
  • A dynamic and inclusive company culture
  • Access to the latest technology and tools for personal development
  • Comprehensive onboarding program for new employees
  • Employee recognition programs for outstanding performance
  • Participation in industry conferences and events
  • A supportive environment that encourages innovation and creativity

Cultural Values

  • Mission-driven and fast-paced, entrepreneurial environment.
  • A collaborative and flat company culture.
  • Comprehensive private health insurance.
  • Discretionary trips to our offices across the globe, with global travel medical insurance (when it’s safe to travel!).
  • Cross-cultural team bonding/networking.
  • Love Food? Join our Team!

Equal Opportunity

Momos is an equal-opportunity workplace where we embrace diversity and different cultures. We started as an international company and know that building an organization with different experiences, thoughts, and opinions allows our team to grow and excel.

 
