Site Reliability Engineer

Remote
Mid-level
🇮🇳 India
Technology

Solvd Inc. is a premier software engineering company. We have 8 offices across the globe and over 800 international employees on staff. With over 12 years of experience, highly skilled teams around the world and deep industry knowledge, we help clients create software that improves their operations and opens new markets. We have built an impressive roster of digital-native enterprise clients including some of the biggest brands in retail and social media.

We are looking for a Site Reliability Engineer to join our growing team.

Responsibilities:

  • Collaborate with product, engineering, and operations teams to improve the reliability, scalability, and performance of our infrastructure and services.
  • Oversee the end-to-end management of production systems, ensuring high availability and rapid recovery from failures.
  • Develop and maintain Site Reliability Engineering (SRE) best practices through automation, monitoring, and alerting to minimize system downtime and manual intervention.
  • Create and manage infrastructure-as-code (IaC) layers, scripts, deployment frameworks, and tools that ensure efficient, scalable, and self-healing environments.
  • Collaborate closely with the software engineering team to design and implement robust monitoring and alerting systems, ensuring application reliability both before and after release to production.
  • Perform incident response and root cause analysis for critical issues, working to eliminate the causes of outages or poor performance in production systems.
  • Take responsibility for the performance and scalability of AWS environments, ensuring they meet service level objectives (SLOs).
  • Provide expertise and insights during client meetings, helping solution architects address reliability and scalability questions related to product integration in client environments.
  • Maintain and update comprehensive documentation on system architecture, processes, and runbooks.
  • Engage in capacity planning, disaster recovery exercises, and postmortem reviews to ensure continual improvement in system resilience.

Requirements:

  • Bachelor's degree in Computer Science, Engineering, or a related field; or equivalent practical experience.
  • At least 5 years of professional experience in a Site Reliability Engineering (SRE), DevOps, or similar role.
  • Strong expertise in Amazon Web Services (AWS), with AWS certifications being a plus.
  • Proficiency in infrastructure-as-code (IaC) tools like CloudFormation or Terraform.
  • Skilled in at least one programming language such as Python, Java, or Go, with experience in scripting for automation and systems management.
  • Expertise in automating cloud-native technologies, deploying and scaling applications, and provisioning infrastructure across large environments.
  • Proven experience in building CI/CD pipelines and automating deployment processes, with tools like Jenkins, GitLab, or AWS CodePipeline.
  • Hands-on experience with containerization technologies, such as Docker and Kubernetes, for managing microservices-based architectures.
  • Deep understanding of Linux systems, networking, and security best practices.
  • Demonstrated ability to work with monitoring tools (e.g., Prometheus, Grafana) and to troubleshoot live systems to improve reliability.
  • Excellent communication skills, with the ability to collaborate with cross-functional teams and explain complex concepts to clients and non-technical stakeholders.

 

