TEAM
Arene's mission is to empower Toyota and its partners to deliver next-generation vehicles with hardware-agnostic software that can be updated to support new features at any time.
The Arene Site Reliability Engineering (SRE) Team is a sub-team of Arene Engineering Platform. The team's mission is to define, develop, and evangelize sound operational processes and tooling that ensure system stability, observability, and compliance with organizational standards, enabling service owners to operate their services. We define the change management process for Arene products, assist teams with aligning to operational standards, and build tooling to empower developers to deploy and manage services at scale. We practice a culture of continuous improvement and blameless postmortems. You will report to the Head of the Arene Engineering Platform. This role is based onsite in Japan, with at least 3 days per week in the office.
WHO ARE WE LOOKING FOR
You are a seasoned engineering leader with enough hands-on site reliability engineering knowledge to guide a global SRE team that delivers operational excellence and automation.
RESPONSIBILITIES:
- Monitor operational health of services across Arene and triage incidents
- Work with our automotive and cloud software teams to improve the operational experience and system reliability
- Assist teams in defining Service Level Indicators (SLIs), Service Level Objectives (SLOs), and on-call procedures
- Design, implement, and maintain tools related to Observability, Incident Management, and Deployments
- Promote operational best practices to teams across Arene
MINIMUM QUALIFICATIONS:
- 5+ years of people management experience leading high-performing, global teams.
- Experience shipping and operating large-scale commercial-grade products.
- 5+ years of software development experience in one or more languages (Python, Golang, Java, or similar).
- Strong Unix, Cloud, IaC, and build tooling skills, sufficient to dig deep in support of SREs and engineering teams.
- Strong competency with observability tools and best practices for large, distributed systems (e.g., GCP Logging, Prometheus, Wavefront, Datadog, PagerDuty).
- Security knowledge and experience implementing security best practices.
- Ability to travel domestically and internationally to customer and team sites.
NICE TO HAVES:
- Experience communicating and training in SRE concepts, sufficient to educate and grow engineers.
- Experience managing incidents within a global follow-the-sun organization.
- Experience troubleshooting and debugging complex globally distributed systems with a blend of on-premise and cloud components.
- Experience writing custom SRE tooling.
- Knowledge or experience with safety (ISO 26262, ISO 21448, IEC 61508) and security (ISO 21434) standards.
If you are located outside of Japan, we will set up an interview over Google Hangout Meet.
Woven by Toyota
Woven by Toyota is the mobility technology subsidiary of Toyota Motor Corporation. Our mission is to deliver safe, intelligent, human-centered mobility for all.