Job title: Site Reliability Engineer II
Job type: Permanent
Emp type: Full-time
Industry: Information Technology (IT)
Salary: Negotiable
Location: Tokyo
Job published: 2024-05-02
Job ID: 44316

Job Description

Site Reliability Engineer (SRE)

 

■ Your Role and Responsibilities 

• Use your on-call shift to prevent incidents from happening.

• Document actions taken, so your findings turn into repeatable actions, and then into automation.

• Design, build and maintain core infrastructure components that support hundreds of thousands of concurrent users.

• Debug production issues across services and levels of the stack.

• Mentor interns and intermediate SREs in all areas, and other SREs in your area of deep knowledge.

• Contribute improvements to the codebase to resolve issues.

• Identify significant projects that result in substantial cost savings or revenue.

• Identify product architecture changes from a reliability, performance and availability perspective, using a data-driven approach.

• Proactively plan for efficiency and capacity to set clear requirements and reduce system resource usage, making company assets cheaper to run for all our customers.

• Identify parts of the system that do not scale, apply immediate palliative measures and drive long-term resolution of these incidents.

• Identify Service Level Indicators (SLIs) that will align the team to meet the availability and latency objectives.

• Know a domain really well and radiate that knowledge through recorded demos, discussions in DNA meetings, or Incident Reviews.

• Run blameless root cause analyses (RCAs) on incidents and outages, aggressively looking for answers that can prevent the incident from ever happening again.

• Set an example for the team of SREs through positive, inclusive leadership and discussion of work.

• Be able to de-escalate conflicts inside the team.

 

■ Work Location

• Tokyo, Japan

 

■ Experience and Qualifications 

• 2+ years of work experience in the IT sector

• Experience as a Cloud, DevOps or Reliability engineer

• Experience working closely with engineering teams to create and improve containerized technologies

• Able to collaborate in a global team environment, actively engage subject matter experts, and follow through on commitments

• Strong problem-solving (debugging) skills: the ability to dissect, divide and conquer platform problems and find the root cause

• Knowledge of Microsoft Azure, AWS and/or GCP is a must (Azure preferred)

• Scripting knowledge in PowerShell and/or Python

• Version control experience (Git)

• Knowledge of container orchestration technologies (Kubernetes)

• Knowledge of container technologies (Docker)

• CI/CD knowledge

 

■ Additional Preferred Qualifications

• Knowledge of Azure Security Center, policy and initiatives

• Knowledge of Azure Sentinel

• Experience operating in a Linux environment using the command line

• Experience using Azure DevOps or similar project management & pipeline tools

• Experience managing logging and monitoring systems; knowledge of Azure Monitor

 
